Re: HTML output method: indenting limitations

Kay, Michael wrote:
> I think I would like to move (as we have done recently with the XML
> indenting rules) to a more prescriptive statement of exactly where an HTML
> indenter may add whitespace, rather than describing it in terms of the
> effect on visual appearance.

That sounds good.
 
> The rules, I think, are:
> 
> (a) whitespace must only be added before or after an element, or adjacent to
> an existing whitespace character
> 
> (b) whitespace must not be added adjacent to an inline element, the inline
> elements being those included in the %inline category in the HTML DTD
> 
> (c) whitespace must not be added inside a formatted element, the formatted
> elements being pre, script, style, and textarea

Those are the rules we're following in 4Suite, and they seem to result in
safely indented output. However, there's one exception in our implementation:
since Netscape 4.x wrongly fails to ignore certain instances of whitespace
appearing between td and tr tags, we have added 'TD' to the set of inline
elements.
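
To make that concrete, here is a rough sketch in Python of how rules (b) and
(c), plus the TD workaround, might be checked. The names are made up for
illustration and this is not 4Suite's actual code. The inline set is my
transcription of %inline from the HTML 4.01 Strict DTD, so verify it against
the DTD before relying on it. The sketch only looks at the element about to be
written; a full implementation of rule (b) would also need to look at the
neighbouring siblings.

```python
# Hypothetical names throughout; an illustration, not 4Suite's implementation.

# Elements in the %inline category, transcribed from the HTML 4.01 Strict DTD
# (%fontstyle, %phrase, %special, %formctrl). Check against the DTD itself.
INLINE = frozenset([
    "tt", "i", "b", "big", "small",
    "em", "strong", "dfn", "code", "samp", "kbd", "var", "cite",
    "abbr", "acronym",
    "a", "img", "object", "br", "script", "map", "q", "sub", "sup",
    "span", "bdo",
    "input", "select", "textarea", "label", "button",
])

# Rule (c): no whitespace may be added inside these elements.
FORMATTED = frozenset(["pre", "script", "style", "textarea"])

# The Netscape 4.x workaround: treat td as if it were inline.
NO_WHITESPACE_AROUND = INLINE | frozenset(["td"])


def may_indent_before(tag, open_ancestors):
    """Return True if whitespace may be added before the start tag of `tag`.

    `open_ancestors` is an iterable of the names of the currently open
    elements. Names are compared case-insensitively.
    """
    if tag.lower() in NO_WHITESPACE_AROUND:      # rule (b) and the td exception
        return False
    for ancestor in open_ancestors:              # rule (c)
        if ancestor.lower() in FORMATTED:
            return False
    return True
```

With this, may_indent_before("tr", ["html", "body", "table"]) allows
indentation, while "TD" inside a row, "span" inside a paragraph, and anything
inside "pre" do not.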

I doubt you really want to mention that exception in the spec, but I was
thinking it would be nice to have an annotated version of the spec(s) for
implementers and users alike, so people could share knowledge like this. Tim
Bray's annotated XML spec [1] was, and still is, a huge help to me -- and
others, I'm sure. I am thinking that a more modern version would work like the
very successful PHP docs [2]: split up the spec into separate sections, each
with its own little moderated message board below it.

> and as you point out, "whitespace" in the above should mean "white space as
> defined in HTML 4.01 section 9.1".

Hooray! Now I can add all those zero-width spaces and form feeds ;)
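
For reference, as I read HTML 4.01 section 9.1, the white space characters are
space, tab, form feed, and zero-width space, with line breaks (CR and LF) also
counted as white space. A small Python sketch of a check for rule (a)'s
"existing whitespace character", again with hypothetical names:

```python
# White space per my reading of HTML 4.01 section 9.1: space, tab, form feed,
# zero-width space, plus the line-break characters CR and LF.
HTML_WHITE_SPACE = frozenset("\u0020\u0009\u000c\u200b\r\n")


def is_html_white_space(text):
    """Return True if `text` is non-empty and contains only HTML white space."""
    return bool(text) and all(ch in HTML_WHITE_SPACE for ch in text)
```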

Mike

   [1] http://www.xml.com/axml/testaxml.htm
   [2] http://www.php.net/manual/en/

-- 
  Mike J. Brown   |  http://skew.org/~mike/resume/
  Denver, CO, USA |  http://skew.org/xml/

Received on Thursday, 13 February 2003 14:05:13 UTC