[whatwg] [html5] tags, elements and generated DOM

Ian Hickson wrote:
> On Wed, 6 Apr 2005, Lachlan Hunt wrote:
>>>The <body> will always be implied, though.
>>
>>Not in a conforming SGML parser...
> 
> Yeah, I meant in browsers, not per SGML.

Ok, fair enough.  But can you explain why Opera doesn't imply it when in 
standards-compliant mode, as I described in my previous e-mail?  Is it a 
bug or intentional?

> According to the HTML spec, the 
> handling of the above is completely undefined since it is invalid. (Note 
> that something being invalid or non-conformant does _not_ make the 
> rendering undefined in most cases in Web Apps 1 / HTML5. That's one of the 
> main things I'm making sure of.)

Ok, if the spec is going to address this, then I think it should say 
something like:

   "If a required element with an optional start-tag is entirely missing
    from the document, a user agent *may* imply it and include it within
    the DOM.  Missing elements with required start-tags *must not* be
    automatically implied.

   "Note: It is common for existing user agents to automatically imply
    both the head and body elements, even when those sections are omitted
    entirely from the document markup."

I used "may" because, if "must" or "should" were used instead, it might 
conflict with whatever the SGML spec says on the matter, and it would 
make OpenSP, and thus the validator, non-conformant.  I would stick with 
"may" because, as I showed previously, existing UAs don't do the same 
for <tbody>.

I included the part about start-tags because elements like <li> (which 
require a start-tag) are not implied by existing UAs when they are 
missing.
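
To make the suggested wording concrete, here is a minimal Python sketch 
of the rule.  It is only an illustration, not spec text and not how any 
UA is implemented.  The table lists start-tag optionality from the 
HTML 4.01 DTD for a few of the elements discussed in this thread; the 
names START_TAG_OPTIONAL, may_imply and split_missing are hypothetical, 
and treating unlisted elements as "must not imply" is my assumption.

```python
# Start-tag optionality per the HTML 4.01 DTD, for a few elements
# mentioned in this thread.  Hypothetical, deliberately incomplete table.
START_TAG_OPTIONAL = {
    "html": True,
    "head": True,
    "body": True,
    "tbody": True,
    "title": False,
    "li": False,
    "tr": False,
}


def may_imply(element):
    """Return True if the suggested wording would permit a UA to imply
    `element` when it is entirely missing from the markup.

    Assumption: elements not in the table are treated as not implicable.
    """
    return START_TAG_OPTIONAL.get(element, False)


def split_missing(missing):
    """Split missing elements into (may be implied, must not be implied)."""
    allowed = [e for e in missing if may_imply(e)]
    forbidden = [e for e in missing if not may_imply(e)]
    return allowed, forbidden
```

For example, split_missing(["head", "li", "tbody"]) puts "head" and 
"tbody" in the first list and "li" in the second.  Note that the rule is 
permissive: a UA that implies neither <head> nor <tbody> would still 
conform under "may".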

Also, while on the topic of handling invalid documents, is this spec 
going to attempt to address the <x><y></x></y> problem?
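
To show what I mean by that problem, here is a small sketch using the 
tokenizer in Python's standard html.parser module.  It only detects 
end-tags that do not match the most recently opened element; it makes no 
attempt at recovery, which is the part a spec would have to define.  The 
class and function names are hypothetical.

```python
from html.parser import HTMLParser


class NestingChecker(HTMLParser):
    """Track open elements on a stack and record mismatched end-tags."""

    def __init__(self):
        super().__init__()
        self.open_elements = []  # stack of currently open tag names
        self.errors = []         # end-tags that did not match the stack top

    def handle_starttag(self, tag, attrs):
        self.open_elements.append(tag)

    def handle_endtag(self, tag):
        if self.open_elements and self.open_elements[-1] == tag:
            self.open_elements.pop()
        else:
            # Misnested or stray end-tag: record it, leave the stack alone.
            self.errors.append(tag)


def misnested_end_tags(markup):
    """Return the end-tags in `markup` that close the wrong element."""
    checker = NestingChecker()
    checker.feed(markup)
    checker.close()
    return checker.errors
```

With this, misnested_end_tags("<x><y></x></y>") reports the </x> tag, 
while the properly nested "<x><y></y></x>" reports nothing.  What the 
DOM should look like after the mismatch is exactly the open question.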

>>However, if the <body> element were to be automatically implied 
>>regardless, then the same would be true of the <tbody> element...
>>Neither Mozilla or Opera implies the missing tbody element within 
>><table></table>, although IE does. However, OpenSP does not imply the 
>>missing elements in either case.
> 
> <tbody> is implied if there is a <tr> there.

Yes, exactly, just like <body> is implied if there is a <p>, <div>, or 
other element/content there; but not if there isn't.

>>Opera and OpenSP correctly don't imply the missing head element.
> 
> I'm not sure what you mean by "correctly" here

Well, I read somewhere that OpenSP is "the reference implementation" of 
SGML, so I assumed that means what it does is correct.  In this case, 
Opera showed the same behaviour so I called it correct as well. 
However, if this behaviour is not defined in SGML at all, then I should 
not have said "correctly" either.

> since an HTML4 document without a <title> is invalid and thus parsing
> is undefined in HTML4.

Is it not defined by SGML either?  I really must get a copy of 
Goldfarb's SGML Handbook later and check for sure.

> If there is a <title> then the <head> must be implied per SGML.

Agreed.

-- 
Lachlan Hunt
http://lachy.id.au/
http://GetFirefox.com/     Rediscover the Web
http://GetThunderbird.com/ Reclaim your Inbox

Received on Tuesday, 5 April 2005 19:49:40 UTC