
[whatwg] [html5] tags, elements and generated DOM

From: Ian Hickson <ian@hixie.ch>
Date: Wed, 6 Apr 2005 03:07:51 +0000 (UTC)
Message-ID: <Pine.LNX.4.61.0504060257560.27724@dhalsim.dreamhost.com>

On Wed, 6 Apr 2005, Lachlan Hunt wrote:
> > > >
> > > > The <body> will always be implied, though.
> > > 
> > > Not in a conforming SGML parser...
> > 
> > Yeah, I meant in browsers, not per SGML.
> 
> Ok, fair enough.  But can you explain why Opera doesn't when in 
> standards-compliant mode, as I explained in my previous e-mail?  Is it 
> a bug or intentional?

Bug.


> Ok, if the spec is going to address this, then I think it should say 
> something like:
> 
>   "If a required element with an optional start-tag is entirely missing
>    from the document, a user agent *may* imply it and include it within
>    the DOM.  Missing elements with required start-tags *must not* be
>    automatically implied.
> 
>   "Note: It is common for existing user agents to automatically imply
>    both the head and body elements, even when those sections are omitted
>    entirely from the document markup."

I'll investigate this in more detail when I write the section on how to 
parse HTML. Backwards-compatibility with the common subset of what is 
actually implemented is my top priority though.
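
A minimal sketch of how the wording proposed in the quoted text could be
read, assuming the start-tag rules of the HTML 4.01 DTD (html, head, body
and tbody have optional start tags; title does not). The table and the
function name implication_policy are hypothetical and for illustration
only; this is not a description of what any browser actually does.

```python
# Assumption: start-tag optionality as declared in the HTML 4.01 DTD.
START_TAG_OPTIONAL = {
    "html": True,
    "head": True,
    "body": True,
    "tbody": True,
    "title": False,
}


def implication_policy(required, present):
    """Classify each required element that is missing from the markup.

    Returns a dict mapping the missing element name to "may imply"
    (start tag optional) or "must not imply" (start tag required),
    following the proposed wording.
    """
    policy = {}
    for name in required:
        if name in present:
            continue  # the element appears in the markup, nothing to imply
        if START_TAG_OPTIONAL.get(name, False):
            policy[name] = "may imply"
        else:
            policy[name] = "must not imply"
    return policy


if __name__ == "__main__":
    # A document containing only a <title>: html, head and body are missing.
    print(implication_policy(["html", "head", "title", "body"], {"title"}))
```

Under this reading, a document with no <title> at all would have the title
classified as "must not imply", which matches the second sentence of the
proposed text.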


> I used "may", because if "must" or "should" were used instead, it may 
> conflict with anything the SGML spec says on the matter and it would 
> make OpenSP, and thus the validator, non-conformant.  I would stick with 
> "may" because, as I showed previously, existing UAs don't do the same 
> for <tbody>.

OpenSP is already non-conformant to HTML5. See:

   http://whatwg.org/specs/web-apps/current-work/#conformance

In any case, assuming I'm still the editor when the parsing section gets 
written, HTML5 will most likely stop the pretense of HTML being an SGML 
application.


> Also, while on the topic of handling invalid documents, is this spec 
> going to attempt to address the <x><y></x></y> problem?

Probably not, as there is no generally accepted solution. In fact there is 
no known solution (to my knowledge) that is entirely satisfactory.
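
To make the problem concrete, here is a rough sketch that only detects the
misnesting; it does not attempt to repair it, which is the part with no
agreed answer. It assumes Python's standard html.parser module, which
reports tags as events without building a tree. The names NestingChecker
and find_misnesting are hypothetical, and void elements such as <br> are
not special-cased.

```python
from html.parser import HTMLParser


class NestingChecker(HTMLParser):
    """Keep a stack of open tags and record end tags that do not
    match the innermost open element."""

    def __init__(self):
        super().__init__()
        self.open_tags = []   # stack of currently open element names
        self.misnested = []   # (end tag, snapshot of the stack) pairs

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
        else:
            # The end tag closes something other than the innermost
            # element; record it and leave the stack untouched.
            self.misnested.append((tag, list(self.open_tags)))


def find_misnesting(markup):
    checker = NestingChecker()
    checker.feed(markup)
    checker.close()
    return checker.misnested


if __name__ == "__main__":
    print(find_misnesting("<x><y></x></y>"))
```

For <x><y></x></y> the checker reports the </x> end tag, seen while <y> is
still the innermost open element. What the resulting DOM should look like
at that point is exactly the open question.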


> > since an HTML4 document without a <title> is invalid and thus parsing 
> > is undefined in HTML4.
> 
> Is it not defined by SGML either?  I really must get a copy of 
> Goldfarb's SGML Handbook later and check for sure.

SGML doesn't define error handling rules either as far as I recall from 
the last time I read Goldfarb. But either way, HTML4 overrides SGML in 
several places and explicitly states that handling of invalid HTML 
documents is undefined and UA-dependent. (Well, actually, it's about as 
vague about this as about everything else. But relative to how explicit it 
is about everything else, it's pretty clear about this.)

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Tuesday, 5 April 2005 20:07:51 UTC
