[whatwg] [html5] tags, elements and generated DOM

Ian Hickson wrote:
> On Tue, 5 Apr 2005, Anne van Kesteren wrote:
>> <script type="text/javascript" src="bar"></script>
>> <title>Foo</title>
>>
>>..?
> 
> If I am not mistaken:
> 
>    <html><head><script.../>
>    <title.../></head><body></body></html>

I believe you are mistaken.  A conforming SGML parser will not imply the 
body element without any content to make it do so.

>>Is there a BODY element in this document (or, is there always a body 
>>element?):
>>
>> <style type="text/css">
>>  body{ background:lime }
>> </style>
>>
>>... or this:
>>
>> <title>Bar</title>
> 
> The <body> will always be implied, though.

Not in a conforming SGML parser, though it seems to be in Mozilla, Opera 
and IE, as I checked using your DOM viewer [1].  Opera does seem to have 
a bug in standards-compliant mode, though (at least according to the DOM 
viewer script), because neither the head nor the body element appeared 
in the DOM using this markup:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
     "http://www.w3.org/TR/html4/strict.dtd">
<title>Foo</title>
<script type="text/javascript" src="bar"></script>

However, if the <body> element were to be implied regardless of content, 
then the same would have to be true of the <tbody> element.  Both are 
required elements (of <html> and <table>, respectively) and both have 
optional start- and end-tags, so the rules for both must be the same. 
Neither Mozilla nor Opera implies the missing tbody element within 
<table></table>, although IE does.  OpenSP does not imply the missing 
element in either case.

The only documentation I could find that supports this, in the short 
time I have had to look, is this paragraph from section 9.2.3 of Martin 
Bryan's SGML and HTML Explained [2], which explains how the associated 
example should be parsed:

| The start-tag can be omitted because the absence of this compulsory
| first embedded subelement could be implied by the parser from the
| content model... As soon as it sees a character other than a
| start-tag delimiter (<) it will recognize that the character should be
| preceded by [the start tag].
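
To make the rule concrete, here is a rough sketch in Python of the 
behaviour I am describing.  It is a toy model, not a real SGML parser: 
the OMISSIBLE table and the build_children function are names I made up 
for illustration, and the "content model" is cut down to the two cases 
discussed in this thread.  The omitted start-tag is only implied once a 
token arrives that can only occur inside the omitted element.

```python
# Toy model only: this is NOT a real SGML parser or a real DTD.
# OMISSIBLE is a hypothetical, heavily simplified content model. For each
# parent it names the child whose start-tag may be omitted and the tokens
# that can only occur inside that child.
OMISSIBLE = {
    "table": ("tbody", {"tr"}),
    "html": ("body", {"p", "#text"}),
}


def build_children(parent, tokens):
    """Return the child elements created under `parent` for `tokens`.

    The omissible child is implied only when a token that requires it is
    seen. With no such token, nothing is implied.
    """
    implied, triggers = OMISSIBLE[parent]
    children = []
    for token in tokens:
        if token in triggers:
            # Content that needs the omitted element: imply it once.
            if implied not in children:
                children.append(implied)
        else:
            # Any other token becomes a direct child of the parent.
            children.append(token)
    return children


if __name__ == "__main__":
    print(build_children("table", []))               # []
    print(build_children("table", ["tr"]))           # ['tbody']
    print(build_children("html", ["head"]))          # ['head']
    print(build_children("html", ["head", "#text"])) # ['head', 'body']
```

Under this model <table></table> gets no tbody, and a document with 
only head content gets no body, which matches what OpenSP does.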

> (For backwards compatibility with legacy parsers, the <head> probably won't be.)

The head element seems to be implied by Mozilla and IE.  Opera and 
OpenSP correctly don't imply the missing head element.

[1] http://www.hixie.ch/tests/adhoc/html/parsing/compat/viewer.html
[2] http://www.is-thought.co.uk/book/sgml-9.htm#Omitting
-- 
Lachlan Hunt
http://lachy.id.au/
http://GetFirefox.com/     Rediscover the Web
http://GetThunderbird.com/ Reclaim your Inbox

Received on Tuesday, 5 April 2005 09:01:07 UTC