Re: 3.1 Introduction (Draft), review of

Robert Burns wrote:
> First, I think there's a danger of going into too much detail  
> regarding optional tags. The only things I think might need to be in  
> an introductory section (maybe) are:
> 
> 1) that empty elements must have their closing tag omitted unless an  
> author uses the xml-style self-closing tag (e.g., <link />).
> 2) that empty elements must be closed when using the xml  
> serialization, i.e., either <link></link> or <link />
> 
> So to avoid this confusion and simplify things, it may make sense to  
> always recommend (or as far as this introduction goes, just gloss- 
> over the difference so that authors use) the self-closing tag for  
> empty elements.

Teaching authors about XML-style self-closing tags is also a cause of 
confusion.

It's fairly easy to find examples, many of which are likely to result in 
unexpected DOMs after parsing:

http://encarta.msn.com/encyclopedia_761579147/William_I_(of_England).html
     <div style="clear:left" /><div style="clear:left" />

http://www.rollingstone.com/reviews/movie/5948073/forrest_gump
     <div style="clear:both;" />

http://money.aol.com/savings
     <div class="module colorFive" /><div class="header" /><h3>What Will My Savings Be Worth?</h3></div>

http://www.nmrestaurants.org/
     <span id="a1"><span id="a1" /></span>

http://www.princeton.edu/Siteware/Admissions.shtml
     <a id="startContent"><span /></a>

http://www.bible.org/series.php?series_id=72
     <option value="Gen" />Genesis

http://www.challenge.nl/
     <p />

http://www.alphanet.ch/
     <p />

http://www.paramotoraustralia.com/shop/
     <place /><placename /><span lang="EN-AU">Byron</span></placename />

All of those are served as text/html, and I don't think any were 
anywhere near being well-formed XML. (I'm not sure what fraction of 
pages have errors like that - the statistics get strongly distorted by 
Microsoft Office 'HTML' documents.)
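
As a rough illustration of why such markup produces unexpected DOMs, the sketch below mimics one rule of text/html parsing: the trailing slash is ignored on elements that do not have an empty content model, so the tag simply opens an element. It uses Python's standard html.parser module; the VOID set, the SlashIgnoringOutline class and the outline() helper are hypothetical names made up for this example, and the element list is an assumption rather than something taken from the spec text. It is not a full HTML tree builder.

```python
from html.parser import HTMLParser

# Assumed list of elements with an empty content model (illustrative only).
VOID = {"area", "base", "br", "col", "embed", "hr", "img",
        "input", "link", "meta", "param"}


class SlashIgnoringOutline(HTMLParser):
    """Builds an indented outline, ignoring "/>" on non-void elements."""

    def __init__(self):
        super().__init__()
        self.stack = []    # currently open elements
        self.outline = []  # one indented entry per element or text run

    def handle_starttag(self, tag, attrs):
        self.outline.append("  " * len(self.stack) + tag)
        if tag not in VOID:
            self.stack.append(tag)

    def handle_startendtag(self, tag, attrs):
        # Treat <div /> exactly like <div>: the slash has no effect here.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        # Close the nearest matching open element, if there is one.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        if data.strip():
            self.outline.append("  " * len(self.stack) + repr(data.strip()))


def outline(html):
    parser = SlashIgnoringOutline()
    parser.feed(html)
    parser.close()
    return parser.outline


if __name__ == "__main__":
    sample = ('<div class="module colorFive" /><div class="header" />'
              '<h3>What Will My Savings Be Worth?</h3></div>')
    print("\n".join(outline(sample)))
```

Run on the money.aol.com fragment, the outline shows the second div nested inside the first and the h3 nested inside both, which is presumably not what the page's author intended.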

Cases like <script src="..." /> are particularly nasty - see the 
discussion around http://krijnhoetmer.nl/irc-logs/html-wg/20070524#l-24

Almost nobody (relative to the total population of authors) uses, or 
will use, the XML serialisation of HTML4/5, and I would expect anybody 
who does use it to understand XML self-closing tags already, without 
needing an HTML 5 introduction to explain them, particularly since 
their XML tools will notify them whenever they make a mistake.

For HTML-serialisation authors, in either case (with-slash vs 
without-slash) you would have to remember which elements have an empty 
content model if you wanted to write correct code; but the with-slash 
suggestion causes some confusion with XML and encourages people to 
erroneously use slashes for elements that have empty content but not an 
empty content model, which leads in some cases to their code not working 
like they expected it to.
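
The kind of mistake described above can be detected mechanically. The following is a minimal sketch, again using Python's standard html.parser module, that reports every tag written with a trailing slash on an element outside an assumed list of empty-content-model elements. VOID_ELEMENTS, SlashChecker and misused_slashes() are hypothetical names for this example, and the element list is an assumption.

```python
from html.parser import HTMLParser

# Assumed list of elements with an empty content model (illustrative only).
VOID_ELEMENTS = frozenset({"area", "base", "br", "col", "embed", "hr",
                           "img", "input", "link", "meta", "param"})


class SlashChecker(HTMLParser):
    """Collects names of non-void elements written as <tag ... />."""

    def __init__(self):
        super().__init__()
        self.misused = []

    def handle_startendtag(self, tag, attrs):
        # html.parser calls this only for tags that end in "/>".
        if tag not in VOID_ELEMENTS:
            self.misused.append(tag)


def misused_slashes(html):
    checker = SlashChecker()
    checker.feed(html)
    checker.close()
    return checker.misused


if __name__ == "__main__":
    page = ('<link rel="stylesheet" href="a.css" />'
            '<div style="clear:both;" />'
            '<option value="Gen" />Genesis')
    print(misused_slashes(page))
```

The link element passes because it has an empty content model; the div and option elements are the ones a checker like this would flag.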

The failure case when teaching someone that they must not use an end tag 
for certain elements is that they will use one anyway, and write "<link 
...></link>" or "<embed ...></embed>", which is harmless.

Teaching that the slash is optional, and that it is permitted on certain 
elements so that it's kind of like XML even though you actually can't 
use it everywhere like you can in XML, seems like needless complexity, 
compared to simply saying that some elements do not have end tags.

(I think a table that lists for each element whether the start and end 
tags are required/optional/forbidden, like the Elements appendix in 
HTML4, would be the most effective way to tell authors which tags are 
optional - the wording in the current HTML 5 spec is too spread out and 
precise and hard to follow, and unsuitable for authors who just want a 
quick reference guide. I'm unsure where such a table should be put, though.)

-- 
Philip Taylor
philip@zaynar.demon.co.uk

Received on Tuesday, 17 July 2007 02:46:13 UTC