Re: Void elements in HTML (Was: ZIP-based packages and URI references into them ODF proposal) from Robert J Burns on 2008-12-31 (public-html@w3.org from December 2008)

From: Robert J Burns <rob@robburns.com>
Date: Wed, 31 Dec 2008 17:54:42 -0600
To: Philip Taylor <pjt47@cam.ac.uk>
Cc: Adam Barth <w3c@adambarth.com>, Jonas Sicking <jonas@sicking.cc>, public-html@w3.org, Julian Reschke <julian.reschke@gmx.de>
Message-Id: <8B768945-B652-4507-BC1D-6300FEECEE81@robburns.com>

Hi Philip, Julian, Adam and Jonas,

What I have suggested in the past is that HTML5 adopt a consistent
document conformance requirement: every void element must use the "/"
and every non-void element must not. Under that rule <div/> would be
non-conforming even if the author intended <div></div>. <img> would
also be non-conforming, though a conformance checker could let authors
suppress that alert, since the point is to give authors a consistent
authoring approach.
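
A rough sketch of such a document-level check, in Python, might look
like the following. The void-element list is only an illustrative
subset, the function name is made up for this example, and the regex
is deliberately naive (it ignores comments, script content and ">"
inside attribute values), so treat it as an illustration of the rule
rather than a real conformance checker.

```python
import re

# Illustrative subset only; the authoritative list belongs to the spec.
VOID_ELEMENTS = {"area", "base", "br", "col", "hr",
                 "img", "input", "link", "meta", "param"}

# Captures the tag name and an optional trailing slash of a start tag.
# End tags such as </div> are not matched at all.
START_TAG = re.compile(r"<([A-Za-z][A-Za-z0-9]*)\b[^>]*?(/?)>")


def check_slash_consistency(markup):
    """Return (tag_name, message) pairs for tags that break the rule."""
    problems = []
    for match in START_TAG.finditer(markup):
        name = match.group(1).lower()
        has_slash = match.group(2) == "/"
        if name in VOID_ELEMENTS and not has_slash:
            problems.append((name, "void element without trailing slash"))
        elif name not in VOID_ELEMENTS and has_slash:
            problems.append((name, "non-void element with trailing slash"))
    return problems
```

Here both <div/> and a bare <img> are reported, while
<img src="a.png" /> and <div></div> pass.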

In terms of implementation conformance and parsing, parsing would  
still be done the same for known elements. In other words <a/> would  
be parsed the same as <a>. However, <e/> (as an unknown element) would  
be parsed as a void element while <e> would not.
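
As a sketch of that parsing decision, again in Python: the two sets
and the function name below are hypothetical and stand in for
whatever list of known elements a parser actually carries.

```python
# Hypothetical stand-ins for a parser's table of known elements.
KNOWN_VOID = {"br", "hr", "img", "input", "link", "meta"}
KNOWN_NON_VOID = {"a", "div", "p", "span", "script", "textarea"}


def parsed_as_void(name, has_trailing_slash):
    """Decide whether a start tag yields a void element."""
    name = name.lower()
    if name in KNOWN_VOID:
        return True            # known void: the slash changes nothing
    if name in KNOWN_NON_VOID:
        return False           # known non-void: <a/> parses like <a>
    # Unknown element: only here does the trailing slash decide.
    return has_trailing_slash
```

So parsed_as_void("a", True) is False, whereas for the unknown
element "e" the result simply follows the slash.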

This does introduce a somewhat confusing inconsistency with XML's use
of the "/", but that is not an inconsistency I think should concern us
much. The point is that it is important for authors to understand the
difference between a void and a non-void element, and such an approach
reinforces that understanding. It is less important to provide
empty-element minimization as XML does. In terms of authoring, I think
this would be a very consistent and easy-to-understand approach that
only requires authors to understand which elements are void (those
whose tag ends with a slash) and which are not (those whose tag does
not end with a slash, but instead has a separate closing tag).

Take care,
Rob

On Dec 31, 2008, at 2:22 PM, Philip Taylor wrote:

>
> Adam Barth wrote:
>> On Tue, Dec 30, 2008 at 4:57 PM, Philip Taylor <pjt47@cam.ac.uk> wrote:
>>> http://www.haliburtonrealestate.on.ca/ -- <li><a href="http://www.mls.ca"
>>> target="_blank" title="Multiple Listing Service" />MLS</a>
>>>
>>> http://www.ccitula.ru/ -- <a href="pages/virtv.htm"/> <img
>>> src=http://www.ruschamber.net/banner/VEru158x50.jpg border=0></a>
>>>
>>> http://takasago.shop-pro.jp/ -- <a href="?pid=1912944" /><img
>>> src="http://img05.shop-pro.jp/PA01015/854/product/1912944_th.jpg"
>>> class="border" /></a>
>>>
>>> http://www.alternativegreetingcards.com/ -- <a href="products.asp?id=57"
>>> class="submenu" />Wizard of Oz</a>
>> I'm not an HTML parsing expert, but these examples seem as easy to fix
>> up as other parsing oddities like mis-nested tags (e.g.,
>> <b>foo<i>bar</b>baz</i>).  Why can't the parser assume that <foo /> is
>> a void element until it finds a </foo> that would otherwise close the
>> tag?  This would permit forward and backward compatible parsing and
>> ease of authoring.
>
> If that rule was introduced, it should apply to all (non-void)
> elements, otherwise it would be introducing more
> self-inconsistencies in the language. But then it would cause
> problems and/or confusion in cases like:
>
>  <div class="a"><div class="b" />Is this text in the b div?</div>
>
>  <script src="..." /><script>alert('Does this get executed?')</script>
>
>  <textarea /><script>alert('Does this get executed?')</script></textarea>
>
> Also it would have to interact with other aspects of error handling,
> like:
>
>  <table><tr><td />Does this text get foster-parented, and then
>  somehow get yanked back into the table after reaching the end tag?</td>
>
>  <a href="..." />Does the first a element get closed by an implicit  
> end tag before the second a element?<a href="...">or is this nested  
> inside the first a element?</a></a>
>
> -- 
> Philip Taylor
> pjt47@cam.ac.uk
>

Received on Wednesday, 31 December 2008 23:55:24 UTC