Re: Void elements in HTML (Was: ZIP-based packages and URI references into them ODF proposal) from Robert J Burns on 2009-01-01 (public-html@w3.org from January 2009)

From: Robert J Burns <rob@robburns.com>
Date: Thu, 1 Jan 2009 10:53:26 -0600
To: Philip Taylor <pjt47@cam.ac.uk>, Adam Barth <w3c@adambarth.com>, Jonas Sicking <jonas@sicking.cc>, HTML WG <public-html@w3.org>, Julian Reschke <julian.reschke@gmx.de>
Message-Id: <DBE0F9BB-EC82-45A9-B461-FEB7723A116D@robburns.com>
A few more points to add about this suggestion of using the slash to
always indicate a void element in document-conforming HTML (e.g.,
<img /> or <e />):

1) it is completely compatible with XML (it just doesn't permit  
element minimization for self-closing non-void empty elements)
2) it is completely consistent for document conformance in that authors
always use the slash for void elements and never use the slash for
non-void elements
3) for implementation conformance, all unknown elements with a slash  
are parsed as void elements and all unknown elements without a slash  
are parsed as non-void elements
4) the existing void and non-void elements already parsed in current
UAs (prior to HTML5) are all exempt from the slash rule, which means
exceptions only apply to those existing elements and only in the
implementation conformance criteria for legacy support (not in the
document conformance criteria)
5) this rule can be followed by authors whether creating HTML in the
text/html or XML serialization, so it ensures strong consistency in
terms of document conformance (and, let's face it, we can't really claim
there's much consistency anywhere in the implementation conformance
criteria)
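
To make points 2) through 4) concrete, here is a rough Python sketch of
how the two rules might be expressed. It is only an illustration under
stated assumptions, not a proposed algorithm: the names LEGACY_VOID,
LEGACY_NON_VOID, parse_start_tag, is_void and is_conforming are all
hypothetical, the two element sets are small illustrative subsets rather
than complete lists, and the regular expression ignores edge cases such
as unquoted attribute values that end in a slash.

```python
import re

# Assumption: illustrative subsets of the elements that pre-HTML5 UAs
# already parse as void or non-void; a real parser needs the full lists.
LEGACY_VOID = {"br", "hr", "img", "input", "link", "meta"}
LEGACY_NON_VOID = {"a", "div", "p", "script", "span", "td", "textarea"}

# Simplified start-tag matcher: captures the element name and an optional
# trailing slash. Not a substitute for a real HTML tokenizer.
START_TAG = re.compile(r"<([A-Za-z][A-Za-z0-9]*)\b[^<>]*?(/?)>")


def parse_start_tag(tag):
    """Return (lowercased element name, whether the tag ends with a slash)."""
    match = START_TAG.fullmatch(tag.strip())
    if match is None:
        raise ValueError("not a start tag: %r" % tag)
    return match.group(1).lower(), match.group(2) == "/"


def is_void(name, has_slash):
    """Implementation conformance (points 3 and 4).

    Known legacy elements keep their existing behaviour regardless of the
    slash; only unknown elements are decided by the slash.
    """
    if name in LEGACY_VOID:
        return True
    if name in LEGACY_NON_VOID:
        return False
    return has_slash


def is_conforming(name, has_slash):
    """Document conformance (point 2): slash if and only if void."""
    return has_slash == is_void(name, has_slash)
```

For example, is_void(*parse_start_tag("<e/>")) treats the unknown
element e as void, while is_void(*parse_start_tag("<a/>")) still treats
a as non-void, and is_conforming("div", True) reports <div/> as
non-conforming.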

Take care,
Rob



On Dec 31, 2008, at 5:54 PM, Robert J Burns wrote:

>
> Hi Philip, Julian, Adam and Jonas,
>
> What I have suggested in the past is that HTML5 provide a consistent  
> document conformance requirement that all void elements use the "/"  
> and that all non-void elements prohibit the slash. Therefore <div/>
> would be document non-conforming even if the author intended
> <div></div>. <img> would be document non-conforming, though alerts
> from a conformance checker could be suppressed since the point of this
> is to provide authors with a consistent authoring approach.
>
> In terms of implementation conformance and parsing, parsing would  
> still be done the same for known elements. In other words <a/> would  
> be parsed the same as <a>. However, <e/> (as an unknown element)  
> would be parsed as a void element while <e> would not.
>
> This does introduce a somewhat confusing inconsistency with XML's  
> use of the /, but that is not an inconsistency I think should  
> concern us much. The point is that it is important for authors to  
> understand the difference between a void and a non-void element and  
> such an approach reinforces that understanding. It is less important  
> to provide an empty element minimization as XML does. In terms of  
> authoring, I think this would be a very consistent and easy to  
> understand approach that only requires authors to fully understand  
> which elements are void (those whose tag ends with a slash) and  
> which are not (those whose tag does not end with a slash, but  
> instead has a separate closing tag).
>
> Take care,
> Rob
>
> On Dec 31, 2008, at 2:22 PM, Philip Taylor wrote:
>
>>
>> Adam Barth wrote:
>>> On Tue, Dec 30, 2008 at 4:57 PM, Philip Taylor <pjt47@cam.ac.uk> wrote:
>>>> http://www.haliburtonrealestate.on.ca/ -- <li><a href="http://www.mls.ca"
>>>> target="_blank" title="Multiple Listing Service" />MLS</a>
>>>>
>>>> http://www.ccitula.ru/ -- <a href="pages/virtv.htm"/> <img
>>>> src=http://www.ruschamber.net/banner/VEru158x50.jpg border=0></a>
>>>>
>>>> http://takasago.shop-pro.jp/ -- <a href="?pid=1912944" /><img
>>>> src="http://img05.shop-pro.jp/PA01015/854/product/1912944_th.jpg"
>>>> class="border" /></a>
>>>>
>>>> http://www.alternativegreetingcards.com/ -- <a href="products.asp?id=57"
>>>> class="submenu" />Wizard of Oz</a>
>>> I'm not an HTML parsing expert, but these examples seem as easy to fix
>>> up as other parsing oddities like mis-nested tags (e.g.,
>>> <b>foo<i>bar</b>baz</i>).  Why can't the parser assume that <foo /> is
>>> a void element until it finds a </foo> that would otherwise close the
>>> tag?  This would permit forward and backward compatible parsing and
>>> ease of authoring.
>>
>> If that rule was introduced, it should apply to all (non-void)
>> elements, otherwise it would be introducing more
>> self-inconsistencies in the language. But then it would cause problems
>> and/or confusion in cases like:
>>
>> <div class="a"><div class="b" />Is this text in the b div?</div>
>>
>> <script src="..." /><script>alert('Does this get executed?')</script>
>>
>> <textarea /><script>alert('Does this get executed?')</script></textarea>
>>
>> Also it would have to interact with other aspects of error  
>> handling, like:
>>
>> <table><tr><td />Does this text get foster-parented, and then
>> somehow get yanked back into the table after reaching the end tag?</td>
>>
>> <a href="..." />Does the first a element get closed by an implicit  
>> end tag before the second a element?<a href="...">or is this nested  
>> inside the first a element?</a></a>
>>
>> -- 
>> Philip Taylor
>> pjt47@cam.ac.uk
>>
>
>
Received on Thursday, 1 January 2009 16:54:07 UTC