[whatwg] Allow trailing slash in always-empty HTML5 elements? from Sam Ruby on 2006-11-29 (public-whatwg-archive@w3.org from November 2006)

From: Sam Ruby <rubys@intertwingly.net>
Date: Wed, 29 Nov 2006 11:15:53 -0500
Message-ID: <456DB239.3000902@intertwingly.net>

Lachlan Hunt wrote:
> Sam Ruby wrote:
>> In HTML5, there are a number of elements with a content model of 
>> empty: area, base, br, col, command, embed, hr, img, link, meta, and 
>> param.
>>
>> If HTML5 were changed so that these elements -- and these elements 
>> alone -- permitted an optional trailing slash character, what 
>> percentage of the web would be parsed differently?  Can you cite three 
>> independent examples of existing websites where the parsing would 
>> diverge?
> 
> If it's only allowed on empty elements (now known as "singleton 
> elements" in the spec) then this isn't about changing the handling, it's 
> just about defining what is and is not conforming.

Exactly.

> I do not think it's a good idea to make the trailing slash conforming. 
> Although it is harmless, it provides no additional benefit at all and it 
> creates the false impression that the syntax actually does something.
> 
> The fact is that authors already try things like <div/>, <p/> and even 
> <a/>.  I've seen all of those examples in the wild.  See, for instance, 
> the source of the XML 1.0 spec (and many others) which claim to be XHTML 
> as text/html, littered with plenty of <a/> tags all throughout.

If these are common, and implemented interoperably, then what is the 
harm?  An example of something that is NOT implemented interoperably is 
<script src="..."/>.

In my book, a document that states that it is always a parse error to do 
something, despite abundant evidence to the contrary, is not as useful 
as one that says: here are the places where it works, and here are the 
places where it does not.
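
To make the "where it works / where it does not" split concrete, here is 
a minimal sketch, not anything taken from the spec.  It assumes Python's 
standard html.parser module, which reports XHTML-style tags through 
handle_startendtag, and it uses the list of always-empty elements given 
at the top of this thread.  The names VOID_ELEMENTS, TrailingSlashAudit 
and audit_trailing_slashes are hypothetical, made up for illustration.

```python
from html.parser import HTMLParser

# The always-empty elements listed earlier in this thread (assumption:
# this list, not any later revision of the spec, is the one that matters).
VOID_ELEMENTS = {
    "area", "base", "br", "col", "command", "embed",
    "hr", "img", "link", "meta", "param",
}


class TrailingSlashAudit(HTMLParser):
    """Collect every tag written with a trailing slash, e.g. <br/>."""

    def __init__(self):
        super().__init__()
        self.harmless = []  # trailing slash on an always-empty element
        self.suspect = []   # trailing slash on any other element

    def handle_startendtag(self, tag, attrs):
        # html.parser calls this hook only for tags that end in "/>".
        if tag in VOID_ELEMENTS:
            self.harmless.append(tag)
        else:
            self.suspect.append(tag)


def audit_trailing_slashes(markup):
    """Return (harmless, suspect) lists of tag names, in document order."""
    parser = TrailingSlashAudit()
    parser.feed(markup)
    parser.close()
    return parser.harmless, parser.suspect
```

Fed the string '<p>one<br/>two <img src="x.png" /> <a/> <div/></p>', this 
returns (['br', 'img'], ['a', 'div']).  Run over a crawl, the same two 
lists would give a rough count of how often each case occurs, though 
this parser is not a browser and its tokenizer is only an approximation 
of what browsers do.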

> I've even come across various authors either thinking that does work, or 
> (when they find out the truth) wondering why it doesn't.  It's not a 
> good idea to confuse them any more by giving the impression that it 
> works for some elements but not others.  It's better to just say it 
> doesn't work at all and forbid it in all cases.

That's a slippery slope.  At the extreme, it leads to XHTML 2.0, where 
features that are thought to be problematic are removed.  "Think of the 
children".

By contrast, in HTML5, I see a document that attempts to be considerably 
less judgemental, and considerably more resilient.  Inside the comments 
in the HTML 5 document I see statistics lovingly cited.  Example:

         <!-- As of
         2005-12, studies showed that around 0.2% of pages used the
         <image> element. -->

What percentage of pages use <img/> constructs?

>> and all this is coupled with Lachlan's observations[3] on what it 
>> would take to change the popular WordPress application to produce 
>> HTML5 compliant output.
> 
> That just illustrates a fundamental flaw in the way WordPress has been 
> built.  It is a perfect example of a CMS built by a bunch of bozos [1] 
> and cannot be used as an excuse for allowing the syntax.

Be careful when you patronize.

Is there really any excuse for allowing "<b><i>OMG!</b></i>"?  No, but 
HTML5 is willing to pinch its nose with thumb and forefinger and look 
the other way.  It literally is not a battle worth fighting.

>> As a side benefit of this change, I believe that I could modify my 
>> weblog to be simultaneously both HTML5 and XHTML5 compliant, modulo 
>> the embedded SVG content, something that would need to be discussed 
>> separately.
> 
> No you couldn't, and how would that be a benefit if you could?  XHTML 5 
> requires xmlns, HTML 5 forbids it.  HTML 5 requires <!DOCTYPE html>, 
> XHTML 5 doesn't (though it's still well-formed, so you could get away 
> with it).

The last I saw, HTML 5 is a working draft.  Did I miss a memo?

With Venus, I translate all content into a canonical well-formed XML 
format.  This lets people who author filters worry about far fewer 
random edge cases.  I've already seen a lot of inventiveness when 
people find that they can apply off-the-shelf XML tools like XPath and 
XSLT.
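
As a small illustration of what well-formedness buys a filter author, 
here is a sketch.  It is not Venus code; it assumes Python's standard 
xml.etree.ElementTree module (which supports a limited XPath subset) and 
the standard XHTML namespace, and the function name image_sources is 
hypothetical.

```python
import xml.etree.ElementTree as ET

# The XHTML namespace; elements in a well-formed XHTML document live here.
XHTML_NS = "http://www.w3.org/1999/xhtml"


def image_sources(xhtml_text):
    """Return the src attribute of every img element, in document order.

    Raises xml.etree.ElementTree.ParseError if the input is not
    well-formed XML, so the filter never sees tag soup.
    """
    root = ET.fromstring(xhtml_text)
    # ".//{namespace}img" is the ElementTree spelling of the XPath
    # query "every img descendant in the XHTML namespace".
    return [img.get("src") for img in root.iterfind(".//{%s}img" % XHTML_NS)]
```

The filter is a single query because the input is guaranteed to parse; 
no recovery rules for unclosed or misnested tags are needed at this 
layer.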

I'd gladly put a <!DOCTYPE html> in my page; the question is: would 
the WHATWG be willing to meet me halfway and allow xmlns attributes in 
a very select and carefully prescribed set of locations?

By the way, my experience is that these types of conversations always 
start off bumpy, and not merely because of the well-known limitations 
of email for conveying human emotion.  The problem is deeper than that: 
there literally is no good place to start.  The only way I know to deal 
with that is to pose, and repeat, concrete and simple questions.  And 
the one that I am posing with this thread is as follows:

     If HTML5 were changed so that these elements -- and these elements
     alone -- permitted an optional trailing slash character, what
     percentage of the web would be parsed differently?  Can you cite
     three independent examples of existing websites where the parsing
     would diverge?

> [1] http://hsivonen.iki.fi/producing-xml/

- Sam Ruby
Received on Wednesday, 29 November 2006 08:15:53 UTC