
[whatwg] Allow trailing slash in always-empty HTML5 elements?

From: Sam Ruby <rubys@intertwingly.net>
Date: Wed, 29 Nov 2006 07:45:14 -0500
Message-ID: <456D80DA.5070904@intertwingly.net>
Benjamin Hawkes-Lewis wrote:
> On Tue, 2006-11-28 at 16:20 -0500, Sam Ruby wrote: 
> 
>> I believe that I could modify my weblog to be simultaneously both
>> HTML5 and XHTML5 compliant, modulo the embedded SVG content, something
>> that would need to be discussed separately.
> 
> I think having /two/ different serializations of Web Forms 2.0/Web
> Applications 1.0 is bad enough. To try and cater to what's effectively a
> third serialization compatible with both parsing methods is to reinvent
> the "XHTML 1.0 as text/html" mess. Serializing to multiple formats from
> a single source is, I think, a better model. Especially as embedded
> content may need different treatment too.

That was not the intent of my suggestion.  I am suggesting that HTML5 
standardize on *one* format.  One that comes as close as humanly 
possible to capturing the web as it is practiced in all of its glorious 
and often quite messy detail.  Those that wish to serialize the DOM in 
other formats are certainly free to do so, but those formats aren't HTML5.

I do have an opinion on how embedded content should be handled, but I am 
trying to focus on one issue at a time.  If you would like a preview, 
take a peek at:

     http://planet.intertwingly.net/
     http://planet.intertwingly.net/top100/
     http://golem.ph.utexas.edu/~distler/planet/

Those three planets take input from a number of frankly grungy 
sources and consistently produce well-formed XML that often contains 
embedded MathML or SVG content.

You are, of course, free to explore those pages and others; but, for 
now, I would like to focus on one question:

     If HTML5 were changed so that these elements -- and these elements
     alone -- permitted an optional trailing slash character, what
     percentage of the web would be parsed differently?  Can you cite
     three independent examples of existing websites where the parsing
     would diverge?

>> Lachlan's observations [...] on what it would take to 
>> change the popular WordPress application to produce HTML5 compliant
>> output
> 
> As blogging software goes, WordPress is pretty good. But then blogging
> software is generally atrocious when it comes to markup. Trying to
> design an (X)HTML spec for a group of PHP developers who think it's
> persuasive to bang on about their dedication to "web standards" while
> serving their project's non-validating XHTML 1.1 homepage as text/html
> is doomed to failure.

I'm pretty sure that the Mozilla home page was not created with 
WordPress, and I'm absolutely sure that the Microsoft home page was not.

Conversely, if the major browser vendors have to choose between the web 
as it is commonly practiced, and a spec that doesn't reflect that 
reality, which one do you think they will choose?

I'll argue that the choices aren't as black and white as either the 
question you posed above, or even the one that I did.

No matter what the WHATWG spec says, each vendor will independently make 
a cost/benefit analysis as to how they should treat trailing slashes in 
elements like img.

But before they do, this working group certainly can anticipate that 
question.  What is the cost of accepting trailing slashes on elements 
which are always defined with a content model of empty, except when the 
slash is found in "Attribute value (unquoted) state"?  What sites would be parsed 
differently based on this change?  Are those differences in line with 
how existing browsers actually behave, or at odds with this behavior?
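
To make the question concrete, here is a minimal sketch in Python of the 
rule being asked about.  It is not the WHATWG tokenizer: the function 
names, the simplified handling of attributes (no whitespace around "="), 
and the short VOID_ELEMENTS set are all assumptions made for illustration.  
The only point it tries to show is that a slash inside an unquoted 
attribute value stays part of the value, while a slash outside any value 
at the end of an always-empty element could be ignored.

```python
# Illustrative subset of always-empty elements. This set is an assumption
# for the example, not the list from any spec draft.
VOID_ELEMENTS = {"br", "hr", "img", "input", "link", "meta"}


def parse_start_tag(tag):
    """Parse one start tag such as '<img src=a.png />'.

    Returns (name, attrs, trailing_slash). Simplified: whitespace around
    '=' is not handled.
    """
    assert tag.startswith("<") and tag.endswith(">")
    body = tag[1:-1]
    i, n = 0, len(body)

    # The tag name runs up to whitespace or a slash.
    while i < n and not body[i].isspace() and body[i] != "/":
        i += 1
    name = body[:i].lower()

    attrs = {}
    trailing_slash = False
    while i < n:
        ch = body[i]
        if ch.isspace():
            i += 1
        elif ch == "/":
            # A slash outside any attribute value counts as "trailing"
            # only if nothing but whitespace follows it.
            trailing_slash = body[i + 1:].strip() == ""
            i += 1
        else:
            # Attribute name: runs up to whitespace, '=' or '/'.
            start = i
            while i < n and not body[i].isspace() and body[i] not in "=/":
                i += 1
            attr = body[start:i].lower()
            value = ""
            if i < n and body[i] == "=":
                i += 1
                if i < n and body[i] in "\"'":
                    # Quoted value: ends at the matching quote.
                    quote = body[i]
                    end = body.find(quote, i + 1)
                    if end == -1:
                        end = n
                    value = body[i + 1:end]
                    i = end + 1
                else:
                    # Unquoted value: everything up to whitespace,
                    # including any slash, belongs to the value.
                    start = i
                    while i < n and not body[i].isspace():
                        i += 1
                    value = body[start:i]
            attrs[attr] = value
    return name, attrs, trailing_slash


def slash_is_ignorable(tag):
    """True if the tag is an always-empty element with a trailing slash."""
    name, _attrs, trailing_slash = parse_start_tag(tag)
    return name in VOID_ELEMENTS and trailing_slash
```

Under this sketch, <img src="a.png" /> is reported as having an ignorable 
trailing slash, whereas <img src=a.png/> is not: there the slash is 
consumed as part of the unquoted value "a.png/".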

- Sam Ruby
Received on Wednesday, 29 November 2006 04:45:14 UTC
