- From: Sam Ruby <rubys@intertwingly.net>
- Date: Wed, 29 Nov 2006 07:45:14 -0500
Benjamin Hawkes-Lewis wrote:
> On Tue, 2006-11-28 at 16:20 -0500, Sam Ruby wrote:
>
>> I believe that I could modify my weblog to be simultaneously both
>> HTML5 and XHTML5 compliant, modulo the embedded SVG content, something
>> that would need to be discussed separately.
>
> I think having /two/ different serializations of Web Forms 2.0/Web
> Applications 1.0 is bad enough. To try and cater to what's effectively a
> third serialization compatible with both parsing methods is to reinvent
> the "XHTML 1.0 as text/html" mess. Serializing to multiple formats from
> a single source is, I think, a better model. Especially as embedded
> content may need different treatment too.

That was not the intent of my suggestion.

I am suggesting that HTML5 standardize on *one* format. One that comes
as close as humanly possible to capturing the web as it is practiced in
all of its glorious and often quite messy detail.

Those that wish to serialize the DOM in other formats are certainly free
to do so, but those formats aren't HTML5.

I do have an opinion on how embedded content should be handled, but I am
trying to focus on one issue at a time. If you would like a preview,
take a peek at:

http://planet.intertwingly.net/
http://planet.intertwingly.net/top100/
http://golem.ph.utexas.edu/~distler/planet/

Those three planets take input from a number of frankly grungy input
sources and consistently produce well-formed XML that often contains
embedded MathML or SVG content.

You are, of course, free to explore those pages and others; but, for
now, I would like to focus on one question:

If HTML5 were changed so that these elements -- and these elements
alone -- permitted an optional trailing slash character, what percentage
of the web would be parsed differently? Can you cite three independent
examples of existing websites where the parsing would diverge?

>> Lachlan's observations [...] on what it would take to
>> change the popular WordPress application to produce HTML5 compliant
>> output
>
> As blogging software goes, WordPress is pretty good. But then blogging
> software is generally atrocious when it comes to markup. Trying to
> design an (X)HTML spec for a group of PHP developers who think it's
> persuasive to bang on about their dedication to "web standards" while
> serving their project's non-validating XHTML 1.1 homepage as text/html
> is doomed to failure.

I'm pretty sure that the Mozilla home page was not created with
WordPress, and I'm absolutely sure that the Microsoft home page was not.

Conversely, if the major browser vendors have to choose between the web
as it is commonly practiced, and a spec that doesn't reflect that
reality, which one do you think they will choose?

I'll argue that the choices aren't as black and white as either the
question you posed above, or even the one that I did. No matter what
the WHATWG spec says, each vendor will independently make a cost/benefit
analysis as to how they should treat trailing slashes in elements like
img.

But before they do, this work group certainly can anticipate that
question. What is the cost of accepting trailing slashes on elements
which are always defined with a content model of empty, except when
found in "Attribute value (unquoted) state"? What sites would be parsed
differently based on this change? Are those differences in line with
how existing browsers actually behave, or at odds with this behavior?

- Sam Ruby
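
For illustration, the rule asked about in the closing paragraph (accept a trailing slash on always-empty elements, but not when the slash falls inside an unquoted attribute value) might look roughly like the sketch below. This is a simplified sketch under stated assumptions, not the WHATWG tokenizer: `VOID_ELEMENTS` and `parse_start_tag` are hypothetical names, the element list is only an illustrative subset, and whitespace around `=` is not handled.

```python
# Illustrative subset of elements whose content model is always empty
# (an assumption for this sketch, not a list taken from the spec).
VOID_ELEMENTS = {"br", "hr", "img", "input", "link", "meta"}


def parse_start_tag(tag):
    """Parse one start tag such as '<img src="a.png" />'.

    Returns (name, attrs, slash_accepted). slash_accepted is True only
    when a trailing slash appears outside any attribute value AND the
    element is in VOID_ELEMENTS.
    """
    if not (tag.startswith("<") and tag.endswith(">")):
        raise ValueError("expected a single start tag")
    body = tag[1:-1]
    i, n = 0, len(body)

    # Tag name runs up to whitespace or a slash.
    while i < n and not body[i].isspace() and body[i] != "/":
        i += 1
    name = body[:i].lower()

    attrs = {}
    trailing_slash = False
    while i < n:
        ch = body[i]
        if ch.isspace():
            i += 1
        elif ch == "/":
            # Slash outside a value: it is "trailing" only if nothing
            # but whitespace follows it.
            if body[i + 1:].strip() == "":
                trailing_slash = True
            i += 1
        else:
            # Attribute name.
            start = i
            while i < n and not body[i].isspace() and body[i] not in "=/":
                i += 1
            attr = body[start:i].lower()
            value = ""
            if i < n and body[i] == "=":
                i += 1
                if i < n and body[i] in "\"'":
                    # Quoted value: runs to the matching quote.
                    quote = body[i]
                    end = body.find(quote, i + 1)
                    if end == -1:
                        end = n
                    value = body[i + 1:end]
                    i = end + 1
                else:
                    # Unquoted value: runs to whitespace, so a slash
                    # here is part of the value, not a trailing slash.
                    start = i
                    while i < n and not body[i].isspace():
                        i += 1
                    value = body[start:i]
            attrs[attr] = value

    return name, attrs, trailing_slash and name in VOID_ELEMENTS
```

Under these assumptions, `<img src="a.png" />` and `<br/>` have their slash accepted, `<img src=a.png/>` keeps the slash as part of the `src` value, and `<div />` is left to whatever the non-empty-element rules say. Running such a check over a crawl of real pages is one way the "what sites would be parsed differently" question could be measured.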
Received on Wednesday, 29 November 2006 04:45:14 UTC