- From: Ian Hickson <ian@hixie.ch>
- Date: Fri, 23 May 2008 10:05:00 +0000 (UTC)
- To: HTML WG <public-html@w3.org>
On Thu, 8 May 2008, David Orchard wrote:
>
> The HTML5 specification does not have a mechanism to allow decentralized
> parties to create their own languages, typically XML languages, and
> exchange them in HTML5 text/html serializations.

Indeed. This is by design.

> This would allow languages such as SVG, MathML, FBML and a host of
> others to be included. At one point, an editors version of the HTML5
> specification contained a subset and reformulation of SVG and MathML.
> Tim Berners-Lee described this incorporation of SVG and MathML without
> namespaces as horrific and the issue raiser completely concurs with the
> him.

The assumption that making HTML into a generic syntax is desirable, or
that having generic syntaxes available for web authors to arbitrarily
extend the Web platform with custom vocabularies is desirable, is not one
that I agree with.

The Web platform is, frankly, too important to let people extend it
without inviting the entire Web community to take part in the extension
process. Allowing any vendor to extend the platform is how we end up with
<blink>, <marquee>, or <layer>.

In practice there are very few vocabularies introduced to the platform
over time, and the cost of adding new syntax each time has been utterly
eclipsed by the cost of adding the functionality. For example, the total
time spent on adding MathML to text/html was a month at most, compared to
many years for designing MathML itself.

> This issue limits the ability of non-HTML5 working groups to define
> languages as the languages must be "brought into" the HTML5 language.

Right, that's the idea.

> In the end, the problem could result in the text/html serialization
> rules becoming the standard serialization rules for XML languages,
> replacing XML itself. This could occur if every decentralized language
> has a choice between the XML serialization, the text/html serialization
> or both. In many cases, the language may choose the text/html
> serialization.

The HTML serialisation is not a generic syntax. It's a very
vocabulary-specific syntax that has evolved organically through the
involvement of multiple vendors and a lot of seemingly random chance. It's
not a generic syntax like XML, and suggesting that XML could somehow be
replaced by HTML is like saying that JSON could somehow be replaced by
Python.

In addition to all the above, there is also a technical problem with the
idea of adding a generic syntax to HTML. I'm honestly not sure it's
possible. The Web is a unique ecosystem with adoption characteristics that
tend to make this kind of thing hard to deploy. For example, scenarios
like the following are common:

We start our story with author A, browser B, and feature F.

The spec introduces new feature F, which relies on there not being any
content using the syntax of feature F already on the Web. (That already is
hard to arrange, but let's assume for the purposes of this discussion that
we could find some syntax that nobody had yet used.)

Browser B implements F.

Author A uses feature F on his site, a demo site for cutting edge
features, testing with browser B. He also uses some other Cool things.
Call the Cool things C.

Now in our story we introduce another Web browser W and another developer
D.

Developer D looks at author A's site using browser W, and likes the cool
things C that author A did. Cool things C work fine in Web browser W,
although Feature F doesn't, and Web browser W ignores Feature F
altogether. Developer D has no idea that Feature F exists, nor what it
does, nor does he care. He does, however, like the Cool things C. He
copies the code of Author A's site into his site.

Developer D happens to run a big site, but he's not very good. He copies
Feature F along with Cool things C, and mangles them a bit in the copying
process as he adjusts Cool things C to work for his site.

Developer D tests with Web browser W and all is great.

Unbeknownst to Developer D, Browser B renders his site terribly, because
Feature F interferes with how Developer D intends his site to be
processed.

The implementors of Browser B end up forced to change their handling of
Feature F, possibly removing it altogether. The spec has failed.

If you think this is farfetched, consider the random, incomplete, and
ill-formed SVG and MathML fragments that already exist in text/html markup
today, before Author A even has any reason to deploy SVG and MathML (aka
Feature F) on his site.

With the MathML stuff in HTML5, the spec has been very carefully designed
to have simple and effective "bail out" behaviour in case the scenario
above happens. We can do that because MathML is a specific vocabulary that
we can plan for. We don't need to be especially generic. We don't have to
handle any random markup, only MathML and HTML mixed in specific ways.

I'm not convinced that it is possible to design a generic syntax that is
resilient in the face of the above developer behaviour. Even if I thought
that such generic syntax was desirable, we would need a very concrete
proposal before even considering this.

As this is the editors' response to Issue 41, I have marked the issue
closed, as recommended by the chairs. I presume this isn't going to
satisfy you, but that you don't have anything further to say that hasn't
already been said (after all, this discussion has been had to death over
the past few years). I believe your next recourse if you want to override
my proposal (rejecting the issue) is to ask the chairs to consider whether
to bring this to a working group vote, but I could be wrong, I'm not sure.

(If you _do_ have new information that hasn't previously been brought
forward on this issue, feel free to reopen the issue and mail this further
information to the list.)

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Friday, 23 May 2008 10:05:41 UTC