- From: Henri Sivonen <hsivonen@iki.fi>
- Date: Fri, 11 Jan 2013 12:29:42 +0200
- To: WHATWG <whatwg@whatwg.org>
Hixie wrote in https://www.w3.org/Bugs/Public/show_bug.cgi?id=18669#c31 :

> I think it's fine for this not to work in XML, or require XML changes,
> or use an attribute like xml:component="" in XML. It's not going to
> be used in XML much anyway in practice. I've already had browser
> vendors ask me how they can just drop XML support; I don't think
> we can, at least not currently, but that's the direction things are
> going in, not the opposite.

This attitude bothers me. A lot.

I understand that supporting XML alongside HTML is mainly a burden for browser vendors and I understand that XML currently doesn't get much love from browser vendors. (Rewriting Gecko's XML load code path has been on my to-do list since July 2010 and I have even written a design document for the rewrite, but actually implementing it is always of lower priority than something else.)

Still, I think that as long as browsers support XHTML, we'd be worse off with the DOM-and-above parts of the HTML and XML implementations diverging, especially after we went through the trouble of making them converge by moving HTML nodes into the XHTML namespace.

But I think it's wrong to just consider XML in browsers, observe that XML in browsers is a burden, and then conclude that it's fine for stuff not to work in XML, to require XML changes, or to have a different representation in XML. XML has always done better on the server side than on the browser side. I think it's an error to look only at the browser side and decide not to care about XML compatibility anymore.

When designing Validator.nu, inspired by John Cowan's TagSoup, I relied on the observation that XML and valid HTML shared the data model, so it was possible to write an HTML parser that exposed an API that looked like the API exposed by XML parsers and then build the rest of the application on top of XML tooling.
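To make the idea concrete, here is a minimal sketch (not part of the original post, and nothing like the real Validator.nu parser) of feeding tag-soup HTML into an XML data model, using only the Python standard library. It places elements in the XHTML namespace the way the infoset coercion section describes; the `CoercingBuilder` class name is made up for illustration:

```python
# Sketch: coerce HTML parse events into an XML tree (ElementTree),
# putting elements in the XHTML namespace. A toy illustration of the
# "HTML parser with an XML-shaped output" approach, not a real parser:
# void elements, implied tags, etc. are not handled.
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"

class CoercingBuilder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.root = ET.Element("{%s}html" % XHTML)
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        el = ET.SubElement(self.stack[-1], "{%s}%s" % (XHTML, tag))
        for name, value in attrs:
            el.set(name, value or "")
        self.stack.append(el)

    def handle_endtag(self, tag):
        if len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        parent = self.stack[-1]
        if len(parent):
            parent[-1].tail = (parent[-1].tail or "") + data
        else:
            parent.text = (parent.text or "") + data

builder = CoercingBuilder()
builder.feed("<p class=intro>Hello <b>world")  # unclosed tags are fine
xml_bytes = ET.tostring(builder.root)          # namespace-well-formed XML
```

The point of the sketch is the last line: once the tree lives in the XML data model, everything downstream can be ordinary XML tooling.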
In the process of writing the parser, in addition to supporting the XML API I needed for Validator.nu, I also added support for a couple of other Java XML APIs to make it easy for others to drop the parser into their XML-oriented Java applications. Then I got my implementation experience documented in the spec as the infoset coercion section. I also advocated for the DOM Consistency design principle in the HTML Design Principles. (I also advocated this approach in the HTML–XML Task Force, and I believe that the feasibility of using an HTML parser to feed into an XML pipeline, in addition to making good technical sense for real software, has been useful in calming down concerns about HTML among XML-oriented people.)

Interestingly, the first ideas that were conceived unaware of these efforts to make HTML parsing feed into an XML-compatible data model, and that threatened the consistency of the approach, came from the XML side: ARIA with colons (aria:foo instead of aria-foo) and RDFa with prefix mappings relying on the namespace declarations (xmlns:foo). We were successful at getting ARIA to change so as not to break the data model unification. Since then, RDFa has downplayed the use of xmlns:foo, even though it hasn't completely eradicated it from the processing model.

Now it seems that threats to DOM Consistency and Infoset compatibility come from the HTML side. The template element radically changes the data model and how the parser interacts with the data model by introducing wormholes. However, this is only browser-side radicalness and a complete non-issue for server-side processes that don't implement browser-like functionality and only need to be able to pass templates through or to modify them as if they were normal markup. These systems don't need to extend the data model with wormholes: they can simply insert the stuff that in browsers would go into the document fragment on the other side of the wormhole as children of the template element.
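A sketch of that last point (again an illustration, not from the original post): a server-side tool that merely passes templates through, or rewrites them like any other markup, can keep a template's contents as ordinary children in the XML data model. No document fragment on the other side of a wormhole is needed:

```python
# Sketch: server-side handling of <template> with plain XML tooling.
# The template's contents are ordinary child nodes of the template
# element; generic ElementTree code can inspect and rewrite them.
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    '<template><p>Hello, <span class="name"/>!</p></template>'
    '</html>'
)

ns = {"h": "http://www.w3.org/1999/xhtml"}
template = doc.find("h:template", ns)

# Rewrite the template's contents as if they were normal markup.
for span in template.iter("{http://www.w3.org/1999/xhtml}span"):
    span.set("data-required", "true")

out = ET.tostring(doc)  # round-trips through the XML serializer
```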
The idea to stick a slash into the local name of an element in order to bind Web Components is much worse. Many people probably agree that the restrictions on what characters you can have in an XML name were a bad idea. In fact, even XML Core thought the restrictions were a bad idea, to the extent that they relaxed them for the fifth edition. But for better or worse, existing software can and does enforce the fourth-edition NCNameness of local names.

This isn't about whether the restrictions on XML Names were a good or bad idea in the first place. This isn't about whether it's okay to make changes to the HTML parsing algorithm. This isn't about whether the error handling policy of XML parsing is a bad idea and should be replaced with XML5/XML-ER. This is about how *existing* XML data model *implementations* behave.

Sure, the reason why they behave the way they do is that they try to enforce the serializability of the data model as XML 1.0 (4th ed. or earlier) + Namespaces, but that's not the key point. The key point is that NCName enforcement exists out there in software that would be useful for people working with HTML on the server side as long as HTML fits into the XML data model.

I think it would be a mistake to change HTML in such a way that it would no longer fit into the XML data model *as implemented*, and thereby limit the range of existing software that could be used outside browsers for working with HTML, just because XML in browsers is no longer in vogue.

Please, let's not make that mistake.

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
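[As an editorial illustration of the NCName point above, not part of the original message: existing XML implementations reject a slash in an element's local name outright, so markup using one falls outside the XML data model as implemented. The regex below is a simplified ASCII-only approximation of the fourth-edition NCName production, for illustration only.]

```python
# Sketch: fourth-edition NCName enforcement, as existing XML software
# applies it. The regex is an ASCII-only approximation of the NCName
# production (the real production also allows many non-ASCII letters).
import re
import xml.etree.ElementTree as ET

NCNAME_ASCII = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*\Z")

def is_ascii_ncname(name):
    return NCNAME_ASCII.match(name) is not None

ok_hyphen = is_ascii_ncname("my-widget")     # hyphens are fine in NCNames
ok_slash = is_ascii_ncname("x-foo/bar")      # a slash is not an NCName char

# And an existing XML parser refuses such an element name outright:
try:
    ET.fromstring("<x-foo/bar>content</x-foo/bar>")
    parsed = True
except ET.ParseError:
    parsed = False
```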
Received on Friday, 11 January 2013 10:30:15 UTC