Objections to the Change Proposal to generalize the mechanism used for SVG and MathML from Henri Sivonen on 2010-09-29 (www-archive@w3.org from September 2010)

From: Henri Sivonen <hsivonen@iki.fi>
Date: Wed, 29 Sep 2010 16:30:27 +0300
To: www-archive <www-archive@w3.org>
Message-Id: <67A6F0AE-6294-4763-8163-AA5463C85650@iki.fi>
(This is a response to the poll at http://www.w3.org/2002/09/wbs/40318/issue-41-objection-poll/. Technical limitations of the polling system prevent posting this response directly into the polling system.)

Objection to using MathML as a role model

In the proposal, extension elements--even ones that might become part of a future version of HTML--go into a non-HTML namespace in the DOM. This is bad for Web authors, because elements in non-HTML namespaces are harder to work with.

In particular, document.createElement("foo"); wouldn't work and authors would have to use the more verbose createElementNS variant. Having to pass a namespace URL to DOM methods is a huge annoyance. As a personal anecdote, I've spent such an unreasonable amount of time looking up and copying and pasting namespace URLs that at one point, to avoid spending more time, I used a tool hooked into the operating system's accessibility APIs that helped me write namespace URLs by providing macros for the common ones.
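To make the verbosity difference concrete, here is a minimal sketch that uses Python's standard xml.dom.minidom as a stand-in for a browser DOM (an assumption made only so the example can be run outside a browser). Note that minidom's createElement leaves the element in no namespace, whereas an HTML5-compliant browser would put it in the HTML namespace. In either case, getting an element into the MathML namespace requires passing the full URL.

```python
from xml.dom.minidom import getDOMImplementation

XHTML_NS = "http://www.w3.org/1999/xhtml"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# An empty document with an <html> root, standing in for a page's DOM.
doc = getDOMImplementation().createDocument(XHTML_NS, "html", None)

# Short form: only the local name is needed. In minidom the resulting
# element is in no namespace at all.
plain = doc.createElement("mtext")

# Namespaced form: the author must know and pass the namespace URL.
namespaced = doc.createElementNS(MATHML_NS, "mtext")

print(plain.namespaceURI)
print(namespaced.namespaceURI)
```

The second call is the one authors would be pushed towards for every extension element under the proposal.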

MathML is an existing case of a vocabulary that has been designed to be used in HTML documents (http://www.w3.org/TR/1998/REC-MathML-19980407/chapter1.html). MathML 1.0 didn't have a namespace (http://www.w3.org/TR/1998/REC-MathML-19980407/). Contemporary MathML is in a Namespace because using Namespaces in XML became such a central part of W3C policy later.

The HTML parsing algorithm assigns MathML elements to the MathML namespace for consistency with the already implemented and deployed XML side--that is, to avoid making MathML have a chameleon namespace. Downsides of chameleon namespaces have been discussed in http://lists.w3.org/Archives/Public/www-archive/2009Feb/0065.html . See also http://www.elharo.com/blog/software-development/xml/2006/10/26/chameleon-schemas-considered-harmful/ .

As an implementation-oriented performance anecdote, the HTML vocabulary used to use chameleon namespaces in Gecko prior to Firefox 3.6. HTML nodes created by the HTML parser were in no namespace while nodes with the same semantics and implementation classes but created by the XML parser were in the http://www.w3.org/1999/xhtml namespace. When Gecko became HTML5-compliant on this point and removed the chameleon nature of the HTML vocabulary by putting HTML elements always in the http://www.w3.org/1999/xhtml namespace, there was an improvement on performance benchmarks, because the complexity arising from a chameleon namespace was gone.

The side effect of the constraints of not introducing a chameleon namespace while not changing the XML side is that authors have to deal with the MathML namespace in HTML DOMs.

From the author's point of view, the different namespace accomplishes only one "benefit": thematic grouping. It doesn't accomplish any collision avoidance, because MathML has been designed to avoid collisions with HTML local names in the first place. Thematic grouping is a lousy benefit compared to the cost of not being able to create e.g. a MathML mtext element node by calling document.createElement("mtext");. After all, we don't thematically group other things in different Namespaces in the DOM. For example, we don't have "media" elements (video, audio, source) grouped thematically into a non-HTML namespace.

Furthermore, the thematic grouping as math doesn't even accomplish anything as far as stylability goes. Even the relatively contrived case of styling all math with a different font or with a different color could be accomplished by styling <math> and letting the properties inherit.

From the implementation point of view, having MathML elements in a different namespace causes the inconvenient edge case that is http://www.w3.org/Bugs/Public/show_bug.cgi?id=9887

Due to the reasons recounted above, I strongly object to the proposal to make extensions that might become parts of a future standard follow the example of MathML and introduce more namespaces into the DOMs Web authors have to deal with. MathML being in a non-HTML namespace shouldn't be viewed as a positive example. It should be viewed as a learning experience of the trouble caused by the W3C's insistence on using Namespaces in XML so that the mistake wouldn't be repeated by future additions to HTML.

Instead, we should look to the latest XBL2 draft (http://dev.w3.org/2006/xbl2/Overview.html) as a role model. The latest XBL2 draft introduces a thematically grouped set of elements to the Web platform but does so by putting the elements into the HTML namespace.


Objection to the backward-compatibility characteristics of extension=""

Since existing browsers that don't support the extension="" attribute will assign extension elements to the HTML namespace, extensions can't really safely use names that collide with HTML element names. The proposal itself even encourages local name collision avoidance. When the local names, for practical purposes, occupy the same space of names as HTML itself, putting some elements in a different Namespace in the DOM is useless.

I strongly object to introducing more namespaces into the DOMs authors have to deal with when the proposed namespace proliferation doesn't solve a real name collision problem, because adding complexity is costly and we shouldn't accept a cost without getting something useful in return.


Objection to making colliding local names possible nonetheless

Even though the proposal encourages local name collision avoidance, it permits local name collisions between HTML and extensions. This makes it harder for authors to select elements of a given type without accidentally selecting colliding local names in another namespace when developing scripts or style sheets that are supposed to be reusable. 

document.getElementsByTagName() would potentially return unwanted elements, so a script author would have to use document.getElementsByTagNameNS() with the same problem related to looking up namespace URLs recounted earlier.
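The collision can be sketched with the one overlap that already exists, <a> in HTML and SVG. The example below again uses Python's xml.dom.minidom as a stand-in for a browser DOM and parses the markup as XML (both are assumptions made for runnability; the markup itself is made up for illustration).

```python
from xml.dom.minidom import parseString

XHTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"

# A document where the local name "a" appears in two namespaces.
markup = (
    '<html xmlns="' + XHTML_NS + '"><body>'
    '<a href="#top">HTML link</a>'
    '<svg xmlns="' + SVG_NS + '"><a>SVG link</a></svg>'
    '</body></html>'
)
doc = parseString(markup)

# Selecting by tag name alone picks up both the HTML and the SVG element.
all_a = doc.getElementsByTagName("a")

# Restricting to HTML links requires supplying the namespace URL.
html_a = doc.getElementsByTagNameNS(XHTML_NS, "a")

print(len(all_a), len(html_a))
```

A reusable script that only wants HTML links has to use the second form, and so has to carry the namespace URL around.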

CSS element selectors select by local name by default. This means that when working with element selectors, authors are not inconvenienced by HTML and MathML elements being in different namespaces. However, the few cases where HTML and SVG have name collisions (<a> in particular) are frequently cited as a problem. Restricting element selectors by namespace requires significantly more complicated syntax than selecting only on local name.

Furthermore, supporting namespace binding introduces significant complexity to anything that uses selectors outside the context of a full CSS style sheet. In the case of the Selectors API, the complexity was eventually removed by not allowing namespace bindings. This design decision makes sense for the current Web languages (HTML, MathML and SVG), given the relatively few overlaps between HTML and SVG. However, if the change proposal were adopted and local name collisions and namespaces proliferated, authors would have trouble using the Selectors API or alternatively a more complex version would have to be implemented.

Since local name collisions across namespaces cause problems for authors and addressing those problems would complicate implementations, I strongly object to introducing more potential for local name collisions. SVG shouldn't be viewed as a positive role model. Instead, local name collisions between HTML and SVG should be viewed as a learning experience of the trouble caused by the W3C's insistence on using Namespaces in XML so that the mistake wouldn't be repeated by future additions to HTML.


Objections to namespaced attributes

The proposal allows for namespaced attributes. The proposal isn't clear on how such attributes are processed. If parsers need to know in advance about the registered prefixes, and xmlns:prefix='' doesn't affect what namespace prefix:foo='' gets assigned into, it would be simpler to use the registry to avoid collisions between names and not bother with the prefix. Furthermore, if xmlns:prefix="ns" didn't affect prefix:foo='', prefix:foo="" would parse into a different DOM depending on the parser's awareness of the registry, which would be bad and would potentially lead software that processes such attributes to treat them as chameleons between {null, "prefix:foo"} and {"ns", "foo"}.

I strongly object to the namespace of attributes depending on the registry snapshot known to the parser, since it would likely lead to chameleon-like processing in the code that deals with the output of the parser.
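The kind of chameleon-like processing meant here can be sketched as follows. Two elements are built by hand to mimic the output of a registry-unaware and a registry-aware parser. The namespace URI, the element name "widget" and the helper read_extension_attr are all hypothetical, and Python's xml.dom.minidom stands in for whatever DOM the consuming code would really use.

```python
from xml.dom.minidom import getDOMImplementation

EXT_NS = "urn:example:ext"  # hypothetical namespace bound to "prefix"

doc = getDOMImplementation().createDocument(None, "root", None)

# What a registry-unaware parser might produce: {null, "prefix:foo"}.
unaware = doc.createElement("widget")
unaware.setAttribute("prefix:foo", "value")

# What a registry-aware parser might produce: {EXT_NS, "foo"}.
aware = doc.createElement("widget")
aware.setAttributeNS(EXT_NS, "prefix:foo", "value")


def read_extension_attr(element):
    """Hypothetical consumer that has to accept both representations."""
    if element.hasAttributeNS(EXT_NS, "foo"):
        return element.getAttributeNS(EXT_NS, "foo")
    # Fall back to the no-namespace attribute whose name contains a colon.
    return element.getAttribute("prefix:foo")


print(read_extension_attr(unaware), read_extension_attr(aware))
```

Every consumer of the parser output would need a two-branch lookup like this one, which is the chameleon pattern the objection is about.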

On the other hand, if xmlns:prefix="" is meant to affect prefix:foo='', the following concern applies: Namespaced attributes are not particularly common in XML formats close to the Web (apart from xml: which doesn't need declaring and xlink: which is now recognized as a mistake in SVG and MathML). However, namespaced attributes substantially complicate the processing of attributes, since in a tag like <foo bar:baz='' xmlns:bar=''> the later attribute affects the earlier attribute. Adding namespaced attributes looks bad in terms of complexity especially considering that merely enabling namespaces in an XML parser measurably slows down parsing (see http://hsivonen.iki.fi/cost-of-html/).
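The ordering dependency can be observed with an existing namespace-aware XML parser. The sketch below uses Python's xml.dom.minidom (backed by expat) and a hypothetical namespace URI; it shows XML behaviour only and makes no claim about how the proposal's HTML parsing would work.

```python
from xml.dom.minidom import parseString
from xml.parsers.expat import ExpatError

BAR_NS = "urn:example:bar"  # hypothetical namespace URI

# bar:baz comes first, but its namespace is only known once the later
# xmlns:bar attribute on the same tag has been seen.
doc = parseString('<foo bar:baz="1" xmlns:bar="' + BAR_NS + '"/>')
foo = doc.documentElement
value = foo.getAttributeNS(BAR_NS, "baz")
attr = foo.getAttributeNodeNS(BAR_NS, "baz")
print(value, attr.namespaceURI)

# Without the declaration the prefix cannot be resolved, and the
# namespace-aware parser rejects the tag.
try:
    parseString('<foo bar:baz="1"/>')
    unbound_rejected = False
except ExpatError:
    unbound_rejected = True
print(unbound_rejected)
```

The parser therefore cannot finalize any attribute on a tag until it has collected all of them, which is the extra processing step the paragraph above refers to.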

I strongly object to introducing features that add complexity to processing to the point of having measurable performance effects.

Finally, namespaced attributes are inconvenient for authors to work with both in terms of DOM methods and in terms of CSS selectors compared to attributes that aren't in a namespace. I strongly object to inflicting such inconvenience on authors.


Objections to the rationale

SVG and MathML are in their own Namespaces in the DOM for unfortunate historical reasons. Future "platform extensions", the most immediate of which is XBL2, can be placed in the HTML namespace and won't particularly benefit from being in a different Namespace in the DOM. Platform extensions should be coordinated between vendors in a standards WG anyway, so making it easier to bake platform extensions privately is an anti-feature. Thus, the rationale point about "platform extensions" is moot. I strongly object to treating the rationale point about "platform extensions" as a valid rationale.

The "language extensions" rationale point uses an example that is a data point against what the rationale point is trying to argue. In XML, RDFa had the opportunity of placing the attributes introduced by RDFa into a Namespace. Yet, the creators of RDFa placed the attributes in no namespace. If creators of "language extensions" act like the creators of RDFa, the availability of mechanisms to put attributes in a Namespace is useless. I strongly object to the appeal to RDFa being considered a valid rationale for introducing a mechanism for placing attributes in a namespace.

If "vendor-specific experimental extensions" such as <canvas> had been placed in a namespace, it would have been harder to adopt them into the standard as part of the HTML vocabulary. The mechanism by which the proposal proposes such experimental extensions to become part of the standard either means parts of future standards will be in surprising namespaces in the DOM or that there will be an incentive to introduce elements that have a chameleon namespace. I strongly object to citing "vendor-specific experimental extensions" as a valid rationale for placing elements in new namespaces in the DOM, because of the risk of this creating an incentive to support the elements as having a chameleon namespace.


-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
Received on Wednesday, 29 September 2010 13:31:12 UTC