Re: Automatic XML namespaces from James Graham on 2009-11-06 (public-html@w3.org from November 2009)

From: James Graham <jgraham@opera.com>
Date: Fri, 06 Nov 2009 11:33:46 +0100
To: Aryeh Gregor <Simetrical+w3c@gmail.com>
CC: Henri Sivonen <hsivonen@iki.fi>, Liam Quin <liam@w3.org>, HTML WG <public-html@w3.org>
Message-ID: <4AF3FB8A.7080408@opera.com>

Aryeh Gregor wrote:
> On Thu, Nov 5, 2009 at 2:26 PM, Henri Sivonen <hsivonen@iki.fi> wrote:
>> How would this differ from what HTML5 specifies now?
> 
> My impression is that for browsers, it would not differ.  They would
> use a hardcoded list of namespace mappings and treat unknown elements
> just as they do now.  As far as I can tell, this proposal aims to let
> XML-processing tools treat namespaces much like HTML5 does, specifying
> the automatic namespace feature of HTML5 in a more broadly usable way.

I am rather confused about how this proposal is supposed to work in the 
browser case (and whether that depends on text/html vs 
application/xhtml+xml). I see a couple of possibilities:

* The namespace file is never read by browsers. They build the rules 
into the parser, possibly in a way that allows the rules to be updated 
independently of the browser upgrade cycle.

  In this case I don't see how this proposal enables distributed 
extensibility (possibly that is not a design goal but I somewhat assume 
it is). In fact, if one tries to introduce an extension, gets some 
level of browser support and uses JavaScript to fake support in other 
browsers, the situation is somewhat worse than now, since you need to 
deal both with the case that the new elements are put into the HTML 
namespace and the case that they are put into the new namespace.

This interpretation of the proposal also means that browser and 
non-browser tools will interpret documents differently at a rather 
fundamental level (one will read namespace files, the other will not). 
This kind of context-sensitive processing seems to cause a large number 
of problems (cf. validating vs non-validating XML processors).
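
To illustrate the double handling I mean under this first option, here 
is a minimal Python sketch. The extension namespace URI, the element 
name "widget" and the helper find_extension_elements are all made up for 
the example; a page script would have to do the equivalent against the 
DOM.

```python
import xml.etree.ElementTree as ET

HTML_NS = "http://www.w3.org/1999/xhtml"
# Hypothetical extension namespace, invented for this sketch.
EXT_NS = "http://example.org/ns/widget"


def find_extension_elements(root, local_name):
    """Return elements called local_name in either candidate namespace.

    A UA without built-in support leaves the element in the HTML
    namespace; a UA with support moves it to the extension namespace.
    Code that wants to work in both has to check for both.
    """
    candidates = {
        "{%s}%s" % (HTML_NS, local_name),
        "{%s}%s" % (EXT_NS, local_name),
    }
    return [el for el in root.iter() if el.tag in candidates]


# The same author markup, as seen by a UA without and with support.
without_support = ET.fromstring(
    '<html xmlns="%s"><widget/></html>' % HTML_NS
)
with_support = ET.fromstring(
    '<html xmlns="%s"><widget xmlns="%s"/></html>' % (HTML_NS, EXT_NS)
)

for tree in (without_support, with_support):
    print([el.tag for el in find_extension_elements(tree, "widget")])
```

Every lookup that currently checks one namespace would need this kind of 
two-way check.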

* By default, authors provide no namespace file, but if one is provided 
it must be read by the browser.

Obviously this allows for new namespaces to be added on a per page basis 
without authors having to put lots of explicit namespace syntax on the 
page itself. However, as has been pointed out, it requires a synchronous 
HTTP request to fetch the namespace definition file before parsing can 
continue. This seems extremely bad from a performance point of view. If the 
namespace file is not received for whatever reason (e.g. it goes 404), 
the whole meaning of the page changes. This is a particular issue if 
users start linking to namespace files on other domains, something that 
seems likely if they are using some third-party vocabulary and using a 
namespace file provided by that third party.

If the namespace file is actually processed, many of the problems of 
namespaces in XML remain. For example, content is still fragile under 
copy-and-paste.
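
Going back to the case where the namespace file is not received, the 
following Python sketch shows how I picture the failure. It assumes that 
a namespace file boils down to a map from element names to namespace 
URIs and that unmapped names fall back to the HTML namespace; the 
proposal may well define this differently. The "chart" vocabulary and 
the function resolve_namespace are invented for the example.

```python
HTML_NS = "http://www.w3.org/1999/xhtml"

# Assumed built-in mappings, standing in for whatever a UA hardcodes.
BUILTIN_MAPPINGS = {
    "svg": "http://www.w3.org/2000/svg",
    "math": "http://www.w3.org/1998/Math/MathML",
}


def resolve_namespace(tag_name, page_mappings=None):
    """Pick a namespace for tag_name.

    page_mappings is the content of the page's namespace file, or None
    if the file could not be fetched (for example, it returned 404).
    """
    mappings = dict(BUILTIN_MAPPINGS)
    if page_mappings:
        mappings.update(page_mappings)
    # Assumption: names with no mapping end up in the HTML namespace.
    return mappings.get(tag_name, HTML_NS)


# Hypothetical third-party vocabulary hosted on another domain.
third_party = {"chart": "http://example.org/ns/chart"}

fetched = resolve_namespace("chart", third_party)
missing = resolve_namespace("chart", None)
print(fetched)
print(missing)
```

The same markup ends up in two different namespaces depending only on 
whether a request to a third-party server succeeded.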

* Something else

Maybe I missed the point and the proposal is not well-characterised by 
either of the options above?

Irrespective of the above there are some things I don't understand about 
the proposal. How does an element like svg:foreignObject work? What 
happens with multiple attributes with the same name but in different 
namespaces on the same element (xlink:href vs href)? Would authors be 
expected to use traditional namespace syntax in these cases? These 
scenarios, particularly the foreignObject one, seem likely in the type 
of mixed-namespace content one would expect to see on the web. If you 
need traditional syntax in these cases then it is unclear that the 
proposal offers a significant simplification for authors.
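
For the attribute question, this is what the xlink:href vs href case 
looks like with traditional namespace syntax, using Python's standard 
xml.etree.ElementTree. The document and file names are made up; the 
point is only that the two attributes share a local name and are told 
apart by namespace alone.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"

# Illustrative document: one element carrying both href attributes.
doc = (
    '<svg xmlns="http://www.w3.org/2000/svg" '
    'xmlns:xlink="http://www.w3.org/1999/xlink">'
    '<a href="plain.html" xlink:href="linked.html"/>'
    "</svg>"
)

link = ET.fromstring(doc)[0]

# The unprefixed attribute is in no namespace; the prefixed one is
# keyed by its namespace URI in ElementTree's {uri}local notation.
plain = link.get("href")
qualified = link.get("{%s}href" % XLINK_NS)

# Dropping the namespace part leaves two attributes with the same name.
local_names = [key.rsplit("}", 1)[-1] for key in link.attrib]

print(plain, qualified, local_names)
```

A mapping keyed on names alone has no obvious way to say which of the 
two is meant.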

Received on Friday, 6 November 2009 10:33:52 UTC