- From: James Graham <jgraham@opera.com>
- Date: Fri, 04 Sep 2009 12:52:54 +0200
- To: Mark Birbeck <mark.birbeck@webbackplane.com>
- CC: Henri Sivonen <hsivonen@iki.fi>, Anne van Kesteren <annevk@opera.com>, Manu Sporny <msporny@digitalbazaar.com>, HTML WG <public-html@w3.org>, RDFa Developers <public-rdf-in-xhtml-tf@w3.org>
Mark Birbeck wrote:
> The original objection was that different processing is required for
> different DOMs, and I think we've shown that's not the case; all that
> is required is to iterate through the list of attributes, and pull
> out those that begin "xmlns:".
It seems to me this is empirically untrue. Consider the case where one
tries to write an RDFa processor in Python using lxml and html5lib with
the lxml treebuilder. One will soon run into the following problem:
>>> from lxml import etree
>>> root = etree.fromstring("<html xmlns='http://www.w3.org/1999/xhtml' xmlns:foo='http://foo.example'></html>")
>>> root.tag
'{http://www.w3.org/1999/xhtml}html'
>>> root.attrib
{}
>>> root.nsmap
{None: 'http://www.w3.org/1999/xhtml', 'foo': 'http://foo.example'}
>>> import html5lib
>>> tree = html5lib.parse("<html xmlns='http://www.w3.org/1999/xhtml' xmlns:foo='http://foo.example'></html>", treebuilder="lxml")
>>> root = tree.getroot()
>>> root.tag
'{http://www.w3.org/1999/xhtml}html'
>>> root.attrib
{'xmlns': 'http://www.w3.org/1999/xhtml', 'xmlnsU0003Afoo': 'http://foo.example'}
>>> root.nsmap
{None: 'http://www.w3.org/1999/xhtml'}
Clearly the tree produced using XML and the tree produced using html5lib
will require different processing. Using a non-namespace-aware XML
processor would still result in problems since the tag name would be
different in the two cases.
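To make the difference concrete, here is a rough sketch of what a processor would have to do to recover the same prefix bindings from both trees. It works on plain dicts copied from the session above so that it runs without lxml or html5lib installed. The function name prefix_bindings is made up for illustration, and treating "U0003A" as the stable spelling of a coerced ":" is an assumption based only on the output shown above.

```python
def prefix_bindings(attrib, nsmap):
    """Collect prefix -> URI bindings from either tree shape (hypothetical helper)."""
    bindings = {}
    # XML-parser shape: declarations are not attributes at all; they are
    # exposed through nsmap, with None as the key for the default namespace.
    for prefix, uri in nsmap.items():
        if prefix is not None:
            bindings[prefix] = uri
    # html5lib shape: declarations are ordinary attributes. The second marker
    # assumes ":" was coerced to "U0003A", as in the session output.
    for name, value in attrib.items():
        for marker in ("xmlns:", "xmlnsU0003A"):
            if name.startswith(marker):
                bindings[name[len(marker):]] = value
    return bindings


# Values copied from the two sessions above.
xml_attrib = {}
xml_nsmap = {None: "http://www.w3.org/1999/xhtml", "foo": "http://foo.example"}
html_attrib = {
    "xmlns": "http://www.w3.org/1999/xhtml",
    "xmlnsU0003Afoo": "http://foo.example",
}
html_nsmap = {None: "http://www.w3.org/1999/xhtml"}

from_xml = prefix_bindings(xml_attrib, xml_nsmap)
from_html = prefix_bindings(html_attrib, html_nsmap)
```

Both calls end up with the single binding for "foo", but only because the function has one code path per tree shape. Iterating over the attributes alone finds nothing in the XML case.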
Obviously this is not, as stated, strictly a "DOM" consistency issue
since it uses lxml rather than DOM for its tree model. Nevertheless, it
does demonstrate why one cannot pretend that the use of XML namespaces
to establish prefix bindings is an unimportant detail that can be swept
under the carpet.
Received on Friday, 4 September 2009 10:53:58 UTC