Re: FPWD Review Request: HTML+RDFa

Mark Birbeck wrote:

> The original objection was that different processing is required for
> different DOMs, and I think we've shown that's not the case; all that
> is required is to iterate through the list of atttributes, and pull
> out those that begin "xmlns:".

It seems to me this is empirically untrue. Consider the case where one
tries to write an RDFa processor in Python using lxml and html5lib with
the lxml treebuilder. One will soon run into the following problem:

 >>> from lxml import etree
 >>> root = etree.fromstring("<html xmlns='http://www.w3.org/1999/xhtml' xmlns:foo='http://foo.example'></html>")
 >>> root.tag
'{http://www.w3.org/1999/xhtml}html'
 >>> root.attrib
{}
 >>> root.nsmap
{None: 'http://www.w3.org/1999/xhtml', 'foo': 'http://foo.example'}


 >>> import html5lib
 >>> tree = html5lib.parse("<html xmlns='http://www.w3.org/1999/xhtml' xmlns:foo='http://foo.example'></html>", treebuilder="lxml")
 >>> root = tree.getroot()
 >>> root.tag
'{http://www.w3.org/1999/xhtml}html'
 >>> root.attrib
{'xmlns': 'http://www.w3.org/1999/xhtml', 'xmlnsU0003Afoo': 'http://foo.example'}
 >>> root.nsmap
{None: 'http://www.w3.org/1999/xhtml'}

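Clearly the tree produced by the XML parser and the tree produced by
html5lib require different processing: the first exposes the "foo"
binding only through nsmap, while the second exposes it only as an
attribute with an escaped name. Using a non-namespace-aware XML
processor would still result in problems, since the tag name would
differ between the two cases.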
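
To make the difference concrete, here is a minimal sketch of what a
processor would have to do to recover the prefix bindings from either
tree. The helper name collect_prefix_bindings is hypothetical, and the
escaped marker "xmlnsU0003A" is simply copied from the html5lib output
above; I am assuming, not asserting, that other html5lib versions escape
the colon the same way. The two stand-in objects only mirror the attrib
and nsmap values shown in the sessions above, so the sketch runs without
lxml or html5lib installed.

```python
from types import SimpleNamespace

# Escaped form of "xmlns:" as it appears in the html5lib output quoted above.
# Assumption: other html5lib versions may escape the colon differently.
ESCAPED_XMLNS_PREFIX = "xmlnsU0003A"


def collect_prefix_bindings(element):
    """Return {prefix: namespace URI} for one element (hypothetical helper)."""
    bindings = {}

    # Case 1: tree built by a namespace-aware XML parser. The declarations
    # are not attributes; they are only exposed through the nsmap property.
    for prefix, uri in (getattr(element, "nsmap", None) or {}).items():
        if prefix is not None:  # None is the default namespace, not a prefix
            bindings[prefix] = uri

    # Case 2: tree built by an HTML parser. The declarations are ordinary
    # attributes whose names start with "xmlns:" or its escaped form.
    for name, value in element.attrib.items():
        for marker in ("xmlns:", ESCAPED_XMLNS_PREFIX):
            if name.startswith(marker):
                bindings[name[len(marker):]] = value
                break

    return bindings


# Stand-ins that mirror the attrib/nsmap values from the two sessions above.
xml_root = SimpleNamespace(
    attrib={},
    nsmap={None: "http://www.w3.org/1999/xhtml", "foo": "http://foo.example"},
)
html_root = SimpleNamespace(
    attrib={
        "xmlns": "http://www.w3.org/1999/xhtml",
        "xmlnsU0003Afoo": "http://foo.example",
    },
    nsmap={None: "http://www.w3.org/1999/xhtml"},
)

print(collect_prefix_bindings(xml_root))
print(collect_prefix_bindings(html_root))
```

Both calls yield the single binding for "foo", but only because the
helper carries two separate code paths. A loop that merely looks for
attribute names beginning with "xmlns:" finds nothing in either tree.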

Obviously this is not, as stated, strictly a "DOM" consistency issue 
since it uses lxml rather than DOM for its tree model. Nevertheless, it 
does demonstrate why one cannot pretend that the use of XML namespaces 
to establish prefix bindings is an unimportant detail that can be swept 
under the carpet.

Received on Friday, 4 September 2009 10:53:56 UTC