Re: RDFa and Web Directions North 2009 from Henri Sivonen on 2009-02-18 (public-rdfa@w3.org from February 2009)

From: Henri Sivonen <hsivonen@iki.fi>
Date: Wed, 18 Feb 2009 17:45:08 +0200
To: Sam Ruby <rubys@intertwingly.net>
Cc: Julian Reschke <julian.reschke@gmx.de>, Mark Birbeck <mark.birbeck@webbackplane.com>, Ben Adida <ben@adida.net>, Karl Dubost <karl@la-grange.net>, Kingsley Idehen <kidehen@openlinksw.com>, Dan Brickley <danbri@danbri.org>, Michael Bolger <michael@michaelbolger.net>, public-rdfa@w3.org, RDFa mailing list <public-rdf-in-xhtml-tf@w3.org>, Tim Berners-Lee <timbl@w3.org>, Dan Connolly <connolly@w3.org>, Ian Hickson <ian@hixie.ch>
Message-Id: <EBF2D787-F7C9-49C0-982C-E042309F5718@iki.fi>

On Feb 18, 2009, at 16:51, Sam Ruby wrote:

> Henri Sivonen wrote:
>> On Feb 18, 2009, at 16:15, Sam Ruby wrote:
>>> Henri Sivonen wrote:
>>>> While HTML5 parsing as defined today supplies namespace  
>>>> information for each element and attribute name, it doesn't  
>>>> supply xmlns:foo-based namespace mapping context for resolving  
>>>> prefixes in attribute content on the application layer.
>>>
>>> If you could put aside your understandable dislike for prefixes in  
>>> any form for a moment:
>>>
>>>   How would renaming xmlns:foo="bar" to prefix="foo=bar" (or
>>>   whatever) address the issue you describe above?
>> Case xmlns:foo="bar" gets the following treatment:
>> With XML parsing and the XOM model, there is no attribute  
>> "xmlns:foo". That is, there'd be no arguments for  
>> getAttributeValue() (on an Element object) that would return "bar".  
>> However, getNamespaceURI("foo") would return "bar".
>> With HTML parsing and the XOM model, getNamespaceURI("foo") would  
>> return null. Depending on parser configuration, there either would  
>> be nothing in the model corresponding to xmlns:foo="bar" or  
>> getAttributeValue("xmlnsU00003Afoo") would return "bar".
>> Thus, the model would be inconsistent in the HTML and XHTML cases.  
>> (Unless HTML5 parsing is changed to accommodate RDFa.)
>> Case prefix="foo=bar" would get the following treatment both for  
>> XML and HTML parsing:
>> In XOM, getNamespaceURI("foo") would return null.  
>> getAttributeValue("prefix") would return "foo=bar".
>> Hence, the model would be consistent in the HTML and XHTML cases  
>> without changes to HTML5 parsing.
>> SAX2 would work in an analogous way.
>
> XML is defined in terms of the W3C DOM.

No, XML is defined in terms of a sequence of characters. The  
relationship to DOM is implied and post facto made more explicit in  
Infoset and DOM Level 3 Core.

> HTML5 is defined in terms of the W3C DOM.

HTML5 is defined in terms of DOM5 which is an update to DOM Level 2
HTML. DOM5 extends Web DOM Core
(http://simon.html5.org/specs/web-dom-core) which is analogous to
DOM Level 2 Core.

Compared to the W3C DOM, Web DOM Core changes the return value of  
getAttribute to null when there's no attribute with the name supplied  
and rejects many of the Java/server-oriented additions of DOM Level 3.
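
A minimal sketch of the getAttribute difference, assuming the JDK's
built-in org.w3c.dom implementation as a stand-in for "the W3C DOM".
The class and method names are hypothetical and only for illustration.
The W3C-style call returns an empty string for a missing attribute,
so an application that wants the Web DOM Core style null has to ask
hasAttribute first:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical illustration class; not part of any spec or library.
final class MissingAttributeSketch {
    private static Element parseRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** W3C DOM behaviour: "" for an absent attribute and for an empty one alike. */
    static String w3cValue(String xml, String name) {
        return parseRoot(xml).getAttribute(name);
    }

    /** Web DOM Core style result: null when the attribute is absent. */
    static String nullWhenAbsent(String xml, String name) {
        Element root = parseRoot(xml);
        return root.hasAttribute(name) ? root.getAttribute(name) : null;
    }

    public static void main(String[] args) {
        String xml = "<p title=''/>";
        // Absent and empty attributes look the same through the W3C-style call.
        System.out.println(w3cValue(xml, "lang").equals(w3cValue(xml, "title")));
        // The null-returning variant tells them apart.
        System.out.println(nullWhenAbsent(xml, "lang") == null);
        System.out.println(nullWhenAbsent(xml, "title") != null);
    }
}
```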

> Applications built only on W3C DOM interfaces can be completely
> unaware of the differences between the HTML5 and XHTML5
> serializations as long as they limit themselves to the
> non-NS-suffixed APIs.  Correct?

Correct in Firefox and Safari, I think, excluding language  
declarations (lang & xml:lang). Not entirely correct in Opera.  
Probably mostly correct in Xerces DOM in Java; the cases where it  
throws are non-obvious. It doesn't seem to throw here. Not correct for  
attributes in the XLink namespace in MathML and SVG subtrees if that  
part of the spec is taken into account.

Note, however, that this adds to the cognitive load of the application
programmer: with SVG (the XLink parts), using the non-NS-suffixed APIs
leads to incorrect code, so the NS-suffixed variants would need to be
used there. Also, you'd need to use createElementNS for *element*
creation. The namespace-wise sound programming model always uses the
Level 2 (NS-suffixed) variants.
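
One way the non-NS-suffixed lookups can go wrong, sketched on the XML
side only with the JDK's built-in DOM: the non-NS call matches the
qualified name as written, so it depends on the prefix the author
happened to choose, while the NS-suffixed call does not. This is an
illustration under those assumptions, not necessarily the exact
failure meant above, and the class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical illustration class; not part of any spec or library.
final class XlinkLookupSketch {
    static final String XLINK_NS = "http://www.w3.org/1999/xlink";

    private static Element parseRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Non-NS lookup: only finds the attribute if it was written as "xlink:href". */
    static String hrefByQualifiedName(String xml) {
        return parseRoot(xml).getAttribute("xlink:href");
    }

    /** NS lookup: finds the attribute whatever prefix is bound to the XLink namespace. */
    static String hrefByNamespace(String xml) {
        return parseRoot(xml).getAttributeNS(XLINK_NS, "href");
    }

    public static void main(String[] args) {
        String usual = "<use xmlns='http://www.w3.org/2000/svg' "
                + "xmlns:xlink='" + XLINK_NS + "' xlink:href='#a'/>";
        String other = "<use xmlns='http://www.w3.org/2000/svg' "
                + "xmlns:x='" + XLINK_NS + "' x:href='#a'/>";
        // Both lookups agree when the conventional prefix is used.
        System.out.println(hrefByQualifiedName(usual) + " " + hrefByNamespace(usual));
        // With another prefix, only the NS-suffixed lookup still finds the value.
        System.out.println("[" + hrefByQualifiedName(other) + "] " + hrefByNamespace(other));
    }
}
```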

> XOM and SAX2 are alternate mechanisms for accessing XML.  I'm not  
> aware of HTML5 equivalents.  But if there were equivalents for these  
> two, it is an open question as to whether such interfaces would  
> behave the same as if they were handling XML documents, or if they
> could be optimized based on additional constraints on how attributes  
> are used.

The whole point is that there shouldn't have to be HTML5 equivalents;
non-browser applications should be able to reuse the existing
libraries designed for XML (except the XML parser itself), thereby
benefiting from the existing network of off-the-shelf XML-oriented
tools.

Validator.nu, for example, benefits from the pre-existing Jing RELAX  
NG engine this way.

Having separate HTML5 equivalents for these APIs would lead to  
different application-level code for processing HTML5 and XHTML5,  
which is exactly what the DOM Consistency design principle tries to  
avoid.
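
As a sketch of the kind of parser-independent application code this
is about, the SAX2 handler below records what an application sees for
the two cases from the quoted comparison. It covers the XML side only,
using the JDK's own SAX parser. A SAX-emitting HTML5 parser feeding the
same handler is assumed and not shown. The prefix="foo=bar" syntax is
the placeholder from the thread, and the class name is hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class; not part of any spec or library.
final class PrefixEventsSketch {
    /** Lists the namespace mappings and attributes a SAX2 application is told about. */
    static List<String> events(String xml) {
        final List<String> seen = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startPrefixMapping(String prefix, String uri) {
                seen.add("mapping " + prefix + "=" + uri);
            }

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                for (int i = 0; i < atts.getLength(); i++) {
                    seen.add("attribute " + atts.getQName(i) + "=" + atts.getValue(i));
                }
            }
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            // Stated explicitly: xmlns declarations are not reported as attributes.
            factory.setFeature("http://xml.org/sax/features/namespace-prefixes", false);
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        // The declaration surfaces as a prefix mapping, not as an attribute.
        System.out.println(events("<p xmlns:foo='bar'/>"));
        // The ordinary attribute surfaces as an attribute, with no mapping.
        System.out.println(events("<p prefix='foo=bar'/>"));
    }
}
```

With the xmlns:foo form, the mapping arrives through a namespace
event that, per the comparison quoted above, HTML parsing would not
produce. With the prefix form, the handler only depends on an ordinary
attribute.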

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/

Received on Wednesday, 18 February 2009 16:18:50 UTC