Re: RDFa and Web Directions North 2009

On Feb 18, 2009, at 16:08, Mark Birbeck wrote:

>>>> It doesn't represent XML attributes spelled "xmlns:foo" in the XML
>>>> source code as attributes in the API. Thus, if you write a XOM-based
>>>> consumer for RDFa-in-XML as currently defined, you can't just swap
>>>> the parser to an HTML5 parser and have it work.
>>>
>>> It appears to me that this could be considered to be either a bug in
>>> the HTML5 parser, or in XOM.
>>
>> Absent RDFa, it clearly isn't a bug in either. RDFa is what adds a
>> problem.
>
> Please see my other email about how this breaks currently working HTML
> documents.

Can you show me a conforming HTML 2.0, 3.2, 4.0, 4.01 or 5 (as drafted  
today) file (or even XHTML 1.0 Appendix C file!) that can't be  
usefully mapped to existing XML APIs the way the HTML 5 spec says?

>>>> currently drafted HTML5 features need the change that exposing
>>>> xmlns:foo-based RDFa would require for consistency with the exposure
>>>> of xmlns:foo in XML.
>>>
>>> So is there a precise requirement in HTML5 that mandates how a parser
>>> must expose xmlns:foo when producing SAX events, for instance?
>>
>> No. On the contrary, the parser is explicitly allowed not to expose
>> them. But obviously, that solution wouldn't work for RDFa as proposed.
>>
>> | If the XML API doesn't support attributes in no namespace
>> | that are named "xmlns", attributes whose names start with
>> | "xmlns:", or attributes in the XMLNS namespace, then the
>> | tool may drop such attributes.
>> http://www.whatwg.org/specs/web-apps/current-work/#coercing-an-html-dom-into-an-infoset
>
> That section doesn't seem relevant to our discussion. It looks to me
> like it relates to what can happen to a DOM that is produced by an
> HTML5 parser, as opposed to what can happen to a document on the way
> in to an HTML5 parser.

It is precisely relevant. The whole purpose of that section is to
explain how non-browser HTML5 parsers are allowed to deviate from
browser behavior in order to interoperate with XML tools.

> So the comment you quote seems to be saying that a toolset that is
> using an XML API that will *consume* an HTML5-generated DOM, can drop
> attributes that are ambiguous in relation to XML.

Conceptually, it is about that. In practice though, if you have an  
HTML5 parser that exposes a XOM tree, a performant implementation  
builds the XOM tree directly as if the infoset coercion rules had been  
applied onto a DOM tree without ever actually creating a DOM tree in  
memory.
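
As a rough illustration of the point above, here is a minimal Python
sketch of a tree builder that applies the quoted attribute-dropping rule
while it builds its own nodes, so no intermediate DOM is ever created.
The names `coerce_attributes` and `build_element`, the tuple layout for
attributes, and the dict used as a node are all assumptions made for this
example; they are not taken from the HTML5 spec or from XOM.

```python
XMLNS_NAMESPACE = "http://www.w3.org/2000/xmlns/"


def coerce_attributes(attributes, api_supports_xmlns_attributes=False):
    """Drop attributes the target XML API cannot represent.

    `attributes` is a list of (namespace_or_None, name, value) tuples,
    a hypothetical shape for what a tokenizer might hand over.
    """
    if api_supports_xmlns_attributes:
        return list(attributes)
    kept = []
    for namespace, name, value in attributes:
        is_plain_xmlns = namespace is None and name == "xmlns"
        has_xmlns_prefix = name.startswith("xmlns:")
        in_xmlns_namespace = namespace == XMLNS_NAMESPACE
        if is_plain_xmlns or has_xmlns_prefix or in_xmlns_namespace:
            continue  # the quoted rule says the tool may drop these
        kept.append((namespace, name, value))
    return kept


def build_element(tag, attributes):
    """Create a node directly, coercing attributes as they arrive."""
    return {
        "tag": tag,
        "attributes": coerce_attributes(attributes),
        "children": [],
    }
```

Feeding `build_element` a start tag that carries both `xmlns:foo` and an
ordinary attribute such as `about` would keep only the latter, which is
exactly the information an xmlns:foo-based RDFa consumer would be missing.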

> But I don't read that as implying that the HTML5 parser itself (or any
> of the steps bringing information *in* to it) is free to drop this
> attribute, or any others, and doing so would break
> backwards-compatibility with current browser behaviour.

If you don't read it as implying that, the spec isn't as clear as it
should be. It is meant to imply exactly that.

Dropping attributes indeed is incompatible with what browsers do.  
However, 100% compatibility with browser behavior is not possible when  
an XML API doesn't allow "xmlns:foo" to be represented as an attribute.
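
The mismatch can be seen with Python's standard library, used here only
as a stand-in: `xml.etree.ElementTree` plays the role of a
namespace-aware XML API (it is not XOM), and `html.parser` plays the role
of an HTML-side tokenizer (it is not a conforming HTML5 parser). The
helper names below are made up for this sketch.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

SOURCE = '<p xmlns:foo="http://example.org/" foo:bar="x"></p>'


def xml_attribute_names(source):
    """Attribute names as a namespace-aware XML API exposes them."""
    return sorted(ET.fromstring(source).attrib)


class _StartTagCollector(HTMLParser):
    """Collects raw attribute names from start tags."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        self.names.extend(name for name, _ in attrs)


def html_attribute_names(source):
    """Attribute names as an HTML-side tokenizer reports them."""
    collector = _StartTagCollector()
    collector.feed(source)
    collector.close()
    return sorted(collector.names)
```

On the XML side the declaration is consumed as namespace information and
never shows up as an attribute; on the HTML side "xmlns:foo" is just
another attribute name. A consumer written against the first view cannot
simply be pointed at the second.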

The crux of the matter is that without RDFa, the attributes that get  
mangled are random non-conforming cruft that the application layer  
would ignore anyway. RDFa uses the attributes that need mangling for  
something that the application layer might actually care about. That's  
why it's RDFa adding a problem where none existed before.

> So @xmlns should just be 'passed through' the HTML5 parser, and be
> ready for processing by the 'XML API' that your quote refers to:

You can't "pass through" an attribute called "xmlns:foo" to XOM, for  
example. The tree implementation throws if you try to.

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/

Received on Wednesday, 18 February 2009 15:08:15 UTC