Re: Request to publish HTML+RDFa (draft 3) as FPWD from Jonas Sicking on 2009-09-22 (public-rdf-in-xhtml-tf@w3.org from September 2009)

From: Jonas Sicking <jonas@sicking.cc>
Date: Tue, 22 Sep 2009 16:14:24 -0700
To: Mark Birbeck <mark.birbeck@webbackplane.com>
Cc: Shane McCarron <shane@aptest.com>, Henri Sivonen <hsivonen@iki.fi>, HTMLWG WG <public-html@w3.org>, RDFa mailing list <public-rdf-in-xhtml-tf@w3.org>
Message-ID: <63df84f0909221614k11fda734pbb2e033aaa154885@mail.gmail.com>

On Tue, Sep 22, 2009 at 3:58 PM, Mark Birbeck
<mark.birbeck@webbackplane.com> wrote:
> Hi Jonas,
>
>> For example, if I have a DOM and I want to map the prefix "foo",
>> which of the following algorithms should I use:
>> 1. Call Node.lookupNamespaceURI as defined by DOM Level 3 using
>> "foo" as the prefix argument.
>> 2. Walk up the parent chain looking for an element with an attribute
>> with localName "foo" and namespace "http://www.w3.org/2000/xmlns/",
>> and then use the value of that attribute.
>> 3. Walk up the parent chain looking for an element with an attribute
>> with name "xmlns:foo", and then use the value of that attribute.
>> 4. Walk up the parent chain looking for either the attribute in 2 or
>> 3, and if both are specified use some prioritization order.
>> 5. Walk up the parent chain looking for either the attribute in 2 if
>> the document was parsed as XHTML, or attribute in 3 if the document
>> was parsed as HTML.
>> 6. Do something else?
>>
>> Any of 1 to 5 (as well as possibly 6) seems equally valid to me, and
>> as far as I can tell there really is no specified answer.
>
> I disagree with other comments that have implied that you could use
> any of your 5 proposals -- you can't.
>
> The algorithm is clearly defined in the RDFa spec, and consists of:
>
>  * visit each element in the tree, depth-first;
>
>  * before doing any processing on an element, extract any prefix
>    mappings, and add them to the 'current context';
>
>  * before visiting each child element, push the 'current context' to a stack;
>
>  * on completion of a child element, pop the 'current context' back off again.
>
> This algorithm is completely independent of how the prefix mappings
> were obtained.
>
> These mappings could have been provided using the attribute @banana,
> containing syntax like "ex=http://example.org".
>
> (And there is discussion about providing some additional mechanism to
> provide these mappings.)
>
> But for now, the only mechanism available is that any attribute that
> conforms to the pattern described in [XMLNS], is interpreted as
> providing a prefix mapping.
>
> So as you can see, there is no need for namespace support on the DOM
> (option 1), although if it's available it can be used -- that's up to
> the programmer.
>
> And there is definitely no need to traverse up the tree every time you
> need to evaluate a prefix (options 2 to 5), since the RDFa parsing
> algorithm has all of the in-scope prefix mappings available in the
> 'current context'.

First of all, note that I of course don't care about the actual
implementation, but rather what algorithm is implemented. Options 1 to
5 yield different prefix mappings so it certainly matters which one is
chosen. You can implement the algorithm defined by 1 without having a
DOM Level 3 implementation available. One important difference
between 1 and the others is that 1 is affected by element prefixes,
not just by attributes whose names start with "xmlns".
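
To make the difference concrete, here is a minimal Python sketch using
the standard library's xml.dom.minidom. It is an illustration under
stated assumptions, not code from any RDFa parser or specification: the
function names are hypothetical, and lookup_option1 is only a
simplified reading of the DOM Level 3 lookup (it checks the element's
own prefix and namespaced xmlns attributes, nothing more).

```python
from xml.dom.minidom import getDOMImplementation

XMLNS_NS = "http://www.w3.org/2000/xmlns/"


def ancestors_or_self(element):
    """Yield the element, then each ancestor element, innermost first."""
    node = element
    while node is not None and node.nodeType == node.ELEMENT_NODE:
        yield node
        node = node.parentNode


def lookup_option1(element, prefix):
    """Simplified option 1 (assumption): element prefixes count too."""
    for node in ancestors_or_self(element):
        if node.prefix == prefix and node.namespaceURI:
            return node.namespaceURI
        if node.hasAttributeNS(XMLNS_NS, prefix):
            return node.getAttributeNS(XMLNS_NS, prefix)
    return None


def lookup_option2(element, prefix):
    """Option 2: match on namespace URI plus local name."""
    for node in ancestors_or_self(element):
        if node.hasAttributeNS(XMLNS_NS, prefix):
            return node.getAttributeNS(XMLNS_NS, prefix)
    return None


def lookup_option3(element, prefix):
    """Option 3: match on the literal attribute name 'xmlns:<prefix>'."""
    name = "xmlns:" + prefix
    for node in ancestors_or_self(element):
        if node.hasAttribute(name):
            return node.getAttribute(name)
    return None


doc = getDOMImplementation().createDocument(None, "root", None)
root = doc.documentElement
# An attribute with no namespace, as an HTML parser would create it.
root.setAttribute("xmlns:foo", "http://example.org/a#")
# An element whose prefix "bar" is bound only by its own namespace URI.
child = doc.createElementNS("http://example.org/b#", "bar:item")
root.appendChild(child)

lookups = (lookup_option1, lookup_option2, lookup_option3)
results = {p: [f(child, p) for f in lookups] for p in ("foo", "bar")}
```

On this DOM only the option 3 sketch resolves "foo", and only the
option 1 sketch resolves "bar", so the three lookups disagree on the
same tree.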

While Ben Adida's RDFa parser never walks up the parent chain, the
algorithm it implements is equivalent to option 3. Again, I'm not
talking about different implementations, but rather substantially
different algorithms yielding different results from the same DOM.
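
As a rough illustration of how a parser can avoid walking up the parent
chain and still behave like option 3, the following Python sketch
carries the in-scope mappings down the tree during a depth-first
traversal. It is not taken from any existing parser; the function names
are hypothetical, and matching on the attribute name prefix "xmlns:" is
the assumption that makes it correspond to option 3.

```python
from xml.dom.minidom import parseString


def prefix_mappings(element):
    """Mappings declared on this element by attributes named 'xmlns:<prefix>'."""
    found = {}
    for name, value in element.attributes.items():
        if name.startswith("xmlns:"):
            found[name[len("xmlns:"):]] = value
    return found


def walk(element, context, visit):
    """Depth-first traversal that passes the current context to each child."""
    # Copying the inherited context plays the role of the push; returning
    # from the recursive call is the pop.
    current = {**context, **prefix_mappings(element)}
    visit(element, current)
    for child in element.childNodes:
        if child.nodeType == child.ELEMENT_NODE:
            walk(child, current, visit)


doc = parseString(
    '<div xmlns:foo="http://example.org/a#">'
    '<p id="one"/>'
    '<p id="two" xmlns:foo="http://example.org/b#"><span id="three"/></p>'
    '<p id="four"/>'
    '</div>'
)

seen = {}


def record(element, context):
    # Remember what "foo" maps to at each element that carries an id.
    if element.hasAttribute("id"):
        seen[element.getAttribute("id")] = context.get("foo")


walk(doc.documentElement, {}, record)
```

Elements "two" and "three" see the overriding mapping, while "four"
sees the outer one again, which is the same answer a walk up the parent
chain looking for "xmlns:foo" would give.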

One of your steps above is:

>  * before doing any processing on an element, extract any prefix
>    mappings, and add them to the 'current context';

It needs to be defined exactly how this is done. It sounds like many
people on this thread assume the following algorithm:

1. Parse the document using the parsing algorithm defined in HTML5
2. Serialize the resulting DOM according to specification X
3. Extract the RDFa data from the resulting document according to the
XHTML+RDFa specification.

And skip directly to step 2 if you're starting with a DOM rather than
an HTML document.

This is certainly a valid specification. It'll yield interesting
results in some cases, but at least things are always defined. I just
don't think that assuming the above algorithm without specifying it
results in a high-quality HTML+RDFa specification.

I'll also point out that the tremendous amount of confusion in this
thread suggests that Namespaces in XML is far from easy to
understand. Personally, that would make me hesitant to build other
specifications on top of that model, but of course anyone else is free
to.

/ Jonas

Received on Tuesday, 22 September 2009 23:15:25 UTC