Re: FPWD Review Request: HTML+RDFa from Mark Birbeck on 2009-09-04 (public-html@w3.org from September 2009)

From: Mark Birbeck <mark.birbeck@webbackplane.com>
Date: Fri, 4 Sep 2009 11:33:09 +0100
To: Henri Sivonen <hsivonen@iki.fi>
Cc: Shane McCarron <shane@aptest.com>, Anne van Kesteren <annevk@opera.com>, Manu Sporny <msporny@digitalbazaar.com>, HTML WG <public-html@w3.org>, RDFa Developers <public-rdf-in-xhtml-tf@w3.org>
Message-ID: <640dd5060909040333h56765df1ie93ae170c8252ffb@mail.gmail.com>

Hi Henri,

On Fri, Sep 4, 2009 at 7:43 AM, Henri Sivonen<hsivonen@iki.fi> wrote:
> On Sep 3, 2009, at 18:09, Shane McCarron wrote:
>
> [snip]
>
>> Okay - I have a stupid question.  Right now, today, in HTML5, it is
>> possible for me to declare the 'mathml' and 'svg' namespaces, right?
>
> The assignment of elements to the http://www.w3.org/2000/svg and
> http://www.w3.org/1998/Math/MathML namespaces is not based on declarations.
>
> [points on HTML5+SVG+MathML snipped]

I think the issue about how SVG and MathML are incorporated is not
really relevant here; Shane says he doesn't like how it's been
done...I don't mind it. :)

Speaking personally, I've never liked the fact that the W3C has not
come up with a way of 'hiding' XML namespaces at the mark-up level,
even if they are retained at the infoset level. So whilst there may be
more discussion to have about the underlying mechanism -- and perhaps
some work done to find a generic one -- I completely agree that having
non-prefixed elements in an HTML document is a big win.

(I was on a compound document panel many years ago at a W3C tech
plenary, and argued then that hiding the namespaces was both important
for adoption, and possible. Ah, well...)

Anyway, I digress, but my point is essentially that where any of us
stands on this issue is not relevant for the discussion we're having
about RDFa, HTML5 and namespaces.


>> And if so... can't we just extend that model so that 'xmlns:ANYTHING'
>> follows the same rules?  What am I missing here?
>
>
> Not without code complexity and a backwards-compat risk.
>
> What you are missing here is that the HTML5 parsing algorithm hard-wires 7
> namespace URIs (null, "http://www.w3.org/1999/xhtml",
> "http://www.w3.org/2000/svg", "http://www.w3.org/1998/Math/MathML",
> "http://www.w3.org/XML/1998/namespace", "http://www.w3.org/1999/xlink" and
> "http://www.w3.org/2000/xmlns/") but the xmlns:ANYTHING mechanism that RDFa
> calls for uses an open-ended set of URIs and prefixes.
>
> The hard-wiring can be done with pointers or integers or enums interned even
> ahead of compile time, and the HTML5 parsing algorithm never requires the
> parser to examine the contents of an attribute name string beyond comparing
> it for equality. xmlns:ANYTHING is a different beast in terms of code
> complexity, since you go look at the contents of the string and split on
> colon.
>
> You are also missing that if browser vendors take a compatibility risk (or a
> ship date slippage risk) with SVG-in-text/html parsing, they take a risk
> that enables them to get more value out of their *existing* SVG
> implementations but there's no such upside of getting more value out of
> existing code for taking a risk for RDFa.
>
> For a longer treatment, please refer to:
> http://lists.w3.org/Archives/Public/public-html/2009Mar/0163.html

I understand what you are saying about the seven predefined namespace
mappings, and that you perceive RDFa will require the provision of a
generic way to get at all namespace mappings.

But as I've said in another thread, even if you went to the trouble of
adding such a feature, an RDFa parser isn't going to thank you for it.

An RDFa parser needs only DOM Level 1 support -- i.e., the ability to
iterate a list of attributes, and the ability to get an attribute's
value, by name.
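
As a rough illustration only (not taken from any real RDFa parser),
here is a minimal Python sketch using the standard library's
xml.dom.minidom. The function name prefix_mappings and the sample
markup are my own assumptions; the point is that nothing beyond
attribute iteration and get-by-name is used.

```python
from xml.dom.minidom import parseString

def prefix_mappings(element):
    """Collect prefix mappings declared on one element via xmlns:* attributes.

    Hypothetical helper: uses only attribute iteration, nothing
    namespace-aware.
    """
    mappings = {}
    attrs = element.attributes
    for i in range(attrs.length):
        attr = attrs.item(i)
        # Plain string inspection of the attribute name.
        if attr.name.startswith("xmlns:"):
            mappings[attr.name[len("xmlns:"):]] = attr.value
    return mappings

doc = parseString(
    '<div xmlns:dc="http://purl.org/dc/elements/1.1/" '
    'property="dc:title">Hi</div>')
div = doc.documentElement

declared = prefix_mappings(div)         # iterate the attribute list
curie = div.getAttribute("property")    # get an attribute's value by name
```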

To make this clearer, let's flip this around and ask what features a
DOM might provide, that could help an RDFa parser.

The first feature would obviously be a simple list of namespace prefix mappings.

Sure, that would be nice, but neither DOM1 nor DOM2 supports such a
feature, so it seems pointless for us to define the RDFa parsing
algorithm in terms of that. And consequently, there's no reason that
the HTML5 DOM should provide such a feature just for RDFa, if no other
DOM is going to do so.

The second feature would be to indicate which mappings are 'in scope'
(although without the first feature, I don't know how we'd get hold of
them...).

That might seem handy, but it would actually be completely redundant;
the RDFa parsing algorithm takes care of 'in-scope' mappings itself,
and takes them with it as it moves up and down the DOM, pushing and
popping as it goes. Since there are other things that need to be
tracked too, there is little an RDFa parser would gain from such
a feature.
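
To make the pushing and popping concrete, here is a small Python
sketch, again with xml.dom.minidom. It is only a sketch under my own
assumptions: own_mappings and walk are hypothetical names, the @id
values exist purely to label the output, and a real parser would
track more than prefix mappings in each stack frame.

```python
from xml.dom.minidom import parseString

def own_mappings(element):
    """Prefix mappings declared on this element alone (xmlns:* here)."""
    attrs = element.attributes
    return {attrs.item(i).name[len("xmlns:"):]: attrs.item(i).value
            for i in range(attrs.length)
            if attrs.item(i).name.startswith("xmlns:")}

def walk(element, stack, seen):
    # Push: the parent's mappings plus whatever this element declares.
    stack.append({**stack[-1], **own_mappings(element)})
    # Record which prefixes are in scope here, keyed by the element's id.
    seen[element.getAttribute("id")] = sorted(stack[-1])
    for child in element.childNodes:
        if child.nodeType == child.ELEMENT_NODE:
            walk(child, stack, seen)
    # Pop: siblings of this element must not see its declarations.
    stack.pop()
    return seen

doc = parseString(
    '<div id="a" xmlns:dc="http://purl.org/dc/elements/1.1/">'
    '<p id="b" xmlns:foaf="http://xmlns.com/foaf/0.1/"/>'
    '<p id="c"/>'
    '</div>')
seen = walk(doc.documentElement, [{}], {})
```

In this sample, 'foaf' is in scope at element "b" but not at its
sibling "c", and the DOM was never asked for in-scope namespaces.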

I should point out that the reason the algorithm was written this way
-- rather than saying "get current in-scope XML namespaces" -- was to
allow for other ways of providing prefix mappings, such as the @token
and @prefix proposals. By tracking the scope in the parser, we can
effectively use any attribute to indicate mappings.

So as I say, there really is nothing 'special' that an RDFa parser
needs from an HTML5 DOM, and I certainly don't want you to be
concerned that RDFa requires changes to the HTML5 DOM.

As I've said in the other thread, I realise that there are many things
that we could have made clearer on this subject, and I think
the key thing now is to ensure that further specifications address
this lack of clarity in the existing RDFa spec.


> On Sep 3, 2009, at 20:32, Mark Birbeck wrote:
>
>> I was trying to explain why RDFa cannot be regarded as being "layered"
>> on top of XML namespaces, because it doesn't actually require XML
>> namespaces.
>>
>> RDFa only requires a prefix mapping mechanism, and this could just as
>> easily have been @xmlns-dc, @banana:dc, @samruby="dc=..." or something
>> else entirely.
>
>
> I think the line of reasoning that you could have used something other than
> xmlns:prefix isn't convincing as a way to make xmlns:prefix OK.

Sure...but that's not what I'm getting at.

I'm simply saying that the *algorithm* to get the prefix mapping is
agnostic about the name and format of the attribute that contains the
mapping, and illustrated that with other possibilities.

It's a bit like saying that the algorithm to get the current language is:

  "retrieve a value from an attribute called 'x'"

This algorithm is simple, so the only debate would be what 'x' should be.

Now, if we had chosen '@banana' to hold the language values, people
would have rightly shouted and said "use @lang and @xml:lang" because
they are already being widely used. But the algorithm would have
remained intact, whatever attribute name we had used.

Similarly, the *algorithm* for mapping prefix values is to take the
attribute called 'x' and crack it open to reveal a token, and use its
contents as the mapped value.

Again, the debate would be about what we should use for 'x', and we
felt that for the first version of RDFa, we would have had trouble if
we didn't use the @xmlns:* syntax.


> Or are you
> implying that xmlns:prefix can be removed from RDFa and replaced with
> something else for both HTML and XHTML, because xmlns:prefix isn't an
> essential characteristic of RDFa?

That's not what I'm driving at, but it is certainly true that the RDFa
parsing algorithm was specifically written to avoid any dependencies
on xmlns-based attributes. The wording in the CURIE processing section
will give a good illustration of what I'm talking about:

  <http://www.w3.org/TR/rdfa-syntax/#s_curieprocessing>
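
Roughly, and only as a simplified Python sketch of the prefixed case
(expand_curie is a hypothetical name, and the special cases in that
section are deliberately left out): the expansion takes a set of
mappings as input and does not care where they came from.

```python
def expand_curie(curie, mappings):
    """Expand 'prefix:reference' against a dict of prefix mappings.

    Simplified sketch: returns None when there is no colon or the
    prefix is not mapped, and ignores the spec's special cases.
    """
    prefix, sep, reference = curie.partition(":")
    if not sep or prefix not in mappings:
        return None
    # The URI is the mapped value with the reference appended.
    return mappings[prefix] + reference

mappings = {"dc": "http://purl.org/dc/elements/1.1/"}
title = expand_curie("dc:title", mappings)
unknown = expand_curie("foaf:name", mappings)
```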

My preference would be to clarify and use @xmlns:* now, to maintain
consistency. And then at some point agree an *additional* mechanism --
not a replacement -- such as @prefix, @token, or even something else
altogether.

Regards,

Mark

-- 
Mark Birbeck, webBackplane

mark.birbeck@webBackplane.com

http://webBackplane.com/mark-birbeck

webBackplane is a trading name of Backplane Ltd. (company number
05972288, registered office: 2nd Floor, 69/85 Tabernacle Street,
London, EC2A 4RR)
Received on Friday, 4 September 2009 10:33:54 UTC