Re: RDFa and Web Directions North 2009 from Mark Birbeck on 2009-02-13 (public-rdfa@w3.org from February 2009)

From: Mark Birbeck <mark.birbeck@webbackplane.com>
Date: Fri, 13 Feb 2009 23:57:35 +0000
To: Sam Ruby <rubys@intertwingly.net>
Cc: Kingsley Idehen <kidehen@openlinksw.com>, Dan Brickley <danbri@danbri.org>, Michael Bolger <michael@michaelbolger.net>, public-rdfa@w3.org, RDFa mailing list <public-rdf-in-xhtml-tf@w3.org>, Tim Berners-Lee <timbl@w3.org>, Dan Connolly <connolly@w3.org>, Ian Hickson <ian@hixie.ch>, Henri Sivonen <hsivonen@iki.fi>
Message-ID: <ed77aa9f0902131557t73469180uea76bd132eb3f372@mail.gmail.com>

Hi Sam,

> Somehow we both are making true statements, yet missing each other's points.

I have to disagree again.

You seem to be implying that there is a fundamental impediment to
creating an RDFa parser using the tools available in an HTML DOM. You
base this assertion on Henri's document, but all his script shows is
that objects in an HTML DOM don't have namespace information
available.

That's no surprise.

My response is that this is irrelevant.

An RDFa parser needs to be able to 'spot' whether an attribute name
begins 'xmlns:', but for that we don't need namespace support -- it's
just string matching, no different to detecting an attribute like
@data-length [1].
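
To make that concrete, here is a minimal sketch in Python. It uses the
standard library's html.parser, which has no notion of namespaces at
all, and finds prefix mappings purely by looking at attribute names.
The names PrefixSpotter and spot_prefixes are hypothetical, made up for
this illustration; this is not taken from any existing RDFa parser.

```python
from html.parser import HTMLParser


class PrefixSpotter(HTMLParser):
    """Hypothetical helper: collects 'xmlns:*' attributes as prefix mappings."""

    def __init__(self):
        super().__init__()
        self.prefixes = {}

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; names are plain strings.
        for name, value in attrs:
            # Plain string matching; no namespace-aware API is involved.
            if name.startswith("xmlns:"):
                self.prefixes[name[len("xmlns:"):]] = value


def spot_prefixes(markup):
    """Return a dict of prefix -> URI found anywhere in the markup."""
    spotter = PrefixSpotter()
    spotter.feed(markup)
    spotter.close()
    return spotter.prefixes
```

Feeding it markup that carries both an xmlns:foaf attribute and a
data-length attribute would pick up only the former, using the same
kind of name test one would use to pick out the latter.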


> And I wrote that "HTML parsing rules differ in visible ways from XHTML.
>  Ways that affect the specific names of attributes chose[sic] in RDFa."

But the attributes in RDFa are not prefixed -- @about, @resource,
@datatype and @content are new attributes, whilst @rel, @rev, @href
and @src already exist -- so I don't see in what way the names were
'chosen' in a way that was influenced by XHTML.


> A list of the parsers alluded to above would be helpful as an existence
> proof for the above assertion.

I think you have this the wrong way round.

The parsing algorithm for RDFa refers to attributes and elements,
navigated by recursively traversing the hierarchy. It's therefore
applicable to anything that has such a hierarchical structure, and
that allows attribute values to be retrieved. Both HTML and XHTML DOMs
fit this description.
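
As a rough illustration of how little is required, the sketch below
walks a generic tree using nothing but child lists and attribute
lookup. It is a drastic simplification and NOT the RDFa processing
algorithm: it only inherits a subject from @about and emits a triple
where an element has both @property and @content. The Node class and
the walk function are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical node: a name, an attribute map and a list of children."""
    name: str
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def walk(node, subject="", triples=None):
    """Recursively traverse the hierarchy, collecting simplified triples."""
    if triples is None:
        triples = []
    # @about (if present) sets the subject for this node and its descendants.
    subject = node.attrs.get("about", subject)
    prop = node.attrs.get("property")
    content = node.attrs.get("content")
    # Simplification: only emit when both @property and @content are given.
    if prop is not None and content is not None:
        triples.append((subject, prop, content))
    for child in node.children:
        walk(child, subject, triples)
    return triples
```

Nothing in the traversal cares whether the tree came from an HTML
parser or an XML one; any structure offering children and attribute
values could be adapted to it.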

So I'd like to see a proof that shows that this simple architecture
makes it impossible to create an RDFa parser on top of an HTML DOM.
Henri has not provided a proof of anything other than that an HTML DOM
doesn't support namespaces, yet for some reason this 'non-proof' gets
circulated as fact.


>  For those who wish to replicate such, it
> would be helpful if the list of differences were enumerated and documented
> somewhere, ideally in a Standard somepace.

I don't know what this means. Differences with what?


> My statement was in response to a statement that I have seen often made that
> there are no differences that affect application programmers.  Such a
> statement is provably false.  The people making these statements aren't
> dummies; it simply is the case that the differences are subtle and
> non-obvious and tend to be glossed over by those that know better.

I have no idea what this means.

What is a 'difference that affects application programmers', that is
being glossed over? If you want to write a parser...you just have to
get on and write it. There's no 'easy' and 'difficult' route, and
certainly no 'difficult' route that is being glossed over.

Having said that, I haven't seen any of the kinds of statements that
you refer to; perhaps you can provide some links?


> Your recent statement that "I can assure you that the parsing rules were
> very explicitly written in such a way that the only thing they require to do
> their work is a hierarchy of nodes, and the ability to obtain the value of
> an attribute.", while technically true, tends to obscure more than reveal
> when it comes to these differences.

Again...what differences? I'm still confused as to what it is that
we're supposed to be different from.

Just in case what you are getting at is that there is somehow a
difference between parsing RDFa in XHTML and parsing RDFa in HTML, I
can only say again that there isn't -- there is only one parsing
algorithm in RDFa.


> Actually, I say differences.  I only have an existence proof for one
> difference at the moment.  Is there more?  Beats me.  Hence my assertion
> that a definitive list would be helpful.

As I said, the "existence proof" of which you speak (Henri's)
proves only that namespace properties do not exist in an HTML DOM,
whilst they do in an XHTML DOM.

That's very different from being an "existence proof" that there are
two (or more) algorithms for parsing RDFa in a DOM, since RDFa does
not require namespaces per se.


> I got pulled into this discussion at a point where it was an appeal to
> authority (presumably TimBL), or a questioning of authority (Hixie). Neither
> are particularly productive ways of proceeding.  And Hixie mentions that in
> theory I could play a role in overruling a decision he has made.  All I will
> say on that point is that I would strongly recommend that nobody attempt to
> pursue that path without first doing their homework.

He he.

Don't worry...I'm not trying to get you to sway Hixie, before, during
or after doing my homework.

I think even a cursory look at the positions I've taken over the years
on a variety of topics will show that I will always argue my own
corner, without hiding behind anyone else, or resorting to cunning
'Art of War' style manoeuvres. :)

The only reason I entered this debate was to clarify the single point
that you made, propagating Henri's false claim -- that since the HTML
DOM does not provide namespace information, it is therefore not
possible (or 'more difficult') to create an RDFa parser.

As to whether RDFa will find its way into HTML5, I'm pretty agnostic.
If people won't listen to the experiences of someone like Ben Adida,
when they talk of the problems that they have solved via RDFa, then I
doubt my voice will make much difference. :)


> Meanwhile, Manu has a list of use cases.  You apparently know of a list of
> existing parsers.  And if somebody could enumerate the complete list of
> differences between HTML and XHTML that such parsers need to be concerned
> about; well, that could certainly qualify as homework.

That will have to be left as an 'exercise for the reader'. I came in
to clarify that there are _no_ differences, and that Henri's 'proof'
proves nothing.

I'm done now, thanks.

Regards,

Mark

[1] <http://dev.w3.org/html5/spec/Overview.html#custom-data-attribute>

-- 
Mark Birbeck, webBackplane

mark.birbeck@webBackplane.com

http://webBackplane.com/mark-birbeck

webBackplane is a trading name of Backplane Ltd. (company number
05972288, registered office: 2nd Floor, 69/85 Tabernacle Street,
London, EC2A 4RR)
Received on Friday, 13 February 2009 23:58:16 UTC