Re: link/@rel=profile, was: HTML5+RDFa first Editors Draft published from Mark Birbeck on 2009-07-14 (public-rdf-in-xhtml-tf@w3.org from July 2009)

From: Mark Birbeck <mark.birbeck@webbackplane.com>
Date: Tue, 14 Jul 2009 10:06:59 +0100
To: Julian Reschke <julian.reschke@gmx.de>
Cc: Manu Sporny <msporny@digitalbazaar.com>, RDFa Developers <public-rdf-in-xhtml-tf@w3.org>, HTMLWG WG <public-html@w3.org>
Message-ID: <ed77aa9f0907140206k6f329511o762c978a3c66774@mail.gmail.com>

Hi Julian,

> I have one question with respect to
> <http://dev.w3.org/html5/rdfa/rdfa-module.html#document-conformance>:
>
> "There has also been strong support from the RDFa Task Force that the
> profile attribute should be retained in HTML5, as it provides an
> "out-of-band" mechanism for signaling that the document contains RDFa. The
> profile attribute may also be used extensively to provide [RDFa Profiles]
> support. Adding profile to the list of rel values and using it to signal
> that the document contains RDFa places document processing instructions into
> the RDF graph, which is problematic."
>
> I'm with you in that I'd like to see head/@profile be carried over from
> HTML4, but I have trouble understanding the last sentence:
>
> "Adding profile to the list of rel values and using it to signal that the
> document contains RDFa places document processing instructions into the RDF
> graph, which is problematic."
>
> How is that different from other link relations, such as "stylesheet",
> "nofollow", whatnot?

I'm struggling to think of a good metaphor here...many apologies.

The core issue is that if you need some information to guide the
processing of a graph, then that information shouldn't be in the graph
itself, but should be in some kind of 'meta' place.

So say we wanted to create the following triple (a graph is a
collection of triples, so this is a graph as well):

  <#me> <http://xmlns.com/foaf/0.1/name> "Julian" .

This graph means that:

  There is something with the name of '#me'...

  ...and it has a property of 'http://xmlns.com/foaf/0.1/name'...

  ...and the value of that property is 'Julian'.

Now, for convenience, we might map the URI
'http://xmlns.com/foaf/0.1/' to a token of 'foaf', so that we can use
it in places. By convention, a mapping like this would take the
following form:

  <#me> foaf:name "Julian" .

But note that what this 'means' is still the same as the first triple:

  <#me> <http://xmlns.com/foaf/0.1/name> "Julian" .

It's merely a convenience that we have 'foaf' as a token, and doesn't
change the underlying meaning.
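
To make the "convenience only" point concrete, here is a minimal Python
sketch. The `expand` helper and the `PREFIXES` table are hypothetical names
made up for illustration, not part of any RDF library, and the sketch assumes
that tokens always take the `prefix:local` form.

```python
# Hypothetical prefix table: token -> namespace URI.
PREFIXES = {"foaf": "http://xmlns.com/foaf/0.1/"}


def expand(term, prefixes):
    """Return the full URI for a 'prefix:local' term; leave other terms alone."""
    prefix, sep, local = term.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + local
    return term


short_form = ("#me", "foaf:name", "Julian")
long_form = ("#me", "http://xmlns.com/foaf/0.1/name", "Julian")

# Expanding the predicate of the short form gives exactly the long form.
expanded = (short_form[0], expand(short_form[1], PREFIXES), short_form[2])
print(expanded == long_form)
```

The token never reaches the triple itself; only the expanded URI does.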

Anyway, the question is, where do we put the information that tells us
about this mapping? Could we put it into the graph itself?

  <http://xmlns.com/foaf/0.1/> <http://example.com/mapsto> "foaf" .
  <#me> foaf:name "Julian" .

(Remember, a graph is a collection of triples, so this is now a graph
with two triples.)

The problem with putting information that guides processing into the
graph is that we create a lot of potential for confusion. We now have
values in our graph that are supposed to be
plucked out of the graph before the graph can be understood; a
processor would have to scan this graph for values of 'mapsto' before
it could understand the graph...making it inherently brittle.

In other words, to process the graph, you have to first process the graph!
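
A rough Python sketch of that circularity, assuming the made-up
'http://example.com/mapsto' property from the example above and a
hypothetical `resolve_in_band` function: the processor has to walk the whole
graph once just to find out how to read it.

```python
MAPSTO = "http://example.com/mapsto"  # invented property from the example above

graph = [
    ("http://xmlns.com/foaf/0.1/", MAPSTO, "foaf"),
    ("#me", "foaf:name", "Julian"),
]


def resolve_in_band(triples):
    """Two passes: mine the graph for mappings, then re-read the graph with them."""
    # Pass 1: scan every triple looking for processing instructions.
    prefixes = {obj: subj for subj, pred, obj in triples if pred == MAPSTO}

    # Pass 2: only now can the predicates be expanded.
    resolved = []
    for subj, pred, obj in triples:
        prefix, sep, local = pred.partition(":")
        if sep and prefix in prefixes:
            pred = prefixes[prefix] + local
        resolved.append((subj, pred, obj))
    return resolved


result = resolve_in_band(graph)
for triple in result:
    print(triple)
```

Note that the 'mapsto' triple is still sitting in the result alongside the
real data; removing it would need a special case in the processor.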

(I'm not a mathematician, but I wouldn't be surprised if Gödel would
have something to say about this on a theoretical level.)

So, by convention, we take these mappings out of the graph:

  @prefix foaf: <http://xmlns.com/foaf/0.1/> .

  <#me> foaf:name "Julian" .

Now it's clear that the information about the mapping of 'foaf' is
distinct from the graph itself; now a processor would know that it had
to load these mappings first (anything beginning with @prefix), before
using those mappings to help understand the graph.
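
As a sketch of the out-of-band approach, the following Python handles only
the tiny Turtle-like subset used in this message; it is not a real Turtle
parser. The `parse` and `term` names are hypothetical, and the sketch assumes
one statement per line, directives appearing before they are used, and
objects that are plain literals.

```python
import shlex

DOCUMENT = """\
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

<#me> foaf:name "Julian" .
"""


def term(token, prefixes):
    """Resolve '<uri>' or 'prefix:local' to a URI string."""
    if token.startswith("<"):
        return token.strip("<>")
    prefix, sep, local = token.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + local
    return token


def parse(text):
    """Return (prefixes, graph); directives go to the first, triples to the second."""
    prefixes = {}
    graph = []
    for line in text.splitlines():
        tokens = shlex.split(line)
        if not tokens:
            continue
        if tokens[0] == "@prefix":
            # Out-of-band directive: recorded in the mapping table, never in the graph.
            prefixes[tokens[1].rstrip(":")] = tokens[2].strip("<>")
            continue
        subj, pred, obj = tokens[:3]
        # Objects are kept as plain literals in this simplified sketch.
        graph.append((term(subj, prefixes), term(pred, prefixes), obj))
    return prefixes, graph


prefixes, graph = parse(DOCUMENT)
print(prefixes)
print(graph)
```

The mapping ends up in a separate table, and the graph holds the single data
triple and nothing else.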

I realise this probably seems like we're in
counting-angels-on-the-head-of-a-pin territory, but the key thing is
that when you have a collection of data -- be it a relational
database, a CSV file, or some RDF -- you invariably need something
outside of that data that tells you how to process it. And if you mix
this information in with the data, then you have the problem that in
order to process the data, you first need to process it.

I've used prefix mappings to make the case, but @profile also falls
into this category.

A value of @profile on <head> has always been defined in HTML as a way
to instruct a processor about how to interpret tokens that are used in
a document. The mechanism itself has never been defined before, but
the fact remains that the attribute is there for that express purpose.

In RDFa we'd like to use this attribute to be very precise about the
meanings of tokens used in your graphs, and it's ideal, because it's
'out-of-band', in the sense that the information about processing is
not in the resulting graph.

A value of @rel="profile" on the other hand may _seem_ to amount to
the same, but it would fall into the same category as our 'foaf'
example above, in that it would require special pre-processing of the
graph before you can know how to interpret the graph.

Of course, you could say that an RDFa processor shouldn't put triples
that use @rel="profile" into the graph, but then you no longer have a
generic parser. And you can also be sure that in a few weeks' time,
someone will be proposing another value to go into the list of 'values
to do special processing with', and before you know it, we have a
right mess on our hands.

One last thing; the proposal is to use @profile to point to another
document, which might in turn contain further RDFa, telling you how to
map the tokens. That may seem at first sight to fall foul of the same
issue I'm describing, but in fact it's ok -- the data loaded from
subsequent graphs is still out-of-band to the 'primary' graph.

So the key point is not 'don't use RDFa to describe how to interpret
RDFa' -- the key message here is that 'instructions on how to process
a graph cannot go into the graph itself'.

Regards,

Mark

-- 
Mark Birbeck, webBackplane

mark.birbeck@webBackplane.com

http://webBackplane.com/mark-birbeck

webBackplane is a trading name of Backplane Ltd. (company number
05972288, registered office: 2nd Floor, 69/85 Tabernacle Street,
London, EC2A 4RR)
Received on Tuesday, 14 July 2009 09:08:43 UTC