Re: link/@rel=profile, was: HTML5+RDFa first Editors Draft published from Mark Birbeck on 2009-07-16 (public-rdf-in-xhtml-tf@w3.org from July 2009)

From: Mark Birbeck <mark.birbeck@webbackplane.com>
Date: Thu, 16 Jul 2009 17:34:13 +0100
To: Steven Pemberton <Steven.Pemberton@cwi.nl>
Cc: Julian Reschke <julian.reschke@gmx.de>, Manu Sporny <msporny@digitalbazaar.com>, RDFa Developers <public-rdf-in-xhtml-tf@w3.org>, HTMLWG WG <public-html@w3.org>
Message-ID: <ed77aa9f0907160934r4abdff26p298ceff6819ce0cd@mail.gmail.com>

Hi Steven,

Aren't you supposed to be on holiday? :)

The issue is not whether the triple should appear in the resulting
graph, but rather that the triple needs to be 'actioned' before you
can even create the graph in the first place -- and that's a problem.

Let's say that the document referred to by the profile contains a
whole set of prefix mappings.

You would be unable to process any statements that make use of those
prefix mappings, until you had loaded the profile.

Which means that the RDFa parser would have to somehow go back and
'tidy up' the triples that didn't have prefix mappings, once the
mappings had been loaded.

Or the parser would have to run two passes, first looking for
@rel="profile", then -- after loading the profile -- using the
mappings for the other triples.
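
To make the ordering problem concrete, here is a rough sketch in Python. It is
not a real RDFa parser: the statement-list model, the PROFILES lookup that
stands in for fetching the profile, and the function names are all assumptions
made up for illustration. The profile URL and the xhtml#profile predicate are
the ones from your message.

```python
# Hypothetical, simplified model (not a real RDFa parser): a document is a
# list of (subject, predicate, object) statements in document order.
XHTML_PROFILE = "http://www.w3.org/1999/xhtml#profile"

# Stand-in for fetching the profile document; its contents are assumed.
PROFILES = {"http://example.com/profile": {"foaf": "http://xmlns.com/foaf/0.1/"}}

def expand(term, prefixes):
    """Expand a CURIE such as 'foaf:name' using the known prefix mappings."""
    prefix, sep, local = term.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError(f"no prefix mapping for {term!r}")
    return prefixes[prefix] + local

def parse_single_pass(statements):
    """Naive parser: learns mappings only when it reaches rel="profile"."""
    prefixes, graph = {}, []
    for subj, pred, obj in statements:
        if pred == "profile":
            prefixes.update(PROFILES[obj])
            graph.append((subj, XHTML_PROFILE, obj))
        else:
            graph.append((subj, expand(pred, prefixes), obj))
    return graph

def parse_two_pass(statements):
    """Pass 1 finds and loads profiles; pass 2 builds the graph."""
    prefixes = {}
    for _, pred, obj in statements:
        if pred == "profile":
            prefixes.update(PROFILES[obj])
    graph = []
    for subj, pred, obj in statements:
        if pred == "profile":
            graph.append((subj, XHTML_PROFILE, obj))
        else:
            graph.append((subj, expand(pred, prefixes), obj))
    return graph

# The statement that uses 'foaf' comes before the one naming the profile.
doc = [
    ("#me", "foaf:name", "Julian"),
    ("", "profile", "http://example.com/profile"),
]
```

With this document, parse_single_pass(doc) raises a KeyError on the first
statement because 'foaf' is not yet known, while parse_two_pass(doc) succeeds
only because it scans the whole document for the profile before building any
triples.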

In other words, it's quite fundamental that any information required
for processing a graph -- prefix mappings, default language, base
URLs, character encodings, or anything else you can think of -- must
be obtained prior to processing that graph.

(Which is not to say that this information couldn't come from some
_other_ graph -- it just cannot come from the graph it is going to be
applied to.)
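
As a sketch of what I mean (again in Python, with made-up names, and borrowing
the illustrative 'mapsto' property from my earlier message), the processing
context is an explicit input to the parser, and it may be derived from a
separate graph:

```python
# Illustrative property from the earlier message; not a real vocabulary.
MAPSTO = "http://example.com/mapsto"

def context_from_graph(profile_graph):
    """Build prefix mappings from a *different*, already-processed graph."""
    return {obj: subj for subj, pred, obj in profile_graph if pred == MAPSTO}

def process(statements, prefixes):
    """Build the primary graph; all mappings must be supplied up front."""
    graph = []
    for subj, pred, obj in statements:
        prefix, _, local = pred.partition(":")
        graph.append((subj, prefixes[prefix] + local, obj))
    return graph

# The profile graph is processed first and never merged into the primary one.
profile_graph = [("http://xmlns.com/foaf/0.1/", MAPSTO, "foaf")]
prefixes = context_from_graph(profile_graph)
primary = process([("#me", "foaf:name", "Julian")], prefixes)
```

Here 'primary' holds only the triple about '#me', with 'foaf:name' expanded to
the full URI; the 'mapsto' statement stays in the profile graph.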

Regards,

Mark

On Thu, Jul 16, 2009 at 5:02 PM, Steven Pemberton <Steven.Pemberton@cwi.nl> wrote:
> Thanks for describing this so carefully Mark. However, I have to say I
> disagree.
>
> If a document has a statement
>
>        <link rel="profile" href="http://example.com/profile"/>
>
> and an RDFa processor uses this information in the processing, it doesn't
> mean that the resulting graph can't or shouldn't contain
>
>        <> <http://www.w3.org/1999/xhtml#profile> <http://example.com/profile>
>
> By the time you have the graph, you have done the RDFa processing. The
> statement "This document uses a profile of http://example.com/profile" is
> true, whether or not the profile has been used in the generation of the
> graph. There is no contradiction, Gödelian or other.
>
> Best wishes,
>
> Steven
>
> On Tue, 14 Jul 2009 11:06:59 +0200, Mark Birbeck
> <mark.birbeck@webbackplane.com> wrote:
>
>
>> Hi Julian,
>>
>>> I have one question with respect to
>>> <http://dev.w3.org/html5/rdfa/rdfa-module.html#document-conformance>:
>>>
>>> "There has also been strong support from the RDFa Task Force that the
>>> profile attribute should be retained in HTML5, as it provides an
>>> "out-of-band" mechanism for signaling that the document contains RDFa.
>>> The profile attribute may also be used extensively to provide [RDFa
>>> Profiles] support. Adding profile to the list of rel values and using
>>> it to signal that the document contains RDFa places document
>>> processing instructions into the RDF graph, which is problematic."
>>>
>>> I'm with you in that I'd like to see head/@profile be carried over from
>>> HTML4, but I have trouble understanding the last sentence:
>>>
>>> "Adding profile to the list of rel values and using it to signal that
>>> the document contains RDFa places document processing instructions
>>> into the RDF graph, which is problematic."
>>>
>>> How is that different from other link relations, such as "stylesheet",
>>> "nofollow", whatnot?
>>
>> I'm struggling to think of a good metaphor here...many apologies.
>>
>> The core issue is that if you need some information to guide the
>> processing of a graph, then that information shouldn't be in the graph
>> itself, but should be in some kind of 'meta' place.
>>
>> So say we wanted to create the following triple (a graph is a
>> collection of triples, so this is a graph as well):
>>
>>  <#me> <http://xmlns.com/foaf/0.1/name> "Julian" .
>>
>> This graph means that:
>>
>>  There is something with the name of '#me'...
>>
>>  ...and it has a property of 'http://xmlns.com/foaf/0.1/name'...
>>
>>  ...and the value of that property is 'Julian'.
>>
>> Now, for convenience, we might map the URI
>> 'http://xmlns.com/foaf/0.1/' to a token of 'foaf', so that we can use
>> it in places. By convention, a mapping like this would take the
>> following form:
>>
>>  <#me> foaf:name "Julian" .
>>
>> But note that what this 'means' is still the same as the first triple:
>>
>>  <#me> <http://xmlns.com/foaf/0.1/name> "Julian" .
>>
>> It's merely a convenience that we have 'foaf' as a token, and doesn't
>> change the underlying meaning.
>>
>> Anyway, the question is, where do we put the information that tells us
>> about this mapping? Could we put it into the graph itself?:
>>
>>  <http://xmlns.com/foaf/0.1/> <http://example.com/mapsto> "foaf" .
>>  <#me> foaf:name "Julian" .
>>
>> (Remember, a graph is a collection of triples, so this is now a graph
>> with two triples.)
>>
>> The problem with putting information like this, that guides
>> processing, into the graph, is that we create a lot of potential for
>> confusion. We now have values in our graph that are supposed to be
>> plucked out of the graph before the graph can be understood; a
>> processor would have to scan this graph for values of 'mapsto' before
>> it could understand the graph...making it inherently brittle.
>>
>> In other words, to process the graph, you have to first process the graph!
>>
>> (I'm not a mathematician, but I wouldn't be surprised if Gödel would
>> have something to say about this on a theoretical level.)
>>
>> So, by convention, we take these mappings out of the graph:
>>
>>  @prefix foaf: <http://xmlns.com/foaf/0.1/> .
>>
>>  <#me> foaf:name "Julian" .
>>
>> Now it's clear that the information about the mapping of 'foaf' is
>> distinct from the graph itself; now a processor would know that it had
>> to load these mappings first (anything beginning with @prefix), before
>> using those mappings to help understand the graph.
>>
>> I realise this probably seems like we're in
>> counting-angels-on-the-head-of-a-pin territory, but the key thing is
>> that when you have a collection of data -- be it a relational
>> database, a CSV file, or some RDF -- you invariably need something
>> outside of that data that tells you how to process it. And if you mix
>> this information in with the data, then you have the problem that in
>> order to process the data, you first need to process it.
>>
>> I've used prefix mappings to make the case, but @profile also falls
>> into this category.
>>
>> A value of @profile on <head> has always been defined in HTML as a way
>> to instruct a processor about how to interpret tokens that are used in
>> a document. The mechanism itself has never been defined before, but
>> the fact remains that the attribute is there for that express purpose.
>>
>> In RDFa we'd like to use this attribute to be very precise about the
>> meanings of tokens used in your graphs, and it's ideal, because it's
>> 'out-of-band', in the sense that the information about processing is
>> not in the resulting graph.
>>
>> A value of @rel="profile" on the other hand may _seem_ to amount to
>> the same, but it would fall into the same category as our 'foaf'
>> example above, in that it would require special pre-processing of the
>> graph before you can know how to interpret the graph.
>>
>> Of course, you could say that an RDFa processor shouldn't put triples
>> that use @rel="profile" into the graph, but then you no longer have a
>> generic parser. And you can also be sure that in a few weeks' time,
>> someone will be proposing another value to go into the list of 'values
>> to do special processing with', and before you know it, we have a
>> right mess on our hands.
>>
>> One last thing; the proposal is to use @profile to point to another
>> document, which might in turn contain further RDFa, telling you how to
>> map the tokens. That may seem at first sight to fall foul of the same
>> issue I'm describing, but in fact it's ok -- the data loaded from
>> subsequent graphs is still out-of-band to the 'primary' graph.
>>
>> So the key point is not 'don't use RDFa to describe how to interpret
>> RDFa' -- the key message here is that 'instructions on how to process
>> a graph cannot go into the graph itself'.
>>
>> Regards,
>>
>> Mark
>>

-- 
Mark Birbeck, webBackplane

mark.birbeck@webBackplane.com

http://webBackplane.com/mark-birbeck

webBackplane is a trading name of Backplane Ltd. (company number
05972288, registered office: 2nd Floor, 69/85 Tabernacle Street,
London, EC2A 4RR)
Received on Thursday, 16 July 2009 16:34:55 UTC