Re: Skolemization and RDF Semantics

On Apr 16, 2011, at 9:58 AM, Richard Cyganiak wrote:

> On 16 Apr 2011, at 15:46, Steve Harris wrote:
>> My suspicion is that the only way forward would be some text along the lines of: [with apologies for any abuse of terminology]
>> Systems wishing to skolemise bNodes, and expose those skolem constants to external systems (e.g. in query results) SHOULD mint a "fresh" (globally unique) URI for each bNode.
>> All systems performing skolemisation SHOULD do so in a way that they can recognise the constants once skolemised, and map back to the source bNodes where possible.

It is not enough that *they* can so recognize it. It needs to be globally recognizable by any system that has access to the specifications. We need to specify how this can be done. 

>> Systems which want their skolem constants to be identifiable by other systems SHOULD use the .well-known URI prefix.
> A cautious +1 to the above from me.

Cautious +1 from me also, with the qualification noted; but I think there is a problem. Let us ignore the term 'skolemise' for a second, and call it 'minting URIs'. Do we want to say that it is in any sense illegitimate or inappropriate to mint a URI and use it to publish data? Surely not: the spec should not say anything at all that restricts how data can be published in RDF. What if that data is derived from other data which uses a blank node? To be sure, this new data is not entailed by the old data, and in a strict sense has more content, but maybe the publisher of the new URI-laced data knows something that the original data publisher did not know. Maybe they just want to give this thing a name. Again, I don't think we should even appear to be restricting the rights of RDF composers to publish data in any form they feel like doing. 

So, I think that all this careful wording SHOULD be understood to apply only under a special circumstance, where some data is modified by inserting URIs in place of bnodes, *and the new version is claimed to be essentially the same content as the old, bnode, version*. That is, when the new skolemised RDF is not re-published by a new publisher who takes responsibility for it, but is seen rather as a re-rendering or a normalization of the old data, inheriting the original publisher's authority and provenance. 

A way to put the point is to ask, who 'owns' the Skolem URIs that are used in the new (version of the) data? The original publisher can legitimately disclaim all responsibility for them if they have been introduced downstream and outside her control. Perhaps all we need to say is that anyone who replaces a bnode with a URI themselves is the owner of that URI and is responsible for accounting for its meaning; but one way to discharge this responsibility is to use a legitimate skolem URI which can be recognized as such.
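
To make "recognized as such" concrete, here is a rough sketch of what global recognizability could look like if the convention were a fixed, published path prefix under .well-known. The `genid` path segment, the function names, and the representation of triples as tuples of N-Triples-style strings are all assumptions made for illustration; nothing in this thread has settled on them.

```python
import uuid
from urllib.parse import urlparse

# Assumed convention: a fixed, published path prefix that any system can test for.
GENID_PATH = "/.well-known/genid/"


def mint_skolem_uri(authority):
    """Mint a fresh, globally unique URI owned by `authority` (a host name)."""
    return "http://" + authority + GENID_PATH + uuid.uuid4().hex


def is_skolem_uri(term):
    """True if `term` is an http(s) URI whose path starts with the agreed prefix."""
    parts = urlparse(term)
    return (
        parts.scheme in ("http", "https")
        and parts.path.startswith(GENID_PATH)
        and len(parts.path) > len(GENID_PATH)
    )


def skolemize(triples, authority):
    """Replace each bnode label ('_:x') with one fresh URI, reused consistently."""
    mapping = {}

    def convert(term):
        if term.startswith("_:"):
            if term not in mapping:
                mapping[term] = mint_skolem_uri(authority)
            return mapping[term]
        return term

    return [tuple(convert(t) for t in triple) for triple in triples], mapping


def deskolemize(triples):
    """Map recognizable Skolem URIs back to bnodes, without the minter's table."""
    labels = {}

    def convert(term):
        if is_skolem_uri(term):
            if term not in labels:
                labels[term] = "_:b" + str(len(labels))
            return labels[term]
        return term

    return [tuple(convert(t) for t in triple) for triple in triples]
```

The point of the sketch is that `deskolemize` needs only the published prefix, not the minter's private mapping, so any system with access to the specification can recover a bnode version of the data. The host name passed to `skolemize` identifies who minted, and therefore who owns, the new URIs.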

> I think that the documents should have a section that has some recommendations about when to use and when to avoid blank nodes, along the lines of [1] and [2]. Some text on skolemization could go into that section.

I strongly disagree. I don't even see why anyone should believe this kind of chat-room advice. *Why* is it bad to use bnodes? *Why* is data using them worse than data which does not? Worse in what sense, exactly? Which processes are made more difficult when blank nodes are present? And so forth. If answers to such questions are available, then let us discuss them and publish them if we all agree, but even then only in an informative note, not as part of the spec. 


> Best,
> Richard
> [1]
> [2]


Received on Saturday, 16 April 2011 22:28:10 UTC