Re: dataset semantics from Pat Hayes on 2011-12-21 (public-rdf-wg@w3.org from December 2011)

From: Pat Hayes <phayes@ihmc.us>
Date: Wed, 21 Dec 2011 00:27:51 -0600
To: Antoine Zimmermann <antoine.zimmermann@emse.fr>
Cc: Richard Cyganiak <richard@cyganiak.de>, public-rdf-wg@w3.org
Message-Id: <4A406ED5-FC3B-40ED-A725-C9044B93E860@ihmc.us>

On Dec 20, 2011, at 7:34 AM, Antoine Zimmermann wrote:

> Le 20/12/2011 04:52, Pat Hayes a écrit :
>> 
>> On Dec 19, 2011, at 1:50 PM, Richard Cyganiak wrote:
>> 
>>> On 19 Dec 2011, at 10:48, Pat Hayes wrote:
>>>> I would like to see some evidence, from actual use cases, of how it can be that different RDF graphs hold in different contexts,
>>> 
>>> See here for some (toy) examples:
>>> http://lists.w3.org/Archives/Public/public-rdf-wg/2011Oct/0212.html
>> 
>> For comments on that, see my reply
>> http://lists.w3.org/Archives/Public/public-rdf-wg/2011Oct/0228.html
>> 
>>> A list of use cases provided by WG members is on the wiki:
>>> http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs-UC
>> 
>> After a quick survey, I don't see anything in this list which suggests that a context can or should be thought of as changing the meaning of an IRI.
> 
> The meaning of an IRI is constrained by the triples in the graph in which it occurs.

That can be understood in two ways. One of them is correct, but irrelevant to the discussion here; the other is relevant, but then the claim is profoundly and dangerously wrong.

The first sense is that the meaning of an IRI is determined (perhaps in part) by what assertions are made using it, i.e., in RDF terms, by what RDF graphs it occurs in. Yes, I think that is basically correct, although it might be better to say that it is determined by the totality of all documents in which it occurs. [1] However, in this sense, we must take into account *all* the graphs in which the IRI occurs, or at any rate all those which we trust or accept. (I know this gets into the issue of how to adjudicate such trust, but let me leave that aside for now. It is orthogonal to the present point.) So when we look at two sources, both trusted, both using the IRI in question, then both of them constrain the meaning of the IRI. It is the same IRI in both (or perhaps many) graphs, not a different IRI in each separate graph.

The second sense in which I can understand what you are saying here is exactly this idea: that one and the same IRI might have one meaning in one graph and a different meaning in a different graph. (Perhaps, indeed, that it *must* have a different meaning in a different graph? Note, this is not the same as saying that one graph may be more trusted than the other.) Taken to an extreme, this amounts to the claim that each IRI has a whole spectrum of meanings, determined by the graph in which it appears, and hence that every occurrence of it in a different graph is, in effect, a distinct IRI. It is difficult to emphasise the extent to which this idea is wrong. It would follow, for example, that RDF from two different graphs can never be combined to draw any conclusion that could not be derived from one of the graphs alone, that merging two graphs is never a valid operation, and many other consequences that seem to me to be somewhat insane. But in any case, even if we ignore RDF and its semantics, the whole Web is predicated on the basic idea of IRIs as *global* identifiers, which mean (in whatever sense of 'mean' one cares to adopt) the same thing, wherever they are used. (Of course, reality is often scruffier than the idealized design models described by our specifications, but at least the specifications make this basic assumption.)
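
A minimal sketch of the point about combining graphs, in Python, with triples as plain tuples. The ex: names and the two toy graphs are invented for illustration and do not come from any document discussed in this thread. With global IRIs, the merge answers a question that neither graph answers alone. If each graph's IRIs are instead treated as graph-local (modelled here, as an assumption, by tagging each term with a graph label), the same join finds nothing.

```python
# Triples are (subject, predicate, object) tuples; a graph is a set of them.
# All ex: names are hypothetical.
graph_a = {("ex:alice", "ex:worksFor", "ex:acme")}
graph_b = {("ex:acme", "ex:locatedIn", "ex:paris")}


def works_in(graph):
    """Pairs (person, place): person worksFor some org that is locatedIn place."""
    return {
        (s1, o2)
        for (s1, p1, o1) in graph
        for (s2, p2, o2) in graph
        if p1 == "ex:worksFor" and p2 == "ex:locatedIn" and o1 == s2
    }


def localize(graph, label):
    """Model 'each graph has its own IRIs' by tagging subjects and objects."""
    return {(f"{s}@{label}", p, f"{o}@{label}") for (s, p, o) in graph}


# Global IRIs: merging is set union, and the join goes through.
merged_global = graph_a | graph_b
# Graph-local IRIs: ex:acme@A and ex:acme@B are different terms, so no join.
merged_local = localize(graph_a, "A") | localize(graph_b, "B")

print(works_in(graph_a))
print(works_in(graph_b))
print(works_in(merged_global))
print(works_in(merged_local))
```

Only the merge under global IRIs yields the pair (ex:alice, ex:paris); the other three calls return the empty set.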

> Go online, and look at what you find:
> 
> http://www.emse.fr/~zimmermann/data4pat1.rdf
> 
> This URL leads to a document where the IRI <http://www.ihmc.us/groups/phayes/> denotes the number 1.

No. It leads to a document where the assertion is made that I, Pat Hayes, am identical to the number 1. This assertion is, I am pleased to report, false. Nevertheless, that is what the document says. If, on the other hand, the IRI in question were interpreted as you say, then it would be true, but vacuous, since it would be asserting that 1=1.

> 
> Now, go to:
> 
> http://www.emse.fr/~zimmermann/data4pat1.rdf
> 
> In this document, the same IRI denotes number 2.

Again, no. It still denotes me, as it did in the first graph, but this graph says that I am identical to the number 2. Taken together, these have the entailment (in OWL) that the number 1 equals the number 2. Which I hope we all agree is probably not the case; nevertheless, they do indeed entail that, taken together. Whereas, if that URI meant what you claim, these two graphs would have no inferential connection with one another at all, since the <http://www.ihmc.us/groups/phayes/> in the first one would refer to something different from the <http://www.ihmc.us/groups/phayes/> in the second one.
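
The inference can be sketched in Python. This is an illustration under assumptions: the triples below only encode the assertions as described above (the IRI is owl:sameAs 1 in one graph and owl:sameAs 2 in the other), not the actual contents of the two documents, and the small equivalence-class routine stands in for a real OWL reasoner. The `localize` helper is a hypothetical way of modelling the "different meaning in each graph" reading.

```python
PAT = "http://www.ihmc.us/groups/phayes/"
SAME_AS = "owl:sameAs"

# Assumed encodings of the two assertions; numbers stand in for literals.
graph_1 = {(PAT, SAME_AS, 1)}
graph_2 = {(PAT, SAME_AS, 2)}


def entails_same(graph, a, b):
    """True if a and b fall in one owl:sameAs equivalence class of the graph."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving; unseen terms are their own class.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for s, p, o in graph:
        if p == SAME_AS:
            parent[find(s)] = find(o)
    return find(a) == find(b)


def localize(graph, label):
    """Tag IRIs (strings) with a graph label; literals stay global."""
    def tag(term):
        return (term, label) if isinstance(term, str) else term
    return {(tag(s), p, tag(o)) for (s, p, o) in graph}


merged_global = graph_1 | graph_2
merged_local = localize(graph_1, "g1") | localize(graph_2, "g2")

print(entails_same(graph_1, 1, 2))        # False: one graph alone
print(entails_same(merged_global, 1, 2))  # True: same IRI in both graphs
print(entails_same(merged_local, 1, 2))   # False: graph-local IRIs never connect
```

With a single global IRI the merged graphs entail 1 = 2, which is exactly the inconsistency one would want a reasoner to detect. With graph-local IRIs the two assertions are unrelated and nothing follows.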

> 
> Eventually, a web crawler will index these two documents and without context, it won't do anything useful.

Hopefully, it might detect the inconsistency. I have no idea what help "context" would be. (I'm not even sure what you mean by the word in this, er, context.)

> 
> Then go get:
> 
> http://www.emse.fr/~zimmermann/data4pat.rdf
> 
> Now, this document says that all IRIs denote the same thing.

It says that, indeed, and that is obviously false. It has a whole host of very silly entailments. I haven't checked, but I bet it is formally inconsistent, and that an OWL-Full reasoner would find a contradiction quite rapidly. (An OWL-DL reasoner will spit it out at parse time as illegal.) It is often the case that asserting something obviously false entails a great deal of other nonsense. So?

> As, according to you, this thing is independent of the context, we can stop making reasoners :)

I can't even understand what this is supposed to mean, so I fail to follow your intended point. 

Pat

[1] However, this idea is by no means universally accepted. David Booth, for example, has argued at length that the meaning of any IRI should be determined by a single 'definitional' graph published by the owner of the IRI. Others have said that the meaning is determined by the intentions of the owner of the IRI, whether or not that intention is made manifest in any Web source. And there are many other positions out there.

------------------------------------------------------------
IHMC                                     (850)434 8903 or (650)494 3973   
40 South Alcaniz St.           (850)202 4416   office
Pensacola                            (850)202 4440   fax
FL 32502                              (850)291 0667   mobile
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
Received on Wednesday, 21 December 2011 06:28:45 UTC