Re: RDF Concepts - IRIs do *not* always denote the same resource

On Oct 3, 2013, at 9:50 PM, David Booth wrote:

> Hi Eric,
> 
> I'm copying Pat Hayes, Sandro Hawke and Gregg Reynolds also, because we've been discussing this topic already here:
> http://tinyurl.com/oa2vo9k
> 
> On 10/03/2013 01:38 AM, Eric Prud'hommeaux wrote:
>> * David Booth <david@dbooth.org> [2013-10-02 01:05-0400]
>>> In https://dvcs.w3.org/hg/rdf/raw-file/default/rdf-concepts/index.html
>>> I see this statement:
>>> 
>>>   "IRIs have global scope: Two different appearances of an IRI
>>>   denote the same resource."
>>> 
>>> This is wrong.  If it were true then there could never be a URI Collision
>>> http://www.w3.org/TR/webarch/#URI-collision
>>> and there would be no point in the AWWW discussing it or admonishing
>>> against it.
>> 
>> Seeking clarification, are you saying that Concepts should
>> permit/encourage the use of a single IRI to mean multiple things?
>> The AWWW document that you cited takes the opposite stance:
>> 
>> [[
>> By design, a URI identifies one resource. Using the same URI to
>> directly identify different resources produces a URI collision.
>> Collision often imposes a cost in communication due to the effort
>> required to resolve ambiguities.
>> ]] — http://www.w3.org/TR/webarch/#URI-collision
> 
> No.  Use of the same URI to mean multiple things certainly should be discouraged.  But it is a reality and a trade-off, and we should acknowledge and deal with that reality and trade-off, rather than pretending that it doesn't happen and attempting to base the specifications on an obviously false assumption.
> 
> This is not the same as saying: "Non-conforming XML documents exist, therefore the XML specifications should acknowledge and deal with them".  It is perfectly possible to write conforming XML documents, but it is *not* possible to avoid URI collision, because ambiguity is inescapable and ambiguity leads to URI collision, as the example below of G1 and G2 illustrates.

I think this is mistaken, but as the point is subtle it will take a while to get clear. Let me carefully distinguish some claims. 

1. Ambiguity is inescapable. That is, if you use an IRI then its meaning will never be completely unambiguous to me: I will not know, or be able to discover, *exactly* what interpretation of that IRI you had in mind. 

2. Ambiguity leads to IRI collision. That is, if your use of an IRI is unclear or ambiguous to me, then my use of that same IRI will inevitably produce an IRI collision between our uses. 

3. David's example illustrates both of these points.

I agree with David that 1 is true. I do not agree that 2 or 3 are true. 

First, as I understand the idea of IRI collision, it means that two parties use the same IRI to identify different things, and this difference matters in some sense to what they use it for. In the canonical example, using the same IRI to refer both to a movie and to a discussion forum about that movie means that applying a creation date to that IRI would have detectably distinct meanings. But this is something stronger than saying that the IRI is ambiguous. If ambiguity is a kind of blurring, then the collision case is more like two sharp points which do not coincide, rather than two points inside a blurry boundary. In the movie/forum case, each user is quite unambiguous about their intended use, but those uses fail to match. That is quite different from the case where one user is imprecise about a reference and fails to convey a precise intention to another. Of course, imprecision may mask a real difference of intention, but this is not inevitable. One gets a collision from an ambiguity only when one party makes a presumption which the other party would reject, i.e. when the difference of opinion becomes detectable in what the parties actually assert using (or more generally, in what uses they make of) the IRIs in question. The kind of all-pervasive ambiguity which point 1 above refers to does not, in general, have this character. (Of course, it means that sincerely well-meaning users may discover that their uses are clashing when they independently add presumptions to what other users say, revealing real differences of intent which were not at first evident. And this is a fact of life, indeed, that no amount of Web magic can overcome. Still, this does not mean that such clashes and collisions are inevitable or all-pervasive, only that they are possible.) 

And this is why David's example is not illustrative of either point 1 or point 2. Regarding point 2, the example contains an *explicit* contradiction between the two graphs. No ambiguity is involved here: they explicitly disagree, up-front and in-your-face, in a way that is mechanically detectable. So it does not show that ambiguity leads to collision. If the uses of the common IRI were genuinely ambiguous, then there would be no overt, detectable contradiction between the graphs (just delete the owl:differentFrom assertions in the example), and it is then no longer clear that there is any collision. (Maybe the two lights are wired up differently.) And regarding point 1, the pervasive ambiguity of having multiple IRI interpretations is still present in this example, yet this ambiguity is presumably irrelevant to the hypothetical use of the RDF to switch lights on and off. Which is typical: the fact that a name like "Everest" is infinitely ambiguous is irrelevant to its utility as a referring term in a sentence like "Everest was first climbed in 1953". Maybe your idea of Everest does not quite exactly coincide with mine; still, we can agree that this sentence is true and contains nontrivial information about the real world, and even agree about what can be correctly inferred from it, for example that it is now 60 years since Everest was first climbed. The undetectable differences in our private notions of mountaineering-reference are harmless to our communication and do not constitute a collision or clash of meaning. 
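
To make "mechanically detectable" concrete, here is a minimal sketch of a brute-force check over David's two graphs. It is only an illustration, not an RDF or OWL reasoner: the two-element universe ("ON", "OFF"), the function names, and the decision to look only at the owl:sameAs and owl:differentFrom triples are all assumptions made for this example.

```python
from itertools import product

# Assumed toy universe of resources; "ON" and "OFF" are hypothetical stand-ins.
UNIVERSE = ("ON", "OFF")
IRIS = ("db:x", "ex:on", "ex:off")

# Only the owl:sameAs / owl:differentFrom triples of G1 and G2 are modelled.
G1 = [("db:x", "owl:sameAs", "ex:on"), ("ex:on", "owl:differentFrom", "ex:off")]
G2 = [("db:x", "owl:sameAs", "ex:off"), ("ex:on", "owl:differentFrom", "ex:off")]

def satisfies(interp, graph):
    """True if the IRI-to-resource mapping `interp` makes every triple true."""
    for s, p, o in graph:
        if p == "owl:sameAs" and interp[s] != interp[o]:
            return False
        if p == "owl:differentFrom" and interp[s] == interp[o]:
            return False
    return True

def satisfying_interpretations(graph):
    """Enumerate every mapping of IRIS into UNIVERSE that satisfies `graph`."""
    found = []
    for values in product(UNIVERSE, repeat=len(IRIS)):
        interp = dict(zip(IRIS, values))
        if satisfies(interp, graph):
            found.append(interp)
    return found

def drop_different_from(graph):
    """Return the graph with its owl:differentFrom triples removed."""
    return [t for t in graph if t[1] != "owl:differentFrom"]

merged = G1 + G2                        # union of the two graphs
weakened = drop_different_from(merged)  # same union, minus owl:differentFrom

print(len(satisfying_interpretations(G1)))        # G1 alone is satisfiable
print(len(satisfying_interpretations(G2)))        # G2 alone is satisfiable
print(len(satisfying_interpretations(merged)))    # no interpretation survives
print(len(satisfying_interpretations(weakened)))  # contradiction disappears
```

Within this toy universe, each graph has satisfying interpretations on its own, the union has none, and removing the owl:differentFrom triples leaves the union satisfiable again, so no overt contradiction remains to be detected.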


>>> An IRI can and often does denote different resources in different
>>> *interpretations*.  And this, in practice, means that an IRI often
>>> denotes different resources in different *graphs*, because any graph
>>> has a set of satisfying interpretations, and different graphs may
>>> have different sets of satisfying interpretations.  For example,
>>> suppose graphs g1 and g2 have sets of satisfying interpretations s1
>>> and s2, respectively, and those sets may be disjoint.  Then
>>> colloquially (and technically) we can say that an IRI may map to one
>>> resource in g1 (i.e., in some interpretation in s1) and a different
>>> resource in g2 (i.e., in some interpretation in s2).
>>> 
>>> This requires thinking about graphs in terms of sets of satisfying
>>> interpretations -- an important and valid perspective -- rather than
>>> assuming that one looks at them only through the lens of a single
>>> interpretation.
>>> 
>>> As a simple example of how a URI can denote different things in
>>> different graphs, suppose Alice sends this graph G1 from her smart
>>> phone to her home computer to turn *on* her porch light (assuming
>>> the usual URI prefix definitions):
>>> 
>>> G1: {  @prefix db: <http://dbooth.org/>
>>>        ex:alicePorchLight rdf:value db:x .
>>>        db:x owl:sameAs ex:on .
>>>        ex:on owl:differentFrom ex:off . }
>>> 
>>> and her light turns on.
>>> 
>>> In contrast, Bob sends this graph G2 from his smart phone to his
>>> home computer to turn *off* his oven:
>>> 
>>> G2: {  ex:bobOven rdf:value db:x .
>>>        db:x owl:sameAs ex:off .
>>>        ex:on owl:differentFrom ex:off . }
>>> 
>>> and his oven turns off.
>> 
>> Why is <http://dbooth.org/x> used as a variable for the on/off state
>> of both the porch light and the oven?
> 
> Because Alice and Bob acted independently, without knowledge of each other.  Alice and Bob each interpreted the meaning of the term db:x differently, each one perfectly consistent according to his/her knowledge at the time, and each one completely consistent with the term's definition.
> 
> (I have not shown the definition of db:x, but it does not really matter, because we already know that virtually any satisfiable definition will admit multiple interpretations.  If you want, for simplicity you could assume that the definition is the empty graph, which admits any interpretation.)
> 
>> 
>> 
>>> It is perfectly reasonable and natural to ask "What resource does
>>> db:x denote in G1?", and it is reasonable and natural to ask the
>>> same of G2.  The RDF Semantics (along with OWL) tells us that in G1
>>> db:x denotes whatever ex:on denotes, whereas in G2 db:x denotes
>>> whatever ex:off denotes.   That is useful!  Furthermore, the
>>> semantics tells us that if we merge those graphs then we have a
>>> contradiction -- there are no satisfying interpretations for the
>>> merge -- and that is useful to know also, because it means that
>>> Alice and Bob's graphs **cannot be used together**.
>> 
>> Is this contextual interpretation limited to terms in the subject or
>> object position? Am I licensed, for instance, to presume that the
>> rdf:type predicate is used to assert the type in both
>>   G1: { ex:alicePorchLight rdf:type ex:lightSwitch }
>> and
>>   G1: { ex:bobOven rdf:type ex:ovenSwitch }
>> ?
> 
> I don't know what you mean by "contextual interpretation", because I am not talking about context-sensitivity in the usual sense.

Actually, you are. You are using graphs as contexts, whether you intend this or not. That is the only way to make sense of the phrasing "refer in a graph".

>  I am talking about following the RDF Semantics rules for determining the satisfying interpretations of a graph, i.e., the set of interpretations that satisfy the graph, as defined in the RDF Semantics:

If you do this correctly and carefully, then your example does not illustrate anything about RDF semantics beyond the rather obvious fact that these two graphs are inconsistent with one another. I have spelled this out in other replies to your messages. 

> http://www.w3.org/TR/2013/WD-rdf11-mt-20130723/#dfn-satisfies
> It is not limited to terms in the subject or object position.
> 
>> 
>> 
>>> Furthermore, the RDF Semantics notion of an interpretation maps well
>>> to real life applications: in effect, an application chooses a
>>> particular interpretation when it processes RDF data.  This is a
>>> very useful aspect of the model theoretic style of the semantics.
>>> In this example, Alice's home control app interpreted db:x to denote
>>> "on" and Bob's home control app interpreted it to denote "off".  And
>>> *both* were correct (in isolation): they both did The Right Thing.
>>> 
>>> In short, I think the above statement needs to be qualified somehow,
>>> such as:
>>> 
>>>   "IRIs are *intended* to have global scope: Two different
>>>   appearances of an IRI are *intended* to denote the same resource."
>>>   (However, the RDF Semantics explains how an IRI may denote
>>>   different resources in different interpretations.)
>> 
>> Do we have another class of documents where IRIs really do have global
>> scope?
> 
> This has nothing to do with different classes of documents.
> 
>> In order to make use of the data in docs that have local scope,
>> is there some shared identifier for the scope, or are these docs
>> really islands unto themselves?
> 
> I don't know of any shared identifier for the scope.  But this is not about scope in the usual sense of the term "scope" anyway.

Why not? Aren't you exactly using the containing graph as a scope for the IRI interpretations? 

> 
> This is about the interaction between graphs and interpretations in the RDF Semantics.  (I am talking here about interpretations in the RDF Semantics or model theory sense of the term -- not the generic English sense.)
> 
> Any graph has a set of satisfying interpretations according to the standard semantic rules defined in the RDF, OWL, etc. specifications. Different interpretations may map the same IRI to different resources. This is just RDF Semantics 101 (or model theory 101): there is nothing new or tricky about this.

Note however that no interpretation maps different occurrences of the same IRI (e.g. in different graphs) to different resources. And two different interpretations may map a *single* occurrence of an IRI in a *single* graph to *different* resources. 
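
A small sketch may help keep these two statements apart. It treats an interpretation as nothing more than a mapping from IRIs to resources, which is a deliberate simplification of the model theory; the names `interp_a`, `interp_b` and `denotations`, and the resources "ON" and "OFF", are hypothetical and chosen only for this illustration.

```python
# Two of David's triples per graph, enough to give db:x an occurrence in each.
G1 = [("ex:alicePorchLight", "rdf:value", "db:x"), ("db:x", "owl:sameAs", "ex:on")]
G2 = [("ex:bobOven", "rdf:value", "db:x"), ("db:x", "owl:sameAs", "ex:off")]

# Two different interpretations, each simplified to a plain IRI -> resource map.
interp_a = {"db:x": "ON"}
interp_b = {"db:x": "OFF"}

def denotations(interp, iri, graphs):
    """Collect what `interp` assigns to each occurrence of `iri` in `graphs`."""
    found = set()
    for graph in graphs:
        for triple in graph:
            for term in triple:
                if term == iri:
                    # The lookup depends only on the IRI, never on which
                    # graph or triple position the occurrence sits in.
                    found.add(interp[term])
    return found

# One interpretation, many occurrences: always exactly one resource.
print(denotations(interp_a, "db:x", [G1, G2]))
print(denotations(interp_b, "db:x", [G1, G2]))

# Two interpretations, one occurrence: the resources may differ.
print(interp_a["db:x"] != interp_b["db:x"])
```

Because the mapping takes only the IRI as its argument, there is simply no place for "which graph" to enter; any variation in what db:x denotes is variation between interpretations, not between occurrences.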

> When people write RDF graphs, they make assumptions about what each IRI denotes.  In essence, each author has a set of interpretations in mind when he/she uses an IRI in a graph.  Each author's graph may be consistent with that author's set of assumed interpretations (i.e., the graph may be true under those interpretations) and with the definitions of all terms used in the graph, but different authors may have different interpretations in mind, even when using the same IRI.

Yes. That is the all-pervasive ambiguity referred to in point 1 above. 

>  Those graphs may work fine as long as they are kept separate.

Separateness has nothing to do with the semantics. There are theorems to this effect, about unions and merges, in the Semantics document, in case you want to argue the point. A set of graphs is semantically indistinguishable from the single graph comprising their union. 

>  But if one combines two or more of those graphs, a contradiction may be discovered, because the author of one graph took db:x to denote ex:on (whatever that is) and another author took it to denote ex:off .

All this is quite correct. As you say, contradictions may be discovered when separate pieces of RDF are put together. (Note: *discovered*. They were contradictions, in fact, even before this was discovered.) One might indeed reasonably assert that being able to detect such contradictions is the main purpose of having a semantics in the first place. 

But you seem to think that all this somehow comprises a different "perspective" on model theory, that it shows how the current RDF semantics can support the claim that different occurrences of the same IRI may denote different things simultaneously, that it makes sense of the idea that an IRI can denote something "in" an RDF graph, and that these insights require that the current specifications must be re-worded. None of this follows from what I have agreed, in the paragraph above, is quite correct. 

Pat

> 
> I don't know if I have answered the questions that you meant.  If not, please clarify and I'll try again.
> 
> Thanks,
> David
> 

------------------------------------------------------------
IHMC                                     (850)434 8903 home
40 South Alcaniz St.            (850)202 4416   office
Pensacola                            (850)202 4440   fax
FL 32502                              (850)291 0667   mobile (preferred)
phayes@ihmc.us       http://www.ihmc.us/users/phayes

Received on Friday, 4 October 2013 06:14:49 UTC