Re: owl:sameAs - Is it used in a right way? from Pat Hayes on 2013-03-18 (public-semweb-lifesci@w3.org from March 2013)

From: Pat Hayes <phayes@ihmc.us>
Date: Sun, 17 Mar 2013 23:10:05 -0500
To: Jim McCusker <mccusj@rpi.edu>
Cc: David Booth <david@dbooth.org>, Jeremy J Carroll <jjc@syapse.com>, Umutcan SIMSEK <s.umutcan@gmail.com>, Kingsley Idehen <kidehen@openlinksw.com>, w3c semweb HCLS <public-semweb-lifesci@w3.org>
Message-Id: <0513B5CC-E294-42E9-8407-556135105A88@ihmc.us>
On Mar 16, 2013, at 11:34 PM, Jim McCusker wrote:

> Hmm. In the end, all three of them are talking about the same apple. Either a) the apple changed (they do that), or b) someone got it wrong (Is a McIntosh a red apple or green apple? It's kind of both). 
> 
> This of course goes to my general assertion that most of the time, disjointness assertions are more likely to be wrong than right, but this isn't about that. There is an apple, and all three people agree they are talking about the same apple. It may have changed, or someone was color blind, or looking at a colorized black and white photo when they decided what color it was.

Right. There can be conflicting information about a single thing. 

> This is, more than anything, why, unless you know that the referent is the same AND the contextual scope is the same

Um... what is a "contextual scope", exactly? Or, indeed, inexactly? 

> , it's better to mint your own URI and link out using altOf and specOf, rather than making assertions using someone else's resource.
> 
> Jim
> 
> 
> On Sun, Mar 17, 2013 at 12:20 AM, David Booth <david@dbooth.org> wrote:
> Hi Jim,
> 
> 
> On 03/16/2013 12:37 PM, Jim McCusker wrote:
> I'm not terribly interested in a Humpty Dumpty interpretation of the web
> of data.
> 
> Well, you'd better get used to it, because that interpretation is standard RDF Semantics.  I don't think it's going away any time soon.
> 
> 
> That's part of the motivation for having global identifiers
> like URIs/URLs.
> 
> Exactly!  That's why the idea that "a URI identifies one resource" is "a good goal, and helpful as a guide to URI users", even though it is not actually true.
> 
> 
> There's no point in merging ANY graphs under this view,
> since you have no way of knowing if the referents are the same.

Yes, quite.

> Not true!  Don't throw the baby out with the bath.  When you merge graphs, you force the referents to be the same.  Sometimes the merge works fine, and sometimes the merge becomes inconsistent.

The merge always 'works'. Any set of RDF graphs entails its merge.  When the merge is inconsistent, it reveals that the original data was inconsistent. 

>  Just because you cannot *always* merge two graphs without causing inconsistency does not mean that merging is pointless.  It just means that *some* graphs can be merged and others cannot.

No, you can *always* merge graphs. For what it reveals, see above. 

>   That is only a problem if your expectations of being able to merge any two graphs are set unrealistically high.

There is a proof in the 2004 RDF Semantics document that any set of RDF graphs entails its merge. How much higher can you get?
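
[Editorial sketch, not part of the original message.] The reason a merge is always defined can be illustrated in a few lines. In the 2004 RDF Semantics, the merge of a set of graphs is their union after blank nodes have been renamed so that no two graphs share one. The tuple representation, the "_:" prefix test, and the names standardize and merge below are illustrative assumptions, not any library's API.

```python
# Triples are plain (subject, predicate, object) string tuples.
# A term starting with "_:" is treated as a blank node (an assumption of this sketch).

def standardize(term, tag):
    """Rename a blank node so it is unique to the graph identified by tag."""
    if term.startswith("_:"):
        return f"_:{tag}_{term[2:]}"
    return term

def merge(graphs):
    """Union of the graphs after standardizing blank nodes apart."""
    merged = set()
    for index, graph in enumerate(graphs):
        tag = f"g{index}"
        for s, p, o in graph:
            merged.add((standardize(s, tag), p, standardize(o, tag)))
    return merged

g1 = {("_:x", ":color", ":red")}
g2 = {("_:x", ":color", ":green")}

# A naive union would treat the two "_:x" nodes as one thing; the merge keeps them apart.
merged = merge([g1, g2])
```

Nothing in this construction can fail, whatever the graphs say. Whether the result is consistent is a separate question.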

> I'm not
> saying that people don't denote different things with the same URI, I'm
> saying that, by using a URI that someone else controls, you are
> accepting their denotation of it.

Exactly. 

> You're preaching to the choir on that one!  I certainly agree with that architecture, but that is only part of the story.  The problem is that there is inherent ambiguity about the resource that a URI denotes.  This is inescapable.  And it means that two different, well-intentioned RDF authors can reasonably interpret a URI's resource identity differently

That also is true, but...

> , and those differences can cause conflicts to show up when their graphs are merged.

...if they differ that much, then this goes beyond mere (and unavoidable) ambiguity: it means they genuinely *disagree*, openly enough for this disagreement to be revealed by RDF machinery. 

> As a simple example, suppose Owen, a URI owner, mints a URI :apple to denote an apple.  As the URI's owner, he defines the URI's resource identity using the following RDF statements:
> 
>   # Owen's definition of :apple
>   @prefix : <http://example/owen/> .
>   :apple a :Apple .
> 
> Arthur, a URI author, then publishes his own RDF statements about Owen's apple (standard prefix definitions omitted for brevity):
> 
>   # Arthur's statements about Owen's apple
>   @prefix : <http://example/owen/> .
>   :apple a :GreenApple .
>   :GreenApple rdfs:subClassOf :Apple .
> 
> Note that Arthur's statements are entirely consistent with Owen's definition of :apple .
> 
> Now Aster, another URI author, also publishes some RDF statements about Owen's apple.  She also uses Owen's apple definition, but has no knowledge of Arthur's statements.  Aster writes:
> 
>   # Aster's statements about Owen's apple
>   @prefix : <http://example/owen/> .
>   :apple a :RedApple .
>   :RedApple rdfs:subClassOf :Apple .
>   :RedApple owl:disjointWith :GreenApple .
> 
> Note that Aster's statements are also consistent with Owen's definition of :apple.
> 
> Finally, Connie, an RDF consumer, discovers Arthur and Aster's graphs and wishes to merge them.  Unfortunately, the merge is inconsistent,

Why unfortunately? Arthur and Aster apparently disagree with each other, and the inconsistency simply reveals that disagreement. That is a useful datum if you are trying to figure out who you might want to believe.
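
[Editorial sketch, not part of the original message.] The apple example can be run as a toy check. Arthur's and Aster's graphs are each free of clashes on their own, and the merge exposes the disagreement. This is a minimal sketch, not an OWL reasoner: it only looks for an individual directly typed with two classes that are declared disjoint, and the helper names are hypothetical.

```python
TYPE = "rdf:type"
DISJOINT = "owl:disjointWith"

# Prefixed names stand in for the full http://example/owen/ URIs.
arthur = {
    (":apple", TYPE, ":GreenApple"),
    (":GreenApple", "rdfs:subClassOf", ":Apple"),
}
aster = {
    (":apple", TYPE, ":RedApple"),
    (":RedApple", "rdfs:subClassOf", ":Apple"),
    (":RedApple", DISJOINT, ":GreenApple"),
}

def merge(*graphs):
    """Merge ground graphs (no blank nodes) by taking the union of their triples."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged

def disjointness_clashes(graph):
    """Return (individual, class_a, class_b) for individuals typed with two disjoint classes."""
    types = {}
    for s, p, o in graph:
        if p == TYPE:
            types.setdefault(s, set()).add(o)
    clashes = set()
    for class_a, p, class_b in graph:
        if p == DISJOINT:
            for individual, classes in types.items():
                if class_a in classes and class_b in classes:
                    clashes.add((individual, class_a, class_b))
    return clashes

print(disjointness_clashes(arthur))                # set()
print(disjointness_clashes(aster))                 # set()
print(disjointness_clashes(merge(arthur, aster)))  # {(':apple', ':RedApple', ':GreenApple')}
```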

> It is tempting to assume that someone did something "wrong" here.  For example, one might claim that Owen's definition was ambiguous, or that Arthur and Aster should not have made assumptions about the color of Owen's apple if Owen did not state the color in his definition.  Indeed, in this simple example it is easy to see where the conflicting assumptions crept in.  In real life, when you're dealing with thousands or millions of RDF statements, it is usually far more subtle.

True, but that does not change the essentials. 
> 
> One might also assume that color is an intrinsic property of the apple, and hence is somehow different from other properties that one might assert.  Imagine instead that Arthur had stated ":apple a :GoodFruit" and Aster had stated ":apple a :BadFruit" (assuming :GoodFruit owl:disjointWith :BadFruit).  The result would have been the same when Connie attempted to merge their graphs.  Since, AFAIK, there is no objective way to distinguish between intrinsic properties and non-intrinsic properties, the color example should suffice.

You might assert that color is an owl:FunctionalProperty. That would do the trick.
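
[Editorial sketch, not part of the original message.] A toy version of the functional-property idea follows. One caveat: in OWL, two values for a functional property do not by themselves give a contradiction. They entail that the two values are the same thing, and a clash appears only if the values are also known to be different. The :color property, the :red and :green values, and the owl:differentFrom pair below are assumptions added for illustration.

```python
# Assumed for this sketch: :color has been declared an owl:FunctionalProperty,
# and :red owl:differentFrom :green has been asserted somewhere.
functional_properties = {":color"}
known_different = {frozenset({":red", ":green"})}

graph = {
    (":apple", ":color", ":red"),    # one author's claim
    (":apple", ":color", ":green"),  # another author's claim
}

def functional_clashes(graph, functional, different):
    """Find subjects with two values of a functional property that are known to differ."""
    values = {}
    for s, p, o in graph:
        if p in functional:
            values.setdefault((s, p), set()).add(o)
    clashes = set()
    for (s, p), objects in values.items():
        for a in objects:
            for b in objects:
                # a < b reports each unordered pair once.
                if a < b and frozenset({a, b}) in different:
                    clashes.add((s, p, a, b))
    return clashes

print(functional_clashes(graph, functional_properties, known_different))
# {(':apple', ':color', ':green', ':red')}
```

Without the known_different entry the function returns an empty set, which mirrors the OWL reading: the two colors would simply be inferred to be the same.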

> The real problem is that *any* time you make an RDF statement about a resource, and that statement goes beyond what was said in the resource definition, you further constrain the identity of that resource, whether you mean to or not.  I.e., you further constrain the set of satisfying interpretations.

Yup, exactly. 

> I submit that neither Owen nor Arthur nor Aster did anything fundamentally wrong.  Owen was not wrong, because it is fundamentally impossible for Owen to be completely unambiguous about :apple's resource identity.  And Arthur and Aster did nothing fundamentally wrong, because: (a) they simply made statements about :apple

Did they know that these statements were true? They can't both have known this.

> ; and (b) AFAICT there is no fundamental difference between statements that constrain a resource's identity and any other statements about that resource.  In RDF semantics, they all simply add constraints to the possible interpretations.
> 
> The problem is just that Arthur and Aster happened to (unknowingly) make conflicting statements about :apple .  There's no need to cry foul here.

It's not a logical-RDF-AWWW foul, but it is bad form to make assertions that you are claiming to be true (and other readers might rely on) when you don't have a clue if they are true or not. At least one of Arthur or Aster must be being this careless. 

>  We just have to learn to live with this possibility.  And one good technique is what Jeremy suggested: keep different perspectives in different graphs, and only join them if you need to.

True, provided you can make sense of what a perspective is.
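
[Editorial sketch, not part of the original message.] One modest reading of "keep different perspectives in different graphs" is to store each author's triples under the author's name and join only on request. With that layout, any triple in a joined view can be traced back to whoever asserted it. The dictionary layout and the names dataset, join, and sources_of are hypothetical, and the sketch assumes ground triples so that union serves as the merge.

```python
# Each source keeps its own graph of (subject, predicate, object) tuples.
dataset = {
    "arthur": {
        (":apple", "rdf:type", ":GreenApple"),
    },
    "aster": {
        (":apple", "rdf:type", ":RedApple"),
        (":RedApple", "owl:disjointWith", ":GreenApple"),
    },
}

def join(dataset, names):
    """Union of only the selected graphs; the stored graphs are left untouched."""
    joined = set()
    for name in names:
        joined |= dataset[name]
    return joined

def sources_of(dataset, triple):
    """Names of every source that asserts the given triple, in sorted order."""
    return sorted(name for name, graph in dataset.items() if triple in graph)

both = join(dataset, ["arthur", "aster"])
print(len(both))                                                  # 3
print(sources_of(dataset, (":apple", "rdf:type", ":RedApple")))   # ['aster']
```

This does not say what a perspective is. It only records who said what, so that a consumer can decide whom to believe once a join turns out to be inconsistent.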

Pat

> 
> David
> 
> 
> 
> -- 
> Jim McCusker
> Programmer Analyst
> Krauthammer Lab, Pathology Informatics
> Yale School of Medicine
> james.mccusker@yale.edu | (203) 785-4436
> http://krauthammerlab.med.yale.edu
> 
> PhD Student
> Tetherless World Constellation
> Rensselaer Polytechnic Institute
> mccusj@cs.rpi.edu
> http://tw.rpi.edu

------------------------------------------------------------
IHMC                                     (850)434 8903 or (650)494 3973   
40 South Alcaniz St.           (850)202 4416   office
Pensacola                            (850)202 4440   fax
FL 32502                              (850)291 0667   mobile
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
Received on Monday, 18 March 2013 04:10:36 UTC