Re: owl:sameAs - Is it used in a right way? from Pat Hayes on 2013-03-26 (public-semweb-lifesci@w3.org from March 2013)

From: Pat Hayes <phayes@ihmc.us>
Date: Tue, 26 Mar 2013 04:52:07 -0500
To: David Booth <david@dbooth.org>
Cc: Antoine Zimmermann <antoine.zimmermann@emse.fr>, Alan Ruttenberg <alanruttenberg@gmail.com>, Umutcan ŞİMŞEK <s.umutcan@gmail.com>, Jeremy Carroll <jjc@syapse.com>, Kingsley Idehen <kidehen@openlinksw.com>, "public-semweb-lifesci@w3.org HCLS" <public-semweb-lifesci@w3.org>
Message-Id: <BD300EF0-6A09-440E-AE19-D9BB2098B4C3@ihmc.us>

Hi David

Sorry if I got a little personal back there, I was getting frustrated. 

So, thinking over our emails and trying to understand what you were saying, I think I have it figured out. And your proposal is in fact (still not legal according to the RDF specs, but) not entirely daft. But you are expressing it wrong, which was why it seemed to be entirely daft. So I am going to say it right for you in this email. Don't mention it. 

It's not the interpretations that you are suggesting to treat as contexts, it is the *graphs*. Your idea amounts to treating graphs as local contexts for the URI tokens that occur in them. And this is not a wholly insane idea, and indeed it has been suggested by several other people, including people on the current RDF WG. Some users use datasets in this way, treating the URI tokens in the various named graphs as functionally independent of one another until they have evidence that they are being used with the same meaning. (The fact that the named graphs in a dataset share blank node IDs makes the idea that they don't share URIs even more outlandish, but whatever.) Antoine Zimmermann suggested a semantics for datasets based on this idea, and it is very similar to what you have been saying: an interpretation of a dataset of named-graphs-as-contexts is simply a set of interpretations, one for each name, and each applied to its local named graph to determine the truth-values of the triples in that graph. (See http://www.slideshare.net/PatHayes/rdf-with-contexts, slide 8.)
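
A minimal sketch of the "one interpretation per graph name" idea described above, under stated assumptions: graphs are ground (no blank nodes), interpretations are toy "simple" interpretations, and every name used here (Interpretation, context_truth_values, the ex: URIs, the Gt/Gp data) is hypothetical and chosen only for illustration. It is not taken from any RDF specification or library.

```python
from typing import Dict, FrozenSet, Set, Tuple

Triple = Tuple[str, str, str]   # (subject URI, property URI, object URI)
Graph = FrozenSet[Triple]


class Interpretation:
    """Toy simple interpretation: a mapping on names, not on graphs."""

    def __init__(self, denotation: Dict[str, str],
                 extension: Dict[str, Set[Tuple[str, str]]]) -> None:
        self.denotation = denotation   # URI -> resource
        self.extension = extension     # property resource -> pairs in its extension

    def satisfies(self, graph: Graph) -> bool:
        """A ground graph is true iff every triple in it is true."""
        for s, p, o in graph:
            # Simplification: a URI outside the vocabulary makes the triple false.
            if any(uri not in self.denotation for uri in (s, p, o)):
                return False
            pair = (self.denotation[s], self.denotation[o])
            if pair not in self.extension.get(self.denotation[p], set()):
                return False
        return True


def context_truth_values(dataset: Dict[str, Graph],
                         context_interpretation: Dict[str, Interpretation]
                         ) -> Dict[str, bool]:
    """Apply each name's own interpretation to that name's graph only."""
    return {name: context_interpretation[name].satisfies(graph)
            for name, graph in dataset.items()}


# Hypothetical data, loosely following the toucan example quoted further down.
dataset = {
    "Gt": frozenset({("ex:toucan", "ex:eats", "ex:fruit")}),
    "Gp": frozenset({("ex:toucan", "ex:author", "ex:ian")}),
}
I_t = Interpretation(
    {"ex:toucan": "the bird", "ex:eats": "eats", "ex:fruit": "fruit",
     "ex:author": "author", "ex:ian": "Ian"},
    {"eats": {("the bird", "fruit")}, "author": {("the page", "Ian")}},
)
# Same as I_t except that the shared URI denotes the web page.
I_p = Interpretation({**I_t.denotation, "ex:toucan": "the page"}, I_t.extension)

# Graph-local reading: one interpretation per graph name.
graph_local = context_truth_values(dataset, {"Gt": I_t, "Gp": I_p})
# 2004-style reading: one interpretation settles the truth of every graph.
single = {name: I_t.satisfies(graph) for name, graph in dataset.items()}
```

In this sketch the graph-local reading makes both graphs true, while the single interpretation I_t makes Gt true and Gp false, because under I_t the shared URI denotes the bird in both graphs.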

But this difference between ways of talking (graphs vs. interpretations as the contexts) is critically important. Contexts, if you believe them to exist, are meaningful things: they change how names are interpreted. In order to be treated coherently, there has to be some *syntactic* trigger or mark which conveys this meaning. In your (and Antoine's) case, this is the graph boundary: each separate (named, for Antoine) graph 'marks' a context.  Then, a semantic account of the meaning has to explain *how to interpret* this syntax, and your idea of one-interpretation-per-graph appears here as the interpretation rules for the graphs considered as contexts, and so a single context-interpretation is a set of RDF interpretations. You might say that this is unimportant, mere verbal variations, but it is the difference between doing semantics right and doing it wrong. If you do it right, you get the metatheory for free: the core ideas (interpretations, satisfaction, entailment, truth) all work out properly, the mathematics works properly, it all just *works*. Whereas if you do it wrong, it doesn't. (For example, the very basic definitions of interpretation and entailment break.)

Also, if you do it right, it clarifies several issues. First, that this model theory is genuinely different from the RDF 2004 model theory. Second, a possibly rather subtle point, that one way it is different is that it requires interpretations to apply not to URIs as such but rather to particular tokens of URIs in graphs. It's a bit like saying we have to talk about occurrences of letters in sentences, rather than letters of the alphabet. Third, it forces one to be very clear (which 2004 RDF regrettably isn't) about just exactly what counts as an RDF graph. For example, do you want G1 and G2 to be independently interpreted when G1 is a subgraph of G2? (Probably not, but how do you exclude that case? What about where G1 and G2 have a common subgraph?) And fourth, it clarifies the relationship of this semantics of RDF to the 2004 normative semantics, instead of getting them confusingly confused with one another. 
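
The second point above, that interpretations would apply to tokens of URIs rather than to URIs, can be illustrated with a small sketch. It assumes a token can be modelled as a (graph name, URI) pair; the names Token, lift and is_graph_independent are hypothetical and used only for illustration.

```python
from typing import Dict, Iterable, Tuple

Token = Tuple[str, str]   # (graph name, URI): one URI as it occurs in one graph


def lift(global_interpretation: Dict[str, str],
         graph_names: Iterable[str]) -> Dict[Token, str]:
    """View a 2004-style mapping on URIs as a mapping on tokens."""
    names = list(graph_names)
    return {(g, uri): referent
            for g in names
            for uri, referent in global_interpretation.items()}


def is_graph_independent(token_interpretation: Dict[Token, str]) -> bool:
    """True iff every URI gets the same referent in every graph it is mapped in."""
    seen: Dict[str, str] = {}
    for (_, uri), referent in token_interpretation.items():
        # setdefault records the first referent seen for this URI.
        if seen.setdefault(uri, referent) != referent:
            return False
    return True


# A mapping on URIs, lifted to tokens, never distinguishes graphs.
lifted = lift({"ex:toucan": "the bird"}, ["G1", "G2"])

# A genuinely graph-local mapping can send the same URI to different referents.
local = {("G1", "ex:toucan"): "the bird", ("G2", "ex:toucan"): "the page"}
```

Under these assumptions, a mapping on URIs is the special case of a token mapping that ignores the graph name, which is one way to see that the graph-local semantics is a different model theory rather than a different use of the 2004 one.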

So, your proposal, said correctly, is: let's treat graphs as RDF contexts which locally define the meanings of the URI tokens in them. What I called the graph-local vision in that slideshow. And that idea is (still not in conformity to the RDF semantics, but) not completely barmy, and in fact might be a way to handle very dirty RDF data which is dangerous to merge without taking great care about possibly conflicting assumptions about URI references.  
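
As a rough illustration of why merging can be dangerous, the sketch below flags URIs that two ground graphs type into classes declared disjoint. Everything in it is an assumption made for illustration: disjointness is not part of the RDF semantics itself, the set of disjoint class pairs is supplied by hand, and the names typed_as and risky_uris and the ex: URIs are hypothetical.

```python
from typing import Dict, FrozenSet, Set, Tuple

Triple = Tuple[str, str, str]   # (subject URI, property URI, object URI)
Graph = FrozenSet[Triple]

TYPE = "rdf:type"


def typed_as(graph: Graph) -> Dict[str, Set[str]]:
    """Collect, for each subject URI, the classes the graph assigns to it."""
    classes: Dict[str, Set[str]] = {}
    for s, p, o in graph:
        if p == TYPE:
            classes.setdefault(s, set()).add(o)
    return classes


def risky_uris(g1: Graph, g2: Graph,
               disjoint: Set[FrozenSet[str]]) -> Set[str]:
    """URIs that g1 and g2 type into a pair of classes listed as disjoint."""
    c1, c2 = typed_as(g1), typed_as(g2)
    risky: Set[str] = set()
    for uri in c1.keys() & c2.keys():
        for a in c1[uri]:
            for b in c2[uri]:
                if frozenset({a, b}) in disjoint:
                    risky.add(uri)
    return risky


# Hypothetical graphs written under different assumptions about one URI.
Gt = frozenset({("ex:toucan", TYPE, "ex:Bird")})
Gp = frozenset({("ex:toucan", TYPE, "ex:WebPage")})

no_constraint = risky_uris(Gt, Gp, set())
with_constraint = risky_uris(Gt, Gp, {frozenset({"ex:Bird", "ex:WebPage"})})
```

With no disjointness declared, nothing is flagged and the merge of the two graphs is unproblematic in this sketch. Once the two classes are declared disjoint, ex:toucan is flagged, which is the kind of conflicting assumption that keeping the graphs apart as separate contexts is meant to guard against.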

Hope this helps.

Pat


On Mar 24, 2013, at 10:41 PM, David Booth wrote:

> Okay, finally zeroing in on the crux of the matter . . .
> 
> On 03/23/2013 12:49 AM, Pat Hayes wrote:
>> 
>> On Mar 22, 2013, at 10:30 PM, David Booth wrote:
>> 
>>> On 03/21/2013 01:02 PM, Pat Hayes wrote:
>>>> On Mar 20, 2013, at 9:58 PM, David Booth wrote:
>>>>> On 03/20/2013 12:04 AM, Pat Hayes wrote:
>>>>>> On Mar 18, 2013, at 4:04 PM, David Booth wrote:
>>>>>>> On 03/17/2013 10:02 PM, Pat Hayes wrote:
>>>>>>>> On Mar 16, 2013, at 11:26 PM, David Booth wrote:
>>>>>>> [ . . . ] But presumably that passage from Section 1.2
>>>>>>> means ". . . [the semantics simply assumes that ... a
>>>>>>> single URI reference can be taken to have the same meaning
>>>>>>> whenever it occurs] _in the *given* graph, i.e., the graph
>>>>>>> whose semantics are being determined_",
>>>>>> 
>>>>>> No, it means wherever they occur, period. If they occur in
>>>>>> several graphs, they all refer in the same way in all of
>>>>>> them.
>>>>> 
>>>>> Absolutely not.  That is only true for *one* interpretation.
>>>> 
>>>> Absolutely yes. It is true for all interpretations. There is no
>>>> interpretation which allows a single URI to refer in different
>>>> ways when it occurs in different graphs.
>>> 
>>> Uh-oh, there's that single-interpretation assumption creeping in
>>> again!
>> 
>> No assumption is creeping anywhere in that statement. I said, for ALL
>> interpretations.
> 
> The "single-interpretation assumption" that you keep making is to assume that one can only talk about **one interpretation at a time** when talking about whether a URI can "refer in different ways when it occurs in different graphs".  But as I pointed out below -- and you agreed -- in different interpretations the same URI *can* refer to different resources.
> 
>> 
>>> While it is true that there exists no *single* interpretation that
>>> allows a single URI to refer in different ways when it occurs in
>>> different graphs
>> 
>> There exists no interpretation which ... No need to say or emphasise
>> "single".
> 
> The emphasis was added to point out that you were making the single-interpretation assumption.  While it is true that "There exists no [single] interpretation which ...", it is NOT true that "There exist no *two* interpretations which ...".  And the example that I gave (directly below) demonstrates this.
> 
>> 
>>> , surely you would agree that under standard RDF Semantics:
>>> 
>>> There exist interpretations I1 and I2, RDF graphs G1 and G2, URI U
>>> and resources R1 and R2, such that I1 maps U to R1, I2 maps U to
>>> R2, and R1 != R2.
>> 
>> Of course.
>> 
>>> Therefore, under standard RDF Semantics:
>>> 
>>> 1. A URI can map to *different* resources in *different* graphs.
>>> (Proof sketch: Use I1 for one graph and I2 for the other.)
>> 
>> No, and your proof is faulty. You don't get to use an interpretation
>> for one graph and a different interpretation for the other.
> 
> Sure I do.  I just did, above, and you agreed with it!  Notice that the sentence "A URI can map to *different* resources in *different* graphs" does not mention or constrain the number of interpretations that may be used.  You (wrongly) assumed that I meant "under any *single* interpretation".  But I am specifically raising the point that it is useful to consider multiple interpretations.
> 
>> Each
>> interpretation is a mapping *on names*, not on graphs. Given such a
>> mapping on names, the truth or falsity of all graphs is determined.
> 
> Yes, for all graphs relative to *that* interpretation.  But it does *not* determine the truth-value of a graph relative to some other interpretation.  I'm being picky about this because you keep slipping into the single-interpretation assumption, and I'm trying to point out that it is useful to be able to talk about more than one interpretation at a time, and to apply *different* interpretations to different graphs.
> 
>> 
>> What does it mean to "use" I1 on G1 and "use" I2 on G2?
> 
> It means to apply I1 to G1 and apply I2 to G2, i.e., to determine the truth-value of I1(G1) and I2(G2), per RDF Semantics.
> 
>> Each of I1
>> and I2 apply to all the names in both G1 and G2. The "combination"
>> you have sketched is not an interpretation mapping, so (a) what is
>> it? and (b) what relevance does it have to what we are talking about,
>> which is interpreting RDF graphs?
> 
> What "combination" do you mean?  I did not talk about combining either I1 and I2 or G1 and G2.  I talked about considering the semantics of G1 and G2 *separately*.
> 
>> 
>> Here's the corrected version:
>> 
>> 1. Lemma. A URI maps to the same thing, no matter what graph it
>> occurs in. (Proof: Let I be an interpretation mapping and U a URI.
>> Then I maps U to I(U). The previous sentence did not mention graphs,
>> so it applies regardless of what graphs contain U. But I was
>> arbitrary, so this is true of any interpretation. QED.)
> 
> Of course that is true of any *one* interpretation -- i.e., when you make the single-interpretation assumption.  But that isn't the point. The point is that the RDF Semantics specifically permits multiple interpretations.  The spec defines a standard way to determine the truth-value of *any* <interpretation, graph> pair.  It is explicitly agnostic about the interpretations (and the graphs) that may be used.
> 
> You are of course free to limit *your* use of the RDF Semantics to the single-interpretation assumption.  That is your prerogative.  But it is *not* a requirement of the RDF Semantics specification.  Given n interpretations and n graphs, it is perfectly valid to use the RDF Semantics to determine the truth-values of each of those n graphs relative to those n interpretations, without in any way violating the spec.
> 
>> 
>>> 2. A URI can map to *different* resources. (Proof sketch: Use
>>> different interpretations, I1 and I2.)
>> 
>> In different interpretations, yes, of course.
>> 
>>> Some may claim that this
>> 
>> What is "this" ?
> 
> This way of using the RDF Semantics spec.
> 
>> 
>>> is a misuse of the RDF Semantics -- that there really is only one
>>> "correct" interpretation, even though we may not know which one it
>>> is.  (This is the single-interpretation assumption.)  But the RDF
>>> Semantics makes no such requirement, and to my mind that is part of
>>> its genius, because the allowance of multiple interpretations is
>>> valuable!  It lets us better account for the real world use of RDF
>>> -- **under standard RDF Semantics** -- because in the real world,
>>> RDF authors *do* make different assumptions in their
>>> interpretations of URIs, and different RDF consumers *do* apply
>>> different interpretations to the URIs they encounter.
>> 
>>> I have separately tried to point out how the existing RDF Semantics
>>> spec already supports a poor person's notion of context *because*
>>> of its allowance of multiple interpretations:
>>> http://lists.w3.org/Archives/Public/public-semweb-lifesci/2013Mar/0099.html
>> 
>>> 
>> But your argument there is faulty, because the presence of multiple
>> interpretations does not, by itself, provide any notion of context,
>> poor or otherwise.
> 
> It does provide a very simple notion of context, as explained below -- not a fully featured notion of context that one would normally expect when one talks about having a notion of context.  That's why I called it a "poor person's" notion of context.
> 
>> 
>>> The purpose of a context is to enable the same RDF graph to have
>>> different truth-values in different contexts.
>> 
>> Actually, more basically, it is to enable a single URI to refer to
>> different things in different contexts. Which then has the graph
>> consequence that you mention.
> 
> Fine, use that definition instead.  The consequence is the same.
> 
>> 
>>> But this is exactly what an interpretation does!
>> 
>> No, that is not what an interpretation does.
> 
> Surely you would agree that interpretations enable a single URI to refer to different things in different interpretations.  Therefore, if a context *is* an interpretation, then a context would enable a single URI to refer to different things in different interpretations.  And that is exactly how you described the purpose of a context.
> 
>> An RDF interpretation
>> maps each URI into a single referent (and each graph to a single
>> truth-value). It does not provide for one URI to have different
>> meanings in different contexts, because it does not provide any
>> contexts to put URIs into. To do that, it would have to be a mapping
>> from (URIs x contexts) to referents.
> 
> Correct.  A *single* interpretation does not provide for one URI to have different meanings in different contexts.  But if a context *is* an interpretation, and multiple interpretations are considered, then those interpretations *do* provide a way to map one URI to different meanings in different contexts/interpretations.
> 
>> 
>>> So if you think of an RDF interpretation as a context -- which IMO
>>> is quite a natural way to think of it
>> 
>> I am afraid that if this seems natural to you, then you really have
>> not understood the basic idea of model-theoretic semantics. (Ask
>> yourself: if interpretations are contexts, how does one provide a
>> semantics for an actual context logic, such as ICL or CycL, which has
>> names for contexts in the syntax of the language?
> 
> As I said below, the RDF Semantics would have to be extended to do things like that.
> 
>> And if
>> interpretations are contexts, then all logics ever invented have been
>> context logics, so why did context logicians feel that they had any
>> need to invent them again? And what did Guha get his doctorate for,
>> if all the work he did was already somehow inside the standard Tarski
>> model theory?)
> 
> Because people often want to do more things with contexts than merely considering the semantics of n graphs separately.  They want to be able to consider the semantics of the *merge* of two graphs, for example, and that indeed does require more than just interpretations.
> 
>> 
>>> -- then different RDF graphs can *already* be interpreted in
>>> different contexts (i.e., according to different RDF
>>> interpretations) under the *existing* RDF Semantics, because as we
>>> have laboriously agreed, **the existing RDF Semantics allows
>>> different interpretations to be applied to different RDF graphs**.
>> 
>> I hope I did not agree to that as stated. The semantics allows for
>> different interpretations, and it defines the truth of a graph in an
>> interpretation. Each interpretation determines the truth-value of
>> *all* graphs. The semantics does not mention, and does not provide
>> any way to make sense of the idea of, applying one interpretation to
>> one graph and a different interpretation to another graph. That idea
>> is a pure figment of your imagination.
> 
> The semantics does not have to mention how to do that.  If I define a function f from two integers, f(x,y) = x+y, I do not have to "provide a way" for you to apply it to multiple <integer, integer> pairs.  You are perfectly free to do so!
> 
> Similarly, in essence the RDF Semantics specification defines a function -- call it RS -- that, given any interpretation I and any RDF graph G, determines the truth-value of I(G).  The definition of RS does not have to give anyone permission or "provide a way" to apply that function to multiple <interpretation, graph> pairs.  RS may be applied to *any* number of interpretation-graph pairs, in full conformance with its definition.
> 
>> 
>>> [Actually, to be slightly more technical, in this approach to
>>> contexts it would be better to view a context as a *set* of
>>> interpretations, rather than a single interpretation, because it is
>>> still useful to talk about the set of satisfying interpretations
>>> for an RDF graph, subject to a particular context.  But that's an
>>> unimportant detail at the moment.]
>>> 
>>> Indeed, people *already* use RDF graphs in this way: intentionally
>>> keeping graphs separate if they come from different perspectives,
>>> sources, provenance, etc., *because* the graphs may cause
>>> inconsistencies or lead to incorrect conclusions if merged
>>> injudiciously.
>> 
>> That is true, but...
>> 
>>> Knowingly or not, different interpretations are being used for
>>> different graphs.
>> 
>> ...that does not follow. Consider: if the URIs referred differently
>> in the various graphs, then these graphs would *not* cause
>> inconsistencies if taken together. The inconsistencies arise
>> precisely *because* we all assume that a given URI denotes the same
>> thing in every graph in which it occurs, and that interpretations
>> apply across graphs.
> 
> When the graphs are merged then the assumption is made that a given URI denotes the same thing in all of the graphs that are being merged, yes.  But different interpretations can be, and often are, used for different graphs when those graphs are *not* merged.
> 
>> 
>>> A simple example is Ian Davis's famous toucan-versus-its-web-page
>>> example,
>>> http://blog.iandavis.com/2010/11/04/is-303-really-necessary/ in
>>> which the same URI "ambiguously" denotes both a toucan and the web
>>> page describing that toucan.  One RDF graph, Gt, may be written
>>> under the assumption that the URI denotes the toucan.  Another
>>> graph, Gp, may be written under the assumption that the URI denotes
>>> the toucan's web page.  Gt may work perfectly well in an
>>> application that merely categorizes different animal species --
>>> *unambiguously* interpreting the URI as denoting the toucan.  And
>>> Gp may work perfectly well in an application that merely lists web
>>> page authors -- *unambiguously* interpreting the URI as denoting
>>> the toucan's web page.  In fact, even the merge of Gt and Gp may
>>> work perfectly well in both applications, provided the RDF authors
>>> have not asserted that toucans are disjoint from web pages!
>> 
>> This is called punning, AKA overloading.
> 
> Well, no, that isn't what I meant.  But it isn't worth pursuing.  I think I've made my points enough above.
> 
> David
> 
>> As you point out, it works
>> up to a point. But this is not context reasoning. In fact, it
>> might be described as un-context reasoning: it is what happens when
>> you mush what should be distinct contexts into a language which does
>> not have a context mechanism to distinguish them. If this were
>> written in a real context logic, then you would have explicit
>> contexts for the two distinct meanings.
>> 
>>> Applications that consume RDF and faithfully follow the RDF
>>> Semantics are free to choose their own interpretations, and this is
>>> A Good Thing.
>>> 
>>> On the other hand, an application that needs to distinguish between
>>> web pages and animal species will find this toucan/webpage URI
>>> hopelessly ambiguous, and will not be at all happy with the merge
>>> of Gt and Gp. This is why I have been pointing out (in other
>>> conversations) that ambiguity is *relative* to the application: a
>>> URI may be unambiguous to some applications, but ambiguous to
>>> others.
>>> 
>>> But although the existing RDF Semantics does support this simple
>>> "poor person's" notion of context-as-interpretation (or
>>> context-as-set-of-interpretations), it does not support other basic
>>> features that one would expect in a context-aware semantics.  Most
>>> notably, it does not support the ability to retain contextual
>>> differences when RDF graphs are merged.  To account for that and a
>>> few other basic operations, the RDF Semantics would indeed have to
>>> be extended, as you and a few others have proposed.
>>> 
>>> [ . . . ]
>>>>>> The semantic rules simply specify when a graph (any graph)
>>>>>> is true in an interpretation (any interpretation). But
>>>>>> interpretations are not defined "on" graphs: they are
>>>>>> mappings from *names* to things. That does not mention graphs
>>>>>> at all. So to then start talking about one graph in one
>>>>>> interpretation and another graph in another interpretation
>>>>>> simply misses the point. The fact that there are many
>>>>>> possible interpretations is a reflection of the fact that we
>>>>>> typically are in a state of doubt about what the names (URIs)
>>>>>> actually refer to.
>>>>> 
>>>>> That sounds dangerously close to falling into the trap of
>>>>> assuming that there really is only one, global, correct
>>>>> interpretation.  And it is *my* interpretation, of course.  ;)
>>>> 
>>>> It is not anyone's interpretation: it is the fact of the matter.
>>>> Of course, we don't have access to the facts, only our
>>>> representations of them, which allow many interpretations.
>>> 
>>> I don't see how this "fact of the matter" has any relevance
>> 
>> I probably shouldn't have mentioned it, as it seems to have been more
>> confusing than enlightening. (It is the standard way to understand
>> model theory when discussing mathematics, eg people talk about
>> "standard arithmetic", meaning the single "correct" way to interpret
>> formal arithmetics.)
>> 
>> Pat
>> 
>> 
>>> , since: (a) the RDF Semantics says nothing about it; and (b) it is
>>> an application's own business what interpretations it chooses to
>>> use.  If that application causes some harm by applying the wrong
>>> interpretation, then the owner may be liable, but that is an
>>> entirely separate issue from the question of compliance with the
>>> RDF Semantics.
>>> 
>>> David

------------------------------------------------------------
IHMC                                     (850)434 8903 or (650)494 3973   
40 South Alcaniz St.           (850)202 4416   office
Pensacola                            (850)202 4440   fax
FL 32502                              (850)291 0667   mobile
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
Received on Tuesday, 26 March 2013 09:52:40 UTC