Re: owl:sameAs - Is it used in a right way? from David Booth on 2013-03-28 (public-semweb-lifesci@w3.org from March 2013)

From: David Booth <david@dbooth.org>
Date: Thu, 28 Mar 2013 14:40:16 -0400
To: Pat Hayes <phayes@ihmc.us>
CC: "public-semweb-lifesci@w3.org HCLS" <public-semweb-lifesci@w3.org>
Message-ID: <51548E90.4090507@dbooth.org>

Hi Pat,

On 03/26/2013 05:52 AM, Pat Hayes wrote:
> Hi David
>
> Sorry if I got a little personal back there, I was getting
> frustrated.

Thanks.  I can understand the frustration of trying to communicate with 
someone who looks at the world differently.  :)   [Insert joke here 
about multiple interpretations.]

>
> So, thinking over our emails and trying to understand what you were
> saying, I think I have it figured out. And your proposal is in fact
> (still not legal according to the RDF specs, but) not entirely daft.
> But you are expressing it wrong, which was why it seemed to be
> entirely daft. So I am going to say it right for you in this email.
> Don't mention it.

:)

>
> It's not the interpretations that you are suggesting to treat as
> contexts, it is the *graphs*. Your idea amounts to treating graphs as
> local contexts for the URI tokens that occur in them. And this is not
> a wholly insane idea, and indeed it has been suggested by several
> other people, including people on the current RDF WG. Some users use
> datasets in this way, treating the URI tokens in the various named
> graphs as functionally independent of one another until they have
> evidence that they are being used with the same meaning.

First of all, I was not making a proposal; I was making an *observation* 
about the *existing* RDF Semantics spec and the *existing* conforming 
use of that spec.  But to understand that observation, you need to be 
able to look at the RDF Semantics spec as a whole, and recognize that:

   (a) in essence, the RDF Semantics spec defines a standard
   function -- call it RS -- for determining the truth value
   of an <interpretation, graph> pair; and

   (b) the particular formal style in which that function was
   defined -- in this case, a model-theoretic style -- is
   **completely irrelevant** to the end result of its definition.
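
A minimal sketch of point (a), offered only as an illustration: RS is written here as an ordinary function from an <interpretation, graph> pair to a truth value. It is a heavy simplification of the RDF Semantics (ground triples only, no literals, blank nodes or entailment regimes), and the names `Interpretation`, `rs`, `bird_view` and `page_view`, as well as the `ex:` URIs, are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (subject URI, predicate URI, object URI)


class Interpretation:
    """Toy interpretation: URI denotations plus property extensions."""

    def __init__(self, denotes: Dict[str, str],
                 extension: Dict[str, Set[Tuple[str, str]]]) -> None:
        self.denotes = denotes      # URI -> resource
        self.extension = extension  # property resource -> {(subject, object)}


def rs(interp: Interpretation, graph: Iterable[Triple]) -> bool:
    """Return True iff every triple in the graph is true under interp."""
    for s, p, o in graph:
        if not all(term in interp.denotes for term in (s, p, o)):
            return False  # a URI outside the interpretation's vocabulary
        pair = (interp.denotes[s], interp.denotes[o])
        if pair not in interp.extension.get(interp.denotes[p], set()):
            return False
    return True


graph = {("ex:toucan", "ex:eats", "ex:fruit")}

# Two interpretations that differ only in what ex:toucan denotes.
bird_view = Interpretation(
    denotes={"ex:toucan": "a bird", "ex:eats": "eats", "ex:fruit": "fruit"},
    extension={"eats": {("a bird", "fruit")}},
)
page_view = Interpretation(
    denotes={"ex:toucan": "a web page", "ex:eats": "eats", "ex:fruit": "fruit"},
    extension={"eats": {("a bird", "fruit")}},
)

print(rs(bird_view, graph))  # True
print(rs(page_view, graph))  # False
```

Nothing in the calling code depends on how `rs` is defined internally, which is the sense of point (b): only the resulting truth values matter.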

If you are unwilling/unable to acknowledge those two basic points, then 
indeed my observation will sound like an improperly stated proposal, and 
no amount of discussion will correct that appearance.

On the other hand, if you are with me so far, or if you can at least 
temporarily suspend your disbelief in those points, there are some 
fundamental observations that follow.

1. Applications correspond to interpretations.  (Well, technically they 
correspond to *sets* of interpretations, but for simplicity let's 
pretend an app corresponds to one interpretation.)  Think of an 
application as a function that maps a graph to an output.  When a 
conforming RDF app takes a graph as input, assumes that graph is true, 
and uses some algorithm (perhaps computing entailments) to produce some 
output based on that graph, in essence it has chosen an interpretation 
to apply to that graph.  That interpretation maps URIs to resources in 
that application's domain of discourse, whatever it may be.

2. Different applications choose *different* interpretations, because 
they have different purposes.  Thus, the *same* graph may contain a URI 
that maps to *different* resources in different 
applications/interpretations.

3. Different RDF authors sometimes use the same URI to denote 
*different* things.  I.e., different RDF authors make different 
assumptions about the interpretations that will/should be applied to the 
graphs that they write.  We may wish they didn't, but they do.  (And 
because ambiguity is inescapable, this is impossible to avoid, so 
there's no point in getting huffy about it.)  Fortunately, we can still 
use the RDF Semantics to help us determine the author's intended 
"meaning" of each graph by recognizing that different graphs require 
different interpretations!  To clarify, we can use the RDF Semantics to 
(separately) determine the entailments and satisfying interpretations 
for each graph.  This is useful!  And it cannot be done under the 
single-interpretation assumption.

4. Just because we can use the RDF Semantics to correctly determine the 
authors' intended "meaning" of two graphs individually -- i.e., 
determine each graph's entailments and satisfying interpretations -- 
this does *not* mean that the *merge* of those two graphs will be 
useful.  If those two graphs have disjoint sets of satisfying 
interpretations, then there cannot be any satisfying interpretations for 
the merge, i.e., the merge is necessarily false.  In other words, even 
though an app may work perfectly on two graphs *individually*, it may 
*not* work on the merge of those two graphs.  This may be quite 
counter-intuitive to those who would assume that if an app works fine on 
one set of RDF data then it should also work fine on a superset of that 
data.
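
Point 4 can be checked by brute force on a toy example. The sketch below assumes a drastically reduced semantics: a graph is a set of (URI, class) type assertions, the class extensions are fixed and disjoint, and an interpretation is just a choice of denotation for each URI. The names `RESOURCES`, `CLASS_EXTENSION` and `satisfying`, and the `ex:` URIs, are hypothetical.

```python
from itertools import product

RESOURCES = ["the_bird", "the_page"]
# Assumed fixed, disjoint class extensions (a stand-in for a disjointness axiom).
CLASS_EXTENSION = {"ex:Bird": {"the_bird"}, "ex:WebPage": {"the_page"}}


def satisfying(graph, uris):
    """Enumerate every assignment of resources to uris that makes graph true."""
    result = set()
    for choice in product(RESOURCES, repeat=len(uris)):
        denotes = dict(zip(uris, choice))
        if all(denotes[s] in CLASS_EXTENSION[c] for s, c in graph):
            result.add(choice)
    return result


uris = ["ex:toucan"]
g1 = {("ex:toucan", "ex:Bird")}     # one author means the bird
g2 = {("ex:toucan", "ex:WebPage")}  # another author means the web page

sat1 = satisfying(g1, uris)
sat2 = satisfying(g2, uris)
sat_merge = satisfying(g1 | g2, uris)

print(sat1)       # {('the_bird',)}
print(sat2)       # {('the_page',)}
print(sat_merge)  # set()
```

For ground graphs like these, the satisfying interpretations of the merge are exactly the intersection of the two individual sets, so disjoint sets leave the merge with none.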

[Interesting sidebar: In essence, although the set of entailments 
increases monotonically as statements are added to an RDF graph, the 
usefulness of those entailments is *non*-monotonic, because the 
usefulness suddenly goes to zero if a statement is added that causes a 
contradiction (i.e., causes the graph to be necessarily false -- having 
no possible satisfying interpretations), because a false premise implies 
everything.  This has implications for the wisdom of including 
disjointness assertions in one's graphs.  If one's goal is to check for 
consistency and detect errors, then disjointness assertions are helpful.  
But if one's goal is to compute useful entailments, they are 
detrimental.  If the reader is wondering how entailments could be useful 
if they are based on data containing an undetected "inconsistency", see 
the example in which a URI "ambiguously" denotes both a toucan and its 
web page:
http://lists.w3.org/Archives/Public/public-semweb-lifesci/2013Mar/0181.html 
]
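
The sidebar's claim that "a false premise implies everything" can be illustrated with a toy entailment check, again under assumptions: graphs are sets of (URI, class) type assertions, class extensions are fixed and disjoint, and `models` and `entails` are hypothetical helper names. Entailment is defined the usual way, as truth in every satisfying interpretation of the premise.

```python
from itertools import product

RESOURCES = ["the_bird", "the_page"]
CLASS_EXTENSION = {"ex:Bird": {"the_bird"}, "ex:WebPage": {"the_page"}}
URIS = ["ex:toucan"]


def models(graph):
    """All denotation assignments for URIS under which graph is true."""
    found = []
    for choice in product(RESOURCES, repeat=len(URIS)):
        denotes = dict(zip(URIS, choice))
        if all(denotes[s] in CLASS_EXTENSION[c] for s, c in graph):
            found.append(denotes)
    return found


def entails(graph, statement):
    """True iff statement holds in every model of graph (vacuously if none)."""
    s, c = statement
    return all(m[s] in CLASS_EXTENSION[c] for m in models(graph))


consistent = {("ex:toucan", "ex:Bird")}
inconsistent = consistent | {("ex:toucan", "ex:WebPage")}

print(entails(consistent, ("ex:toucan", "ex:WebPage")))    # False
print(models(inconsistent))                                # []
print(entails(inconsistent, ("ex:toucan", "ex:WebPage")))  # True
print(entails(inconsistent, ("ex:toucan", "ex:Bird")))     # True
```

Adding the second assertion only adds entailments, so monotonicity holds, but once no model remains every statement is entailed and the entailments stop discriminating between anything.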

5. This has direct consequences for owl:sameAs.  It explains why it is 
useful, as Jeremy suggested, to keep separate graphs that reflect 
different "perspectives" -- i.e., that assume different sets of 
interpretations -- and keep them separate from owl:sameAs assertions. 
This allows you to *choose* which URIs will be joined by owl:sameAs, and 
which RDF assertions will be used with them, by choosing which datasets 
to merge into the graph that you wish to use for a particular 
application, *without* requiring that the merge of all possible graphs 
be consistent, and *without* violating RDF Semantics.
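
One way to picture point 5 in code, as a sketch rather than a recommended implementation: keep each "perspective" and each set of owl:sameAs links in its own named graph, and let every application pick which ones to merge. The dataset contents, the graph names, and the helpers `merge` and `smush` are hypothetical; `smush` applies only a naive canonicalization of owl:sameAs classes, not full OWL reasoning.

```python
SAME_AS = "owl:sameAs"


def merge(dataset, names):
    """Union of the selected named graphs (ground triples, so union = merge)."""
    merged = set()
    for name in names:
        merged |= dataset[name]
    return merged


def smush(graph):
    """Rewrite each URI to the smallest URI in its owl:sameAs class."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for s, p, o in graph:
        if p == SAME_AS:
            root_s, root_o = find(s), find(o)
            if root_s != root_o:
                parent[max(root_s, root_o)] = min(root_s, root_o)
    return {(find(s), p, find(o)) for s, p, o in graph if p != SAME_AS}


dataset = {
    "bird_facts": {("a:toucan", "ex:eats", "ex:fruit")},
    "page_facts": {("b:toucan", "ex:modifiedIn", "ex:year2013")},
    "links": {("a:toucan", SAME_AS, "b:toucan")},
}

# An app that only cares about birds leaves the sameAs links out.
birds_only = smush(merge(dataset, ["bird_facts"]))

# An app that is happy to conflate the two URIs opts in to the links.
conflated = smush(merge(dataset, ["bird_facts", "page_facts", "links"]))

print(birds_only)
print(sorted(conflated))
```

Because the owl:sameAs assertions live in their own graph, neither application needs the merge of all three graphs to be consistent; each one only needs the subset it chose.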

I hope this helps to clarify the observations that I was trying to explain.

David Booth
Received on Thursday, 28 March 2013 18:40:46 UTC