Re: owl:sameAs - Harmful to provenance? from David Booth on 2013-04-08 (public-semweb-lifesci@w3.org from April 2013)

From: David Booth <david@dbooth.org>
Date: Mon, 08 Apr 2013 14:42:43 -0400
To: Oliver Ruebenacker <curoli@gmail.com>
CC: Pat Hayes <phayes@ihmc.us>, Peter Ansell <ansell.peter@gmail.com>, Alan Ruttenberg <alanruttenberg@gmail.com>, public-semweb-lifesci <public-semweb-lifesci@w3.org>
Message-ID: <51630FA3.60009@dbooth.org>

Hi Oliver,

On 04/08/2013 11:55 AM, Oliver Ruebenacker wrote:
>
>       Hello David, all,
>
>    What I hear you saying is primarily that:
>
>    1. It is possible to have sets of assertions such that each set is
> consistent, but the union is contradictory.

Yes, even with all parties acting intelligently and in good faith.
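
To make that concrete, here is a minimal sketch in plain Python. It is only an illustration, not anything taken from the RDF specs. The URIs, the property ex:boilingPointC, and the assumption that this property is functional (at most one value per subject) are all hypothetical.

```python
# Hypothetical sketch: a graph is a set of (subject, predicate, object) tuples.
# Assumption: ex:boilingPointC is functional, i.e. at most one value per subject.
FUNCTIONAL = {"ex:boilingPointC"}

def conflicts(graph):
    """Return the (subject, predicate) pairs that carry more than one
    value for a predicate assumed to be functional."""
    seen = {}
    for s, p, o in graph:
        if p in FUNCTIONAL:
            seen.setdefault((s, p), set()).add(o)
    return {key for key, values in seen.items() if len(values) > 1}

# Author A has water at sea level in mind.
graph_a = {("ex:water", "ex:boilingPointC", "100")}
# Author B, working independently, has water at high altitude in mind.
graph_b = {("ex:water", "ex:boilingPointC", "93")}

print(conflicts(graph_a))            # no conflict on its own
print(conflicts(graph_b))            # no conflict on its own
print(conflicts(graph_a | graph_b))  # the merged graph has a conflict
```

Neither author made an error relative to their own use case; the conflict only appears in the merge.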

>    2. If I don't know the meaning of these assertions, I can't prove
> that they are unjustified.

That may be true enough, but that's *not* what I'm saying.  If by 
"meaning" you are referring to someone's real world interpretation, then 
I do not see the meaning or justification of assertions as being at all 
relevant to the technical question of whether owl:sameAs was misused. 
The RDF Semantics is explicitly agnostic about what any assertions 
"mean".  AFAICT, what they might "mean" is subjective or relative: it 
all depends on the application in which they are used.

>
>    That's pretty obvious. It has nothing to do with RDF. It is true for
> any sufficiently powerful way of making assertions (e.g. natural
> language, math, type declarations in programming languages, ...).

I suspect that's true, but our present context is RDF / Semantic Web.

>
>    It is therefore at best misleading to point to these issues and talk
> about them as if they were an artefact of the RDF specs.

I never intended to imply that these issues are *unique* to RDF.  I 
apologize if I gave that impression.  But they are direct consequences 
of the RDF specs.

>
>    So what most people here are saying is that before we can do anything
> useful, we need to make sure that if two assertions use the same
> reference, they mean the same thing.

That would be an impossibly high bar in the vast majority of cases.  
There is no way to ensure that two well-meaning RDF authors, acting 
independently without knowledge of each other, will use a URI to mean 
exactly the same thing.  Heck, misunderstandings occur even between 
parties that know each other well and communicate directly!  We cannot 
wait to reach nirvana before we start using this stuff.

>
>    To which you respond that you will accept assertions without assuming
> that same references mean same things. You will just keep them separate.
> There is no rule against that.

Yes!  But first let me reiterate that I *agree* with the goal that a URI 
should always denote the same thing.  I am merely pointing out the 
inherent impossibility of reaching this goal, so that we can better 
understand what's going on and learn how to deal with it.

>
>    But in what way is this useful?

The key, to my mind, is to recognize that although a URI's resource 
identity may never be completely unambiguous, its ambiguity can be 
*bounded*.  That is precisely what a URI definition does.  And we can 
still use that URI for writing useful RDF datasets.  Sometimes the URI 
owner will, by luck or good judgement, choose a URI definition such that 
RDF datasets based on that URI, and consistent with its definition, will 
often merge without conflict.  And sometimes not.

There is also an inherent balancing act at play, because the more 
precise a URI definition is -- i.e., the more constraints that are 
imposed -- the less flexible and reusable it is.  So although a more 
constrained URI definition may prevent some downstream conflicts between 
RDF datasets, it may do so at the expense of some applications that 
would have otherwise been able to use that URI if its definition had 
been looser.
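
One crude way to picture this trade-off, as a sketch only: treat a URI definition as a list of constraints, and say a dataset is consistent with the definition if every constraint accepts it. The constraint functions, the ex: names, and the two datasets below are hypothetical and stand in for whatever a real URI definition would say.

```python
# Hypothetical sketch: a "URI definition" is modelled as a list of constraint
# functions; a dataset conforms if every constraint accepts it.
def conforms(dataset, definition):
    return all(constraint(dataset) for constraint in definition)

def has_name(dataset):
    # Constraint: the dataset must give the resource some ex:name.
    return any(p == "ex:name" for _, p, _ in dataset)

def boils_at_100(dataset):
    # Constraint: any stated boiling point must be exactly "100".
    values = {o for _, p, o in dataset if p == "ex:boilingPointC"}
    return values <= {"100"}

loose = [has_name]                # fewer constraints, more reusable
tight = [has_name, boils_at_100]  # more constraints, fewer merge conflicts

sea_level = {("ex:water", "ex:name", "water"),
             ("ex:water", "ex:boilingPointC", "100")}
high_altitude = {("ex:water", "ex:name", "water"),
                 ("ex:water", "ex:boilingPointC", "93")}

for label, definition in (("loose", loose), ("tight", tight)):
    print(label, conforms(sea_level, definition),
          conforms(high_altitude, definition))
```

Under the loose definition both datasets may use the URI, and their merge can conflict.  Under the tight definition that conflict cannot arise, but only because the high-altitude application is shut out.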

Thus, we need to learn to make good trade-offs between tight and loose 
URI definitions.  And we need to learn to accept and deal with conflicts 
when they occur instead of assuming that someone did something wrong and 
rushing to blame.  Just because a particular RDF merge that *you* want 
to use for *your* application does not work, that does *not* necessarily 
mean that someone else did anything wrong.  It may just mean that they 
had different use cases in mind.

David Booth

Received on Monday, 8 April 2013 18:43:15 UTC