Re: owl:sameAs - Harmful to provenance?

Dropping Jim from cc in deference to him finishing his defense.

On Wed, Apr 3, 2013 at 9:58 PM, David Booth <david@dbooth.org> wrote:

> On 04/02/2013 05:02 PM, Alan Ruttenberg wrote:
>
>> On Tuesday, April 2, 2013, David Booth wrote:
>>     On 03/27/2013 10:56 PM, Pat Hayes wrote:
>>         On Mar 27, 2013, at 7:32 PM, Jim McCusker wrote:
>>
>>             If only owl:sameAs were used correctly...
>>
>>         Well, I agree that is a problem, but don't draw the conclusion
>>         that there is something wrong with sameAs, just because people
>>         keep using it wrong.
>>
>>     Agreed.  And furthermore, don't draw the conclusion that someone has
>>     used owl:sameAs wrong just because you get garbage when you merge
>>     two graphs that individually worked just fine.  Those two graphs may
>>     have been written assuming different sets of interpretations.
>>
>> In that case I would certainly conclude that they have used it wrong.
>> Have you not been reading what Pat and I have been writing?
>>
>
> I've read lots of what you and Pat have written.  And I've learned a lot
> from it -- particularly in learning about ambiguity from Pat.  And I'm in
> full agreement that owl:sameAs is *often* misused.
>
> But I don't believe that getting garbage when merging two graphs that
> individually worked fine *necessarily* indicates that owl:sameAs was
> misused -- even when it appears on the surface to be causing the
> problem.


The word misuse is tricky here. If each acted without knowledge of the
other, what you describe can certainly arise. However, that doesn't change
the fact that in the end someone is wrong. They can't both be right,
because in the end the whole is judged by the same interpretation. Now, if
they are wrong in the end, figuring out the source of the mistake is
interesting. Perhaps in the final analysis one person would call it misuse
and another would not. Certainly, if both graphs had the same author,
calling it misuse would be justified.


> Here's a simple example to illustrate.
>
> Using the following prefixes throughout, for brevity:
>
>   @prefix :    <http://example/owen/> .
>   @prefix owl: <http://www.w3.org/2002/07/owl#> .
>
> Suppose that Owen is the URI owner of :x, :y and :z, and Owen
> defines them as follows:
>
>   # Owen's URI definition for :x, :y and :z
>   :x a :Something .
>   :y a :Something .
>   :z a :Something .
>
> That's all.  That's Owen's entire definition of those URIs.
> Obviously this definition is "ambiguous" in some sense.  But as
> we know, ambiguity is ultimately inescapable anyway, so I have
> merely chosen an example that makes the ambiguity obvious.
> As the RDF Semantics spec puts it: "It is usually impossible
> to assert enough in any language to completely constrain the
> interpretations to a single possible world".
>

That's fine. But I'll judge the actions that follow based on the fact that
the above is the only information known.


>
> Arthur, an RDF author, publishes the following graph, G1,
> making certain assumptions about the interpretations that will
> be applied to it:
>
>   # G1
>   :x owl:sameAs :y .
>

That's unjustified. There's just no basis for this assertion.


> Aster, another RDF author, publishes the following graph, G2,
> making certain other assumptions about the interpretations
> that will be applied to it:
>
>   # G2
>   :x owl:differentFrom :z .
>

As above.

>
> Alfred, a third RDF author, publishes the following graph, G3,
> making still other assumptions about the interpretations that
> will be applied to it:
>
>   # G3
>   :y owl:sameAs :z .
>

As above.


> Note that G1, G2 and G3 are all individually consistent with
> Owen's URI definition.


Correct; however, none of them is justified. I would call all the authors
irresponsible.


> Furthermore, G1, G2 and G3 are all
> pair-wise consistent: there exists at least one satisfying
> interpretation for the merge of each pair.  But the merge
> of G1, G2 and G3 is not consistent: Arthur, Aster and Alfred
> made different assumptions about the set of interpretations
> that would be applied to their graphs, and the intersection
> of those sets was empty.
>
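
For what it's worth, consistency claims of this kind can be checked
mechanically. Below is a minimal Python sketch, assuming the graphs contain
only owl:sameAs and owl:differentFrom statements between named individuals
(the helper names is_consistent and report are made up for illustration and
are not part of any RDF library). It merges sameAs statements into
equivalence classes and then checks that no differentFrom pair falls into
the same class. It covers two readings of G3, since the three-way clash
depends on which one is meant.

```python
from itertools import combinations

SAME = "owl:sameAs"
DIFF = "owl:differentFrom"

# Triples as (subject, predicate, object), using the prefixed names above.
G1 = [(":x", SAME, ":y")]
G2 = [(":x", DIFF, ":z")]
# Two hypothetical readings of the third graph, for comparison.
G3_SAME = [(":y", SAME, ":z")]
G3_DIFF = [(":y", DIFF, ":z")]


def is_consistent(triples):
    """True if no differentFrom pair is forced equal by sameAs statements.

    Assumption: only sameAs/differentFrom between named individuals matter;
    every other kind of triple is ignored.
    """
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    # Merge the equivalence classes induced by owl:sameAs.
    for s, p, o in triples:
        if p == SAME:
            parent[find(s)] = find(o)

    # Every owl:differentFrom pair must land in distinct classes.
    return all(find(s) != find(o) for s, p, o in triples if p == DIFF)


def report(graphs):
    """Return (all pairs consistent, full merge consistent)."""
    pairwise = all(is_consistent(a + b) for a, b in combinations(graphs, 2))
    merged = is_consistent([t for g in graphs for t in g])
    return pairwise, merged


print(report([G1, G2, G3_SAME]))  # (True, False)
print(report([G1, G2, G3_DIFF]))  # (True, True)
```

Under these assumptions the pattern "pair-wise consistent, jointly
inconsistent" only shows up when G3 identifies :y with :z. If G3 says
:y owl:differentFrom :z, the merge of all three graphs is satisfiable
(:x and :y denote one thing, :z another).
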
> Did Arthur misuse owl:sameAs?


Yes.


> What if Aster never
> published G2?


Same answer.


> How could Aster's graph possibly affect the
> question of whether *Arthur* misused owl:sameAs?


It doesn't.


> It would be nonsensical to assume that it could.


Not really. What you are missing is that when publishing out in the world,
it isn't responsible for folks to make willy-nilly unjustified assumptions
like these. Just as we don't consider it socially acceptable to shout
"fire" in a crowd without adequate cause, these assertions, made without
adequate justification, cause mayhem and damage of a different sort.


> What if Owen later
> said that Arthur was correct, that :x == :y ?  What if he
> later said the opposite?  Again, it would seem rather bizarre
> to say that the determination of whether Arthur had misused
> owl:sameAs could be changed -- long after Arthur had written
> G1 -- by Owen's later statements.
>

Arthur misused owl:sameAs because he had no justification for making the
assertion. It doesn't matter if he was right or wrong. He was shooting
arrows with a blindfold.


> One might claim that Arthur misused owl:sameAs because Owen
> had not specified whether :x was the same or different from
> :y or :z, and therefore Arthur had improperly *guessed* about
> the value of :x's owl:sameAs property.
>

That's what I claim.

> But by that logic, Arthur would not be able to assert *anything*
> new about :x.  I.e., Arthur would not be allowed to assert
> any property whose value was not already entailed by Owen's
> definition!  And that would render RDF rather pointless.
>

I'm sorry, I'm not clever enough to make the leap you make here. If you
want to demonstrate that, please give a realistic example and we can talk
about that.


> Maybe someone can see a way to avoid this dilemma.  Maybe
> someone can figure out a way to distinguish between the
> "essential" properties that serve to identify a resource, and
> other "inessential" properties that the resource might have.
> If so, and the number of "essential" properties is finite,
> then indeed this problem could be avoided by requiring every
> URI owner to define all of the "essential" properties of the
> URI's denoted resource, or by prohibiting anyone but the URI
> owner from asserting any new "essential" properties of the
> resource (beyond those the URI owner had defined).  Or maybe
> there is another way around this dilemma.
>
> Unless some way around this dilemma is found, it seems
> unreasonably judgemental to accuse Arthur of misusing
> owl:sameAs in this case, since he didn't assert anything
> that was inconsistent with Owen's URI definition.


I don't accept as OK just anything that isn't inconsistent with what's
logically asserted, which is what you appear to be suggesting the criterion
should be.

First, there's a connection between what's asserted and what is the case.
When documentation around a URI is good (and one should attempt to make it
so) then statements that are logically consistent but at odds with the
English documentation will be considered wrong. If the documentation is
lacking, or demonstrative of insufficient grasp of the subject matter, then
it's probably not worth working with the term in the first place.

-Alan

Received on Thursday, 4 April 2013 05:44:23 UTC