Re: Managing Co-reference (Was: A Semantic Elephant?)

From: Michael F Uschold <uschold@gmail.com>
Date: Thu, 15 May 2008 09:35:06 -0700
Message-ID: <406b38b50805150935r50dcaa2bx90a0359791bada62@mail.gmail.com>
To: "Frederick Giasson" <fred@fgiasson.com>
Cc: "Yves Raimond" <yves.raimond@gmail.com>, "Aldo Gangemi" <aldo.gangemi@cnr.it>, "Richard Cyganiak" <richard@cyganiak.de>, "Tim Berners-Lee" <timbl@w3.org>, "Sören Auer" <auer@informatik.uni-leipzig.de>, "Semantic Web Interest Group" <semantic-web@w3.org>, "Chris Bizer" <chris@bizer.de>, "Frank van Harmelen" <Frank.van.Harmelen@cs.vu.nl>, "Kingsley Idehen" <kidehen@openlinksw.com>, "Fabian M. Suchanek" <f.m.suchanek@gmail.com>, "Tim Berners-Lee" <timbl@csail.mit.edu>, "Jim Hendler" <hendler@cs.rpi.edu>, "Mark Greaves" <markg@vulcan.com>, georgi.kobilarov@gmx.de, "Jens Lehmann" <lehmann@informatik.uni-leipzig.de>, "Michael Bergman" <mike@mkbergman.com>, "Conor Shankey" <cshankey@reinvent.com>, "Kira Oujonkova" <koujonkova@reinvent.com>
FredG says:

Exactly; there is nothing bad about owl:sameAs in itself. It does what it
has to do: it asserts that two resources are *individuals* that have the
same identity. The OWL spec is clear about that:

"The built-in OWL property owl:sameAs links an individual to an individual.
Such an owl:sameAs statement indicates that two URI references actually
refer to the same thing: the individuals have the same "identity"."

There is nothing wrong with using this property. Sometimes it's the best
property to use, other times less so.

No construct is good or bad; it is only useful or not in a given context. I
agree with Aldo's view that the way owl:sameAs was being used in some of the
extended DBpedia datasets will have undesirable consequences.  Anyone
loading this data will have to decide whether to change or remove these
assertions.

The musician example is a good one.  If you use owl:sameAs, then if you
query the properties of either URI, you get *all the properties of both
URIs*, because owl:sameAs makes them both refer to the same individual.  If
that is the behavior you want, then owl:sameAs is a good choice. If not,
then using owl:sameAs will have undesirable consequences.
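
To make that consequence concrete, here is a minimal sketch in Python. It is
an illustration only, not how any particular reasoner works: the ex: property
names and their values are hypothetical, and the helper functions are made up
for this example. Only the two artist URIs come from this thread.

```python
from collections import defaultdict

SAME_AS = "owl:sameAs"
JAMENDO = "http://dbtune.org/jamendo/artist/5"
MBRAINZ = "http://zitgist.com/music/artist/0781a3f3-645c-45d1-a84f-76b4e4decf6d"

# Hypothetical triples: the ex: properties and their values are invented.
TRIPLES = [
    (JAMENDO, "ex:tag", "rock"),
    (JAMENDO, "ex:track", "ex:track1"),
    (MBRAINZ, "ex:member", "ex:person1"),
    (JAMENDO, SAME_AS, MBRAINZ),
]


def same_as_classes(triples):
    """Group URIs into equivalence classes (sameAs is symmetric and transitive)."""
    parent = {}

    def find(x):
        # Union-find lookup with path compression.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for s, p, o in triples:
        if p == SAME_AS:
            parent[find(s)] = find(o)

    classes = defaultdict(set)
    for uri in list(parent):
        classes[find(uri)].add(uri)
    return find, classes


def properties_of(uri, triples, follow_same_as=True):
    """Return (property, value) pairs for uri, optionally merged over sameAs."""
    aliases = {uri}
    if follow_same_as:
        find, classes = same_as_classes(triples)
        aliases |= classes[find(uri)]
    return {(p, o) for s, p, o in triples if s in aliases and p != SAME_AS}


if __name__ == "__main__":
    # With sameAs applied, both URIs answer with the union of the properties.
    print(sorted(properties_of(MBRAINZ, TRIPLES)))
    # Without it, each URI keeps only what its own source said.
    print(sorted(properties_of(MBRAINZ, TRIPLES, follow_same_as=False)))
```

With the sameAs link followed, querying either URI returns the tags, the
tracks and the members together; with it ignored, each source's claims stay
separate. Which of the two you want is exactly the decision to make.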

It's that simple. What is harder is:

   - getting people to understand exactly what the consequences are,
   - motivating them to carefully consider the consequences before they make
   a decision, and to document that decision.

PUNCH LINE:

   - owl:sameAs is a very powerful construct and can easily give undesirable
   results
   - people who use it should clearly understand what it means and what the
   inferential consequences of its use are
   - the rationale for its use should be documented if possible, especially
   if the author recognizes that it is a gray area and is being chosen as
   the lesser of two evils
      - less evil: use sameAs to link two closely related concepts
      - more evil: don't use sameAs, which may be technically the right
      thing to do, but then some things don't happen that you DO want.
   - hence, there may be a need for a less powerful construct than
   sameAs, but I can't think what it might be, or what its inferential
   consequences should be.


Michael

On Thu, May 15, 2008 at 6:10 AM, Frederick Giasson <fred@fgiasson.com>
wrote:

> Hi,
>
>>> Agreed. I do not want to be picky about that: SW is Web, and errors are
>>> life. There is just no need to use owl:sameAs in many cases, and at
>>> least in large LOD projects, this can be avoided easily.
>>>
>>
>> Sorry to jump in the middle of this discussion, but I don't
>> particularly agree with that. There are plenty of cases where they
>> can't really be avoided, even in large LOD projects.
>> For example, http://dbtune.org/jamendo/artist/5 and
>> http://zitgist.com/music/artist/0781a3f3-645c-45d1-a84f-76b4e4decf6d
>> identify the same artist. One of them in the Jamendo database, and one
>> of them in Musicbrainz.
>>
>> Both databases hold *really* different types of information about these
>> artists. Musicbrainz holds detailed editorial information (regardless
>> of their publication in the Jamendo Creative Commons platform),
>> information about the members of this band and their birth dates, etc.
>> Jamendo holds actual audio items, and also a set of tags for each of them.
>>
>> As a URI is not only an identifier but also a way to access a
>> specific representation, how could I use a single URI in this case? In
>> other words, how would I avoid the owl:sameAs between the two?
>>
>> Different data sources make different claims about similar things, and
>> we need both a way to access these claims and to keep the cross-source
>> identity. I think owl:sameAs is quite a nice way of doing that.
>>
>>
>
>
> Exactly; there is nothing bad about owl:sameAs in itself. It does what it
> has to do: it asserts that two resources are *individuals* that have the
> same identity. The OWL spec is clear about that:
>
> "The built-in OWL property owl:sameAs links an individual to an individual.
> Such an owl:sameAs statement indicates that two URI references actually
> refer to the same thing: the individuals have the same "identity"."
>
> There is nothing wrong with using this property. Sometimes it's the best
> property to use, other times less so.
>
>
> Another good example where sameAs fits is: Yago and DBpedia.
>
>
> Both data sources are derived from Wikipedia. Both describe the same
> individuals in different ways. However, can we assert that
> yago:Abraham_Lincoln doesn't have the same identity as
> dbpedia:Abraham_Lincoln? It would be hard, considering that they are
> different RDF representations of the same HTML representation. However, are
> the *identities* of these *individuals* the same? I do think so, yes, even
> if they don't share the same properties. In fact, we can't assert that they
> don't have the same identity just because they don't share the same
> properties (the reality is that I know assertions about individual A and
> individual B based on the information I have in hand. I couldn't assert
> that I have *all* the information about these individuals, so I have to
> work with what I have). That said, the only thing we can say is: according
> to our knowledge, they don't share the same properties. However, nothing
> tells us that these properties are not defined elsewhere (thanks to the
> open-world assumption).
>
> In some cases... there will be issues, and inconsistencies. Is it a
> problem? Sure it is. But as Richard said: we have to live with it. But what
> is great is that we can still create much value out of these inconsistencies
> :)
>
>
> But be careful here. I am not saying that sameAs is good everywhere and for
> all situations. But it is certainly good where it fits! For the other cases,
> we created umbel:isLike for our own purposes (documentation & full ontology
> of UMBEL will be published soon; just stay tuned ;) )
>
>
> My two cents
>
>
>
> Take care,
>
>
> Fred
>
>
>
>
Received on Thursday, 15 May 2008 16:35:47 GMT
