
Re: Managing Co-reference (Was: A Semantic Elephant?)

From: Frederick Giasson <fred@fgiasson.com>
Date: Thu, 15 May 2008 09:10:23 -0400
Message-id: <482C363F.2000201@fgiasson.com>
To: Yves Raimond <yves.raimond@gmail.com>
Cc: Aldo Gangemi <aldo.gangemi@cnr.it>, Richard Cyganiak <richard@cyganiak.de>, Michael F Uschold <uschold@gmail.com>, Tim Berners-Lee <timbl@w3.org>, Sören Auer <auer@informatik.uni-leipzig.de>, Semantic Web Interest Group <semantic-web@w3.org>, Chris Bizer <chris@bizer.de>, Frank van Harmelen <Frank.van.Harmelen@cs.vu.nl>, Kingsley Idehen <kidehen@openlinksw.com>, "Fabian M. Suchanek" <f.m.suchanek@gmail.com>, Tim Berners-Lee <timbl@csail.mit.edu>, Jim Hendler <hendler@cs.rpi.edu>, Mark Greaves <markg@vulcan.com>, georgi.kobilarov@gmx.de, Jens Lehmann <lehmann@informatik.uni-leipzig.de>, Michael Bergman <mike@mkbergman.com>, Conor Shankey <cshankey@reinvent.com>, Kira Oujonkova <koujonkova@reinvent.com>

>> Agreed. I do not want to be picky about that: SW is Web, and errors are
>> life.
>> Just there is no need to use owl:sameAs in many cases, and at least in LOD
>> large projects, this can be avoided easily.
> Sorry to jump in the middle of this discussion, but I don't
> particularly agree with that. There are plenty of cases where they
> can't really be avoided, even in LOD large projects.
> For example, http://dbtune.org/jamendo/artist/5 and
> http://zitgist.com/music/artist/0781a3f3-645c-45d1-a84f-76b4e4decf6d
> identify the same artist. One of them in the Jamendo database, and one
> of them in Musicbrainz.
> Both databases hold *really* different types of information about these
> artists. Musicbrainz holds detailed editorial information (regardless
> of their publication in the Jamendo Creative Commons platform),
> information about the members of this band and their birth dates, etc.
> Jamendo holds actual audio items, and also a set of tags for each of them.
> As a URI is not only an identifier but also a way to access a
> specific representation, how could I use a single URI in this case? In
> other words, how would I avoid the owl:sameAs between the two?
> Different data sources make different claims about similar things, and
> we need both a way to access these claims and to keep the cross-source
> identity. I think owl:sameAs is quite a nice way of doing that.

Exactly; there is nothing wrong with owl:sameAs in itself. It does what it
has to do: it asserts that two resources are *individuals* that have the
same identity. The OWL spec is clear about that:

"The built-in OWL property owl:sameAs links an individual to an 
individual. Such an owl:sameAs statement indicates that two URI 
references actually refer to the same thing: the individuals have the 
same "identity"."

There is nothing wrong with using this property. Sometimes it is the
best property to use, other times less so.

Another good example where sameAs fits is Yago and DBpedia.

Both data sources are derived from Wikipedia, and both describe the same
individuals in different ways. Can we assert that
yago:Abraham_Lincoln doesn't have the same identity as
dbpedia:Abraham_Lincoln? It would be hard, considering that they are
different RDF representations of the same HTML representation. So
are the *identities* of these *individuals* the same? I do think so, yes,
even if they don't share the same properties. In fact, we can't assert
that they don't have the same identity just because they don't share the
same properties. In reality, I only know assertions about individual
A and individual B based on the information I have at hand; I can't
claim to have *all* the information about these individuals, so I
have to work with what I have. That said, the only thing we can say is:
according to our knowledge, they don't share the same properties.
Nothing tells us that these properties are not defined elsewhere
(thanks to the open-world assumption).
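
To make the point above concrete, here is a minimal sketch in plain Python
of what a consumer could do with such a sameAs link: group the URIs that are
declared identical and pool the claims each source makes about them. The
triples, the ex: property and the helper names (find, merge_descriptions)
are hypothetical and hand-written for illustration; they are not taken from
Yago or DBpedia. The sketch only looks at URIs in subject position and does
no OWL reasoning.

```python
from collections import defaultdict

# Hypothetical, hand-written triples; property names and values are
# illustrative only and are not real Yago or DBpedia data.
TRIPLES = [
    ("yago:Abraham_Lincoln", "owl:sameAs", "dbpedia:Abraham_Lincoln"),
    ("yago:Abraham_Lincoln", "rdf:type", "yago:President"),
    ("dbpedia:Abraham_Lincoln", "ex:birthPlace", "dbpedia:Kentucky"),
]


def find(parent, node):
    """Return the representative URI of the identity class containing node."""
    parent.setdefault(node, node)
    while parent[node] != node:
        parent[node] = parent[parent[node]]  # path halving
        node = parent[node]
    return node


def merge_descriptions(triples):
    """Group subjects linked by owl:sameAs and pool the claims about them."""
    parent = {}
    # First pass: every owl:sameAs statement joins two identity classes.
    for s, p, o in triples:
        if p == "owl:sameAs":
            parent[find(parent, s)] = find(parent, o)
    # Second pass: attach every other claim to its class representative.
    merged = defaultdict(set)
    for s, p, o in triples:
        if p != "owl:sameAs":
            merged[find(parent, s)].add((p, o))
    return dict(merged)


merged = merge_descriptions(TRIPLES)
```

Neither source carries both properties, yet the merged description holds
both. Under the open-world assumption a property missing from one source is
simply unknown there, so taking the union does not contradict anything.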

In some cases there will be issues and inconsistencies. Is it a
problem? Sure it is. But as Richard said: we have to live with it. And
what is great is that we can still create a lot of value out of these
inconsistencies :)

But be careful here. I am not saying that sameAs is good everywhere and 
for all situations. But it is certainly good where it fits! For the 
other cases, we created umbel:isLike for our own purposes (documentation 
& full ontology of UMBEL will be published soon; just stay tuned ;) )

My two cents

Take care,

Received on Thursday, 15 May 2008 13:13:15 UTC
