Re: owl:sameAs links from OpenCyc to WordNet

Dan Brickley wrote:
> On 23/2/09 22:24, Mike Bergman wrote:
> 
>> David Baxter wrote:
> 
>>> We at Cycorp have been publishing owl:sameAs links from our OpenCyc
>>> concepts to WordNet synsets, e.g.
>>>
>>> <http://sw.opencyc.org/2008/06/10/concept/en/India> owl:sameAs
>>> <http://www.w3.org/2006/03/wn/wn20/instances/synset-India-noun-1>
>>>
>>> We've done so with the idea that the WordNet synset represents the
>>> same concept as the OpenCyc term (i.e. the South Asian country in this
>>> case), and contains further relevant information that complements what
>>> is available in OpenCyc, e.g.
>>>
>>> "is a member of OPEC" (OK, this one's of dubious value, but it might
>>> be useful if it were true)
>>> "is a member of the British Commonwealth"
>>> "is a part of Asia"
>>>
>>> However, WordNet also contains assertions about the "India" synset
>>> that seem strange to assert about the country, e.g.
>>>
>>> "is an instance of NounSynset"
>>> "contains WordSense 'Republic of India 1'"
>>>
>>> We'd like to know what the general feeling in the LOD community is
>>> about these links. Is there any precedent or consensus about the best
>>> way to link from ontologies such as OpenCyc's to WordNet? Is anyone
>>> finding these links useful and/or harmful?
>>>
>>> Thanks for any input.
>>
>> I've rolled back to your starting message since intervening comments
>> have unfortunately snipped out the essence of your question about
>> owl:sameAs.
> 
> Maybe we lack agreement on what the essence was!
> 
>> Let me also again add this link from over the weekend that I
>> think is also germane:
>>
>> http://i9606.blogspot.com/2009/02/semantic-dissonance-in-uniprot.html
>>
>> As I understand the current OWL, "an owl:sameAs statement indicates that
>> two URI references actually refer to the same thing: the individuals
>> have the same 'identity.'" [1]. In logical terms, I understand this to
>> represent complete and total identity, equivalent to the '='
>> relationship, or something pretty doggone close to it. I also understand
>> this property to perhaps have the strongest entailment of any OWL 
>> property.
> 
> Yup, owl:sameAs is for when there's only one thing, not two similar or 
> related things.
> 
>> The inference from your use case and the similar issue with Ben's
>> uniprot example are all too typical of sameAs problems once disparate
>> datasets actually get pulled together.
>>
>> I appreciate the rdfs:seeAlso suggestion; it is the most common fallback.
>> But the issue with that one, which is why you went to sameAs in the
>> first place, is that seeAlso is way too weak to convey the nature of the
>> relationship. Sure, we could do a subPropertyOf but we could at best
>> capture only the very weak semantics that seeAlso presently provides; we
>> could not strengthen it.
>>
>> I think the real issue is that we don't have a readily available (or at
>> least accepted) predicate. I would suggest, though, that the issue at
>> hand is very much captured by the concept of "relative identity":
>>
>> http://plato.stanford.edu/entries/identity-relative/
>>
>> esp. Section 3 (though there are some wonderful paradoxes throughout).
>>
>> What I like about 'relative identity' is that we can still infer and
>> reason over the relationship (though *how*, and whether weakly or
>> strongly, is still up for grabs).
>>
>> I think the considerable experience of Cycorp in such matters could be
>> invaluable in severing this Gordian knot. Care to stroll deeper into the
>> den?
>>
>> A hasRelativeIdentity B ??
> 
> Interesting, but I think in this case we're talking about modeling some 
> lightweight linguistics data, and linking it to the classes the natural 
> language words are words for. Talk of identity is a bit of a distraction 
> here. This is due to the modeling style chosen for the W3C Wordnet RDF 
> representation, nothing more. If it were a class-centric projection of 
> Wordnet into RDF, we'd be having quite a different discussion.
> 

Au contraire!  The issue here is the predicate, sameAs, which itself has 
an identity assumption in its semantics.

While your earlier point about WordNet and its purpose as a language 
and linguistics model (and, thus, clearly not a knowledge base in the 
same vein as OpenCyc) was absolutely true, the identity issue arises 
as soon as the OpenCyc world view attempts to establish an "identity" 
relationship with the conceptual linguistic view within WordNet.

The broader issue, still, is what happens in the simplest inference 
engines out there: they follow sameAs links as true identity, so every 
assertion about one resource is entailed for every resource it is 
linked to.
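
To make that concrete, here is a minimal sketch in Python of what such a 
naive engine effectively does. It is not any particular reasoner's code. 
The "ex:" names are made-up shorthand for the OpenCyc and WordNet 
resources in David's example, and naive_sameas_closure is a hypothetical 
helper: it groups resources joined by owl:sameAs and copies every other 
assertion to each member of the group.

```python
from collections import defaultdict

SAME_AS = "owl:sameAs"

# Toy triples paraphrasing David's example. The "ex:" names are
# illustrative shorthand, not the real OpenCyc or WordNet URIs.
triples = {
    ("ex:cyc-India", SAME_AS, "ex:wn-synset-India"),
    ("ex:cyc-India", "rdf:type", "ex:Country"),
    ("ex:wn-synset-India", "rdf:type", "ex:NounSynset"),
    ("ex:wn-synset-India", "ex:containsWordSense", "ex:wordsense-Republic-of-India"),
}


def naive_sameas_closure(triples):
    """Treat owl:sameAs as strict identity and return every non-sameAs
    triple, restated for each alias of its subject and object."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge the two ends of every sameAs link into one identity group.
    for s, p, o in triples:
        if p == SAME_AS:
            parent[find(s)] = find(o)

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)

    def aliases(x):
        # Resources never mentioned in a sameAs link are their own group.
        return groups[find(x)] if x in parent else {x}

    inferred = set()
    for s, p, o in triples:
        if p == SAME_AS:
            continue
        for s2 in aliases(s):
            for o2 in aliases(o):
                inferred.add((s2, p, o2))
    return inferred


closure = naive_sameas_closure(triples)
for triple in sorted(closure):
    print(triple)
```

Under these assumptions the country ends up typed as a NounSynset and 
the synset ends up typed as a Country, which is exactly the kind of 
entailment David found strange.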

This is the fragile foundation that is creaking mightily as anyone tries 
to do any meaningful work with any of this linked data.

Do we just want to browse links for things that might be related (even 
there with no consensus), or do we want to do real stuff with this 
information?

Mike

Received on Monday, 23 February 2009 22:18:08 UTC