Re: Managing Co-reference (Was: A Semantic Elephant?)

From: Bernard Vatant <bernard.vatant@mondeca.com>
Date: Thu, 15 May 2008 14:37:44 +0200
Message-ID: <482C2E98.2000604@mondeca.com>
Cc: Semantic Web Interest Group <semantic-web@w3.org>

[To sw list only - assuming all cc's are on this list - if not, they 
should :)]

Richard, Aldo, et al.

I understand Richard might feel misquoted here, and I want to express
some solidarity with him. Granted, we began with a strong disagreement
on this issue, which, as some of you may be aware, has been my
recurrent and somewhat obsessive topic of reflection for quite some
time (hubjects and such). But we came to a kind of pragmatic consensus,
along the lines correctly recalled by Richard here, which clearly do
not include using owl:sameAs ad libitum to indicate any kind of
similarity.

That said for the historical record of the debate, my reflection on
this has moved a few inches forward since the blog post quoted by Aldo.
I clearly agree with Aldo that owl:sameAs is at risk of being used for
want of proper vocabularies to express different levels of similarity,
and that we need some expressivity between the absolute sameness of
owl:sameAs and the absolute fuzziness of rdfs:seeAlso or skos:related.
In particular, what is needed is a way to express that two URIs are
acknowledged to have a somewhat similar conceptual or "real world" (for
those believing in such a thing) referent, but represent aspects or
views of this referent different enough that their descriptions may be
logically inconsistent.

Two examples come to mind, among many.

1. Berlin as populated place and/or administrative entity. Geonames
defines those as distinct entities with distinct URIs and descriptions
[1] [2], whereas DBpedia has only one Berlin entity and URI [3],
consistent with the implicit semantics expressed in the Wikipedia article:
*"Berlin is the capital city and one of sixteen states of Germany."*

2. A question was raised a few days ago on the SKOS forum about
platypus-as-SKOS-concept vs platypus-as-DBpedia-entity [4].
Using owl:sameAs in such a case is definitely bad practice, in the
sense that merging will bring about inconsistent descriptions.

So we definitely need this "s" property when a:foo and b:bar identify
"similar things".
a:foo     s    b:bar

Meaning: a:foo and b:bar identify things which can be considered
identical at a certain level of semantic granularity. In other words,
they share some property-value pairs which can be used for
identification in some contexts. But merging all available descriptions
of a:foo and b:bar is likely to lead to inconsistent descriptions.
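
To make the intended meaning of such an "s" link a little more concrete,
here is a minimal Python sketch. It is only an illustration under stated
assumptions: the property names (ex:name, ex:country, ex:kind), their
values, the choice of ex:kind as a single-valued property, and all function
names are hypothetical, loosely modelled on the Berlin example, and are not
taken from Geonames, DBpedia or any existing vocabulary.

```python
# Illustrative sketch only: property names, values and helper names are
# hypothetical, loosely modelled on the Berlin example in this message.
from typing import Dict, Iterable, Set, Tuple

# A description maps a property name to the set of values asserted for it.
Description = Dict[str, Set[str]]

foo: Description = {  # stands for a:foo
    "ex:name": {"Berlin"},
    "ex:country": {"Germany"},
    "ex:kind": {"populated place"},
}
bar: Description = {  # stands for b:bar
    "ex:name": {"Berlin"},
    "ex:country": {"Germany"},
    "ex:kind": {"administrative entity"},
}

# Assumption: these properties may hold at most one value per resource.
SINGLE_VALUED = {"ex:kind"}


def shared_pairs(a: Description, b: Description) -> Set[Tuple[str, str]]:
    """Property-value pairs asserted in both descriptions."""
    return {(p, v) for p in a.keys() & b.keys() for v in a[p] & b[p]}


def merge(a: Description, b: Description) -> Description:
    """Naive owl:sameAs-style merge: union of all values per property."""
    merged = {p: set(vs) for p, vs in a.items()}
    for p, vs in b.items():
        merged.setdefault(p, set()).update(vs)
    return merged


def conflicts(desc: Description) -> Set[str]:
    """Single-valued properties that ended up with more than one value."""
    return {p for p in SINGLE_VALUED if len(desc.get(p, set())) > 1}


def similar_in_context(a: Description, b: Description,
                       identifying_props: Iterable[str]) -> bool:
    """True if both descriptions agree on every identifying property."""
    return all(p in a and p in b and a[p] == b[p] for p in identifying_props)


if __name__ == "__main__":
    print(shared_pairs(foo, bar))
    print(conflicts(merge(foo, bar)))
    print(similar_in_context(foo, bar, ["ex:name", "ex:country"]))
    print(similar_in_context(foo, bar, ["ex:kind"]))
```

In this toy setting the two descriptions agree on name and country, so
they can be treated as "similar" in a context that identifies things by
those two properties, while a full merge leaves ex:kind with two
incompatible values.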

Bottom line: co-reference is a question of granularity. If you look
closely, two different URIs almost never have exactly the same
referent, but if we allow a certain fuzziness in the referent
definition, the information surrounding them can usefully be brought
together, whatever the mechanism.

Bernard


[1] http://sws.geonames.org/2950159/
[2] http://sws.geonames.org/2950157/
[3] http://dbpedia.org/page/Berlin
[4] http://lists.w3.org/Archives/Public/public-esw-thes/2008May/0010.html


Richard Cyganiak wrote:
>
> Aldo,
>
> Please keep your facts straight.
>
> On 14 May 2008, at 22:24, Aldo Gangemi wrote:
>> owl:sameAs is great to co-reference persons, places, etc. It is buggy 
>> when used to relate e.g. foaf:Person
>> instances to persons' homepages,
>
> I would like to point out that I haven't come across any instance 
> where this has been done or encouraged.
>
>> or a city as from Cyc to a wikipedia article of that city (as done in 
>> DBpedia).
>
> DBpedia doesn't contain any owl:sameAs statements between Cyc 
> resources and Wikipedia articles.
>
> [snip]
>> It is reasonable, as Richard Cyganiak wrote at the time, that we have 
>> to work around the quirks [2], nonetheless, if there is no real need, 
>> why should we work around the quirks caused by a pointless identity 
>> assumption?
>
> I feel misquoted. In the original discussion [1], I encouraged the use 
> of owl:sameAs between three different definitions (Geonames, GEMET and 
> DBpedia) of the concept of a “canal”. I did *not* advocate to gloss 
> over the difference between a thing and a document about that thing, 
> as you imply by your examples above. To the contrary, I have insisted 
> on this difference many times, e.g. in [2].
>
> At the end of the day, we have to keep in mind that we are talking 
> about the Web. Statements will be subjective, inconsistent and wrong. 
> This also applies to owl:sameAs statements. They are claims, not 
> facts. Deal with it.
>
> Best,
> Richard
>
> [1] 
> http://simile.mit.edu/mail/ReadMsg?listName=Linking%20Open%20Data&msgId=14215 
>
> [2] http://www.w3.org/TR/cooluris/
>
>> Notice that ignoring owl:sameAs is not a good solution. We need some 
>> trade-off between simplicity
>> and formality. A basic similarity relation is perfect, and then those 
>> triples can be worked out automatically,
>> by means of appropriate metamodels, e.g. as proposed in [3].
>>
>> Aldo
>>
>> [1] Bernard Vatant suggested some good practice of mutual linking:
>> http://universimmedia.blogspot.com/2007/07/using-owlsameas-in-linked-data.html 
>>
>>
>> [2] Cyganiak quote:
>>> People who want to re-use your data will learn to work around its 
>>> quirks and idiosyncrasies.
>>> Dealing with the quirks is a part of re-using data, it always was, 
>>> and it always will be.
>>>
>>
>> [3] http://www.ibiblio.org/hhalpin/irw2006/vpresutti.pdf from IRW 
>> workshop: http://www.ibiblio.org/hhalpin/irw2006/
>>
>>
>> _________________________________
>>
>> Aldo Gangemi
>>
>> Senior Researcher
>> Laboratory for Applied Ontology
>> Institute for Cognitive Sciences and Technology
>> National Research Council (ISTC-CNR)
>> Via Nomentana 56, 00161, Roma, Italy
>> Tel: +390644161535
>> Fax: +390644161513
>> aldo.gangemi@cnr.it
>>
>> http://www.loa-cnr.it/gangemi.html
>>
>> icq# 108370336
>>
>> skype aldogangemi

-- 

Bernard Vatant
Knowledge Engineering
----------------------------------------------------
Mondeca
3, cité Nollez 75018 Paris France
Web:    www.mondeca.com <http://www.mondeca.com>
----------------------------------------------------
Tel:       +33 (0) 971 488 459
Mail:     bernard.vatant@mondeca.com <mailto:bernard.vatant@mondeca.com>
Blog:    Leçons de Choses <http://mondeca.wordpress.com/>
Received on Thursday, 15 May 2008 12:38:43 GMT