
Re: Managing Co-reference (Was: A Semantic Elephant?)

From: Aldo Gangemi <aldo.gangemi@istc.cnr.it>
Date: Fri, 16 May 2008 06:30:59 +0200
Cc: Aldo Gangemi <aldo.gangemi@cnr.it>, "Hugh Glaser" <hg@ecs.soton.ac.uk>, "Tim Berners-Lee" <timbl@w3.org>, "Sören Auer" <auer@informatik.uni-leipzig.de>, "Semantic Web Interest Group" <semantic-web@w3.org>, "Chris Bizer" <chris@bizer.de>, "Frank van Harmelen" <Frank.van.Harmelen@cs.vu.nl>, "Kingsley Idehen" <kidehen@openlinksw.com>, "Fabian M. Suchanek" <f.m.suchanek@gmail.com>, "Tim Berners-Lee" <timbl@csail.mit.edu>, "Jim Hendler" <hendler@cs.rpi.edu>, "Mark Greaves" <markg@vulcan.com>, "georgi.kobilarov@gmx.de" <georgi.kobilarov@gmx.de>, "Jens Lehmann" <lehmann@informatik.uni-leipzig.de>, "Richard Cyganiak" <richard@cyganiak.de>, "Frederick Giasson" <fred@fgiasson.com>, "Michael Bergman" <mike@mkbergman.com>, "Conor Shankey" <cshankey@reinvent.com>, "Kira Oujonkova" <koujonkova@reinvent.com>, "a.o.jaffri@ecs.soton.ac.uk" <a.o.jaffri@ecs.soton.ac.uk>, "icm@ecs.soton.ac.uk" <icm@ecs.soton.ac.uk>
Message-Id: <E9E14754-2DF6-4938-BFE1-46F4EA205803@cnr.it>
To: Michael F Uschold <uschold@gmail.com>
A contribution to the recap. A very useful discussion indeed.

Issue 1: conveying the rationale of owl:sameAs appropriately, i.e. in
a way that is harmless for future uses (Aldo, Michael)
Issue 2: distinguishing "data provision" vs. "representational" usages  
of owl:sameAs (Yves)
Issue 3: need for another operator, e.g. representing equality under a  
closed set of properties (Geoff, Harry), or some relaxed rdfs:sameAs  
  Issue 3a: using another existing relation, such as skos:related or  
rdfs:seeAlso, but these are either too weak (rdfs:seeAlso), or  
constrained (skos:related)
Issue 4: need for a semiotic grasp over co-reference, maybe outside  
formal semantics (Bernard, Peter)

On issue (1), it seems that either most people agree, or they tend to
prefer a discussion on issue (2), i.e. that when data provision is the
intended usage, the referential vagueness introduced by owl:sameAs is
in many cases not harmful, but an advantage. As Hugh puts it: "we
consider coreference as more knowledge about things, which can be
represented in the SW, and can be used by applications if and when
they see fit. And as someone said, there is no truth, only opinions.
So we need an infrastructure for opinions, but that is the SW."
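
To make the "if and when they see fit" point concrete, here is a minimal sketch in Python of co-reference kept as opinions outside the logic, with owl:sameAs produced only when an application asks for it. It is not the actual CRS implementation or its API: the class name, method names and example URIs are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


class CoreferenceStore:
    """Hypothetical store of co-reference bundles, kept apart from the data graph."""

    def __init__(self):
        self._bundle_of = {}               # URI -> bundle id
        self._members = defaultdict(set)   # bundle id -> set of URIs
        self._next_id = 0

    def assert_coreferent(self, a, b):
        """Record the opinion that a and b co-refer, merging bundles if needed."""
        id_a = self._bundle_of.get(a)
        id_b = self._bundle_of.get(b)
        if id_a is None and id_b is None:
            id_a = self._next_id
            self._next_id += 1
        elif id_a is None:
            id_a = id_b
        elif id_b is not None and id_b != id_a:
            # Both URIs already sit in different bundles: fold b's bundle into a's.
            for uri in self._members.pop(id_b):
                self._bundle_of[uri] = id_a
                self._members[id_a].add(uri)
        for uri in (a, b):
            self._bundle_of[uri] = id_a
            self._members[id_a].add(uri)

    def bundle(self, uri):
        """Return all URIs bundled with uri (just uri itself if it is unknown)."""
        bundle_id = self._bundle_of.get(uri)
        return set(self._members[bundle_id]) if bundle_id is not None else {uri}

    def as_ntriples(self, uri):
        """Opt-in export: only here does the bundle become owl:sameAs statements."""
        members = sorted(self.bundle(uri))
        canonical = members[0]  # arbitrary but deterministic choice
        return [f"<{canonical}> <{OWL_SAME_AS}> <{m}> ." for m in members[1:]]
```

An application that never calls as_ntriples() gets no identity inferences at all, so the decision to reason with the bundle stays with the consumer. What the bundle means before that export is exactly the open question raised further down in this thread.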

But at this point, others (including me) switch to issue (3) and say
that, if this is the case, it would be better to choose or define a
different operator that ensures a safe semantics, instead of relying
on an actual identity operator like owl:sameAs.
Finally, some nasty :) semioticians subtly suggest that we need some  
way out of formal semantics, in order to represent a kind of  
"similarity semantics". As Peter puts it: "Everyone talks about  
meaning without saying what it is they are trying to achieve by  
agreeing on a formal meaning".
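
The difference between an identity operator and the "equality under a closed set of properties" of issue (3) can be sketched roughly as follows. This is only an illustration under stated assumptions: triples are plain Python tuples, the ex: names and values are made-up toy data, and merge_as_identical covers only substitution in subject position, which is a simplification of what owl:sameAs licenses.

```python
def equal_under(graph, a, b, properties):
    """True if a and b have the same set of values for every listed property.

    A property that neither resource has counts as agreeing (both sets empty).
    """
    def values(subject, prop):
        return {o for (s, p, o) in graph if s == subject and p == prop}

    return all(values(a, p) == values(b, p) for p in properties)


def merge_as_identical(graph, a, b):
    """What treating a and b as identical licenses: statements about one
    are copied to the other (subject position only in this sketch)."""
    merged = set(graph)
    for (s, p, o) in graph:
        if s == a:
            merged.add((b, p, o))
        if s == b:
            merged.add((a, p, o))
    return merged


# Toy data (hypothetical): a city and an article about that city.
graph = {
    ("ex:city", "ex:name", "Rome"),
    ("ex:article", "ex:name", "Rome"),
    ("ex:city", "ex:locatedIn", "ex:Italy"),
    ("ex:article", "ex:lastEdited", "2008-05-16"),
}

# The two agree on the closed set {ex:name} and nothing more is concluded.
agree_on_name = equal_under(graph, "ex:city", "ex:article", ["ex:name"])

# Identity instead makes the city "last edited" and the article "located in Italy".
merged = merge_as_identical(graph, "ex:city", "ex:article")
```

The closed-set check never adds triples, which is what makes it safe; the identity merge is where the useless wrong inferences come from.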

My position, which I had when proposing the thread, is that we need to  
use things as efficiently as possible, without creating areas for  
useless wrong inferences.
Another operator would be perfect, but (as Michael, Geoff, Harry  
require) it should provide the user with some serious features,  
comparable to what owl:sameAs does. Therefore, we need to talk about  
similarity, equality, etc. at the metalevel, just as owl:sameAs does,  
since it is a relation in the logical vocabulary of OWL, not a  
relation from a specific ontology.
Most problems of co-referencing and identity seem to arise from the  
collapsing of the distinction between entities and information that  
denotes those entities (as noticed by Bernard, Harry, Aldo), be it  
dependent on some "meaning" or not. The metalevel we need to address  
is therefore the semiotic one, as correctly pointed out in this discussion.

My proposal is that such metalevel is not necessarily "outside formal  
semantics", and some work from my group and elsewhere is proceeding in  
the direction of a reconciliation between a semiotic, social meaning,  
and the formal encoding of meaning.


On 16 May 2008, at 02:02, Michael F Uschold wrote:

> Hugh Glaser said:
> At the bottom you will see that an agent can import this as ntriples
> owl:sameAs - we have nothing against owl:sameAs, if that is what the  
> agent
> wants to do, but the inference decision can be up to them.
> This idea has a lot of merit, IMHO. It allows people a safe way to  
> say that URIs are closely related, without going so far as making them
> logically equivalent.  And you offer a convenient way to convert the  
> co-references to sameAs.
> This could make life a lot easier for folk who wish to load in a  
> variety of datasets and do reasoning on them.
> Pity I won't make it to the upcoming workshop.
> Michael
> On Thu, May 15, 2008 at 11:59 AM, Hugh Glaser <hg@ecs.soton.ac.uk>  
> wrote:
> Michael,
> Many thanks for asking the question.
> It is very exciting to see this discussion so active.
> I have been trying to get to the front of the messages to say  
> something, but they just keep coming in!
> To answer your email:
> Yes, we have an infrastructure (the Consistent Reference Service,  
> CRS) with which we have been trying to manage co-reference between a  
> bunch of independent SW sites to allow applications to do what they  
> need. It has gone through quite a few revisions over the last four  
> years or so.
> On 15/05/2008 00:25, "Michael F Uschold" <uschold@gmail.com> wrote:
> Aldo notes the problems with using owl:sameAs to mean similarity.  
> Such uses are often incorrect, and Aldo suggests using something  
> like rdfs:seeAlso, skos:related, instead. These relations are too  
> weak, unfortunately.
> There is an interesting proposal for managing URI synonyms that
> attempts to have a middle ground, weaker than owl:sameAs, but much  
> stronger than rdfs:seeAlso or skos:related.   They suggest an  
> infrastructural approach [apparently] outside the logic for managing  
> URI synonyms. It is a quite clever approach, but still has some  
> challenges.  Here are portions of a note I just sent the authors of  
> a paper, which relates to this question.
> Afraz, Hugh and Ian:
> I just read your workshop paper:
> Managing URI Synonymity to Enable Consistent Reference on the
> Semantic Web <http://eprints.ecs.soton.ac.uk/15614/1/camera-ready.pdf>
>  1.  I wholeheartedly agree that owl:sameAs is too strong in many
> cases. A weaker relation is needed. However, you don't offer a weaker
> relation and give it semantics. Instead, you do a kind of sleight of
> hand and remove it from the logic.  Without a semantics, what is a
> system developer to do with the fact that two URIs are in the same
> bundle?  What are the inferential implications?
>  2.  Example: IMHO it is a bad idea to say that Spain the political
> entity is the same as Spain the geopolitical region. This ontological
> distinction has been clearly documented in DOLCE, for example. They
> are different, and should have different URIs.  Conflating them will
> cause problems.  Of course, making this and many other ontologically
> 'sound' distinctions can cause its own problems, by adding
> complexity -- a tradeoff. Without any semantics of inCRS_Bundle,
> there is no way to tell if it is semantically correct.
>  3.  Do you have any idea of the scalability of this approach?
> Michael
> On Wed, May 14, 2008 at 2:24 PM, Aldo Gangemi <aldo.gangemi@cnr.it>  
> wrote:
>       * Problem 2) even if you can find the links, prolific use of  
> owl:sameAs will create computational problems.
> Michael,
> there is an item related to Problem 2), already discussed on LOD and  
> elsewhere last year, i.e. the use of
> owl:sameAs, which is a formal relation of identity, to denote  
> generic "similarity", or even "relatedness"
> between two entities.
> owl:sameAs is great to co-reference persons, places, etc. It is
> buggy when used to relate e.g. foaf:Person
> instances to persons' homepages, or a city as described in Cyc to a
> Wikipedia article about that city (as done in DBpedia).
> In previous discussions, besides some weak good practices [1], I  
> found no attempt to discourage its use for similarity.
> This use is not needed. We can use e.g. rdfs:seeAlso, skos:related,  
> or any other local relation instead.
> It is reasonable, as Richard Cyganiak wrote at the time, that we  
> have to work around the quirks [2],
> nonetheless, if there is no real need, why should we work around the  
> quirks caused by a pointless identity
> assumption?
> Notice that ignoring owl:sameAs is not a good solution. We need some  
> trade-off between simplicity
> and formality. A basic similarity relation is perfect, and then  
> those triples can be worked out automatically,
> by means of appropriate metamodels, e.g. as proposed in [3].
> Aldo
> [1] Bernard Vatant suggested some good practice of mutual linking:
> http://universimmedia.blogspot.com/2007/07/using-owlsameas-in-linked-data.html
> [2] Cyganiak quote:
> People who want to re-use your data will learn to work around its  
> quirks and idiosyncrasies.
> Dealing with the quirks is a part of re-using data, it always was,  
> and it always will be.
> [3] http://www.ibiblio.org/hhalpin/irw2006/vpresutti.pdf
> from IRW workshop: http://www.ibiblio.org/hhalpin/irw2006/
> _________________________________
> Aldo Gangemi
> Senior Researcher
> Laboratory for Applied Ontology
> Institute for Cognitive Sciences and Technology
> National Research Council (ISTC-CNR)
> Via Nomentana 56, 00161, Roma, Italy
> Tel: +390644161535
> Fax: +390644161513
> aldo.gangemi@cnr.it
> http://www.loa-cnr.it/gangemi.html
> icq# 108370336
> skype aldogangemi


Aldo Gangemi


Received on Friday, 16 May 2008 04:31:20 UTC
