Re: Managing Co-reference (Was: A Semantic Elephant?)

Michael,
Many thanks for asking the question.
It is very exciting to see this discussion so active.
I have been trying to get to the front of the messages to say something, but they just keep coming in!
To answer your email:
Yes, we have an infrastructure (the Consistent Reference Service, CRS) with which we have been trying to manage co-reference between a bunch of independent SW sites to allow applications to do what they need. It has gone through quite a few revisions over the last four years or so.


On 15/05/2008 00:25, "Michael F Uschold" <uschold@gmail.com> wrote:

Aldo notes the problems with using owl:sameAs to mean similarity. Such uses are often incorrect, and Aldo suggests using something like rdfs:seeAlso or skos:related instead. Unfortunately, these relations are too weak.

There is an interesting proposal for managing URI synonyms that attempts to find a middle ground, weaker than owl:sameAs but much stronger than rdfs:seeAlso or skos:related. They suggest an infrastructural approach [apparently] outside the logic for managing URI synonyms. It is quite a clever approach, but still has some challenges. Here are portions of a note I just sent the authors of a paper, which relates to this question.

Afraz, Hugh and Ian:

I just read your workshop paper:
Managing URI Synonymity to Enable Consistent Reference on the Semantic Web <http://eprints.ecs.soton.ac.uk/15614/1/camera-ready.pdf>


 1.  I wholeheartedly agree that owl:sameAs is too strong in many cases. A weaker relation is needed. However, you don't offer a weaker relation and give it semantics. Instead, you do a kind of sleight of hand and remove it from the logic. Without a semantics, what is a system developer to do with the fact that two URIs are in the same bundle? What are the inferential implications?
 2.  Example: IMHO it is a bad idea to say that Spain the political entity is the same as Spain the geopolitical region. This ontological distinction has been clearly documented in DOLCE, for example. They are different, and should have different URIs. Conflating them will cause problems. Of course, making this and many other ontologically 'sound' distinctions can cause its own problems, by adding complexity -- a tradeoff. Without any semantics of inCRS_Bundle, there is no way to tell if it is semantically correct.
 3.  Do you have any idea of the scalability of this approach?
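
To make the question about inferential implications concrete, here is a minimal Python sketch of the two readings of "these URIs are in the same bundle". It is not the CRS implementation or its API: the URIs, the ex: predicates, and the function names are all invented for illustration. One reading treats bundle membership like owl:sameAs and copies every statement across the bundle; the other leaves the data untouched and lets an application decide, per lookup, whether to follow the bundle.

```python
# Hypothetical URIs and predicates, invented for this sketch only.
SPAIN_STATE = "http://example.org/a/Spain-political-entity"
SPAIN_REGION = "http://example.org/b/Spain-region"

triples = {
    (SPAIN_STATE, "ex:memberOf", "ex:EU"),
    (SPAIN_REGION, "ex:borders", "ex:France"),
}

# A bundle here is nothing more than a set of URIs someone judged co-referent.
bundles = [{SPAIN_STATE, SPAIN_REGION}]


def bundle_of(uri, bundles):
    """Return the bundle containing uri, or a singleton set if there is none."""
    for bundle in bundles:
        if uri in bundle:
            return set(bundle)
    return {uri}


def same_as_closure(triples, bundles):
    """Identity reading: every statement about one member holds for all members."""
    inferred = set(triples)
    for s, p, o in triples:
        for s2 in bundle_of(s, bundles):
            for o2 in bundle_of(o, bundles):
                inferred.add((s2, p, o2))
    return inferred


def describe(uri, triples, bundles, follow_bundle=False):
    """Lookup reading: optionally gather statements about bundle-mates,
    without asserting that the URIs denote the same thing."""
    subjects = bundle_of(uri, bundles) if follow_bundle else {uri}
    return {t for t in triples if t[0] in subjects}


if __name__ == "__main__":
    print(len(same_as_closure(triples, bundles)))
    print(describe(SPAIN_STATE, triples, bundles))
    print(describe(SPAIN_STATE, triples, bundles, follow_bundle=True))
```

Under the identity reading, the region is inferred to be an EU member and the political entity to border France. Under the lookup reading nothing new is asserted, which is exactly why the question of what a developer may conclude from bundle membership remains open.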

Michael



On Wed, May 14, 2008 at 2:24 PM, Aldo Gangemi <aldo.gangemi@cnr.it> wrote:
       * Problem 2) even if you can find the links, prolific use of owl:sameAs will create computational problems.



Michael,

there is an item related to Problem 2), already discussed on LOD and elsewhere last year, i.e. the use of
owl:sameAs, which is a formal relation of identity, to denote generic "similarity", or even "relatedness"
between two entities.

owl:sameAs is great for co-referencing persons, places, etc. It is buggy when used to relate e.g. foaf:Person
instances to persons' homepages, or a city from Cyc to the Wikipedia article about that city (as done in DBpedia).

In previous discussions, besides some weak good practices [1], I found no attempt to discourage its use for similarity.
This use is not needed. We can use e.g. rdfs:seeAlso, skos:related, or any other local relation instead.

It is reasonable, as Richard Cyganiak wrote at the time, that we have to work around the quirks [2].
Nonetheless, if there is no real need, why should we work around the quirks caused by a pointless identity
assumption?

Notice that ignoring owl:sameAs is not a good solution. We need some trade-off between simplicity
and formality. A basic similarity relation is perfect, and then those triples can be worked out automatically,
by means of appropriate metamodels, e.g. as proposed in [3].
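
As a rough illustration of the kind of automatic processing meant here, the following Python sketch flags owl:sameAs links whose two ends have types that cannot share an instance, and weakens them to rdfs:seeAlso. This is not the metamodel approach of [3]; it is a much simpler stand-in. The data, the function names, and the disjointness of foaf:Person and foaf:Document are assumptions made only for this example.

```python
OWL_SAME_AS = "owl:sameAs"
RDFS_SEE_ALSO = "rdfs:seeAlso"
RDF_TYPE = "rdf:type"

# Hypothetical data: one sound identity link and one person-to-homepage link.
triples = [
    ("ex:alice", RDF_TYPE, "foaf:Person"),
    ("other:alice", RDF_TYPE, "foaf:Person"),
    ("ex:alice-homepage", RDF_TYPE, "foaf:Document"),
    ("ex:alice", OWL_SAME_AS, "other:alice"),
    ("ex:alice", OWL_SAME_AS, "ex:alice-homepage"),
]

# Assumption for this sketch: these class pairs cannot share an instance.
DISJOINT = {frozenset({"foaf:Person", "foaf:Document"})}


def types_of(node, triples):
    """Collect the declared rdf:type values of a node."""
    return {o for s, p, o in triples if s == node and p == RDF_TYPE}


def is_suspect(s, o, triples):
    """True if some type of s and some type of o are declared disjoint."""
    return any(
        frozenset({a, b}) in DISJOINT
        for a in types_of(s, triples)
        for b in types_of(o, triples)
    )


def weaken_suspect_links(triples):
    """Replace suspect owl:sameAs links with rdfs:seeAlso; keep everything else."""
    result = []
    for s, p, o in triples:
        if p == OWL_SAME_AS and is_suspect(s, o, triples):
            result.append((s, RDFS_SEE_ALSO, o))
        else:
            result.append((s, p, o))
    return result


if __name__ == "__main__":
    for triple in weaken_suspect_links(triples):
        print(triple)
```

The link between the two foaf:Person URIs is kept as owl:sameAs, while the person-to-homepage link comes out as rdfs:seeAlso. Links between untyped resources pass through unchanged, so a check like this only catches misuse that the type information exposes.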

Aldo

[1] Bernard Vatant suggested some good practice of mutual linking:
http://universimmedia.blogspot.com/2007/07/using-owlsameas-in-linked-data.html

[2] Cyganiak quote:
People who want to re-use your data will learn to work around its quirks and idiosyncrasies.
Dealing with the quirks is a part of re-using data, it always was, and it always will be.


[3] http://www.ibiblio.org/hhalpin/irw2006/vpresutti.pdf from IRW workshop: http://www.ibiblio.org/hhalpin/irw2006/


_________________________________

Aldo Gangemi

Senior Researcher
Laboratory for Applied Ontology
Institute for Cognitive Sciences and Technology
National Research Council (ISTC-CNR)
Via Nomentana 56, 00161, Roma, Italy
Tel: +390644161535
Fax: +390644161513
aldo.gangemi@cnr.it

http://www.loa-cnr.it/gangemi.html

icq# 108370336

skype aldogangemi

Received on Thursday, 15 May 2008 19:03:38 UTC