- From: Michael F Uschold <uschold@gmail.com>
- Date: Thu, 7 Oct 2010 10:12:38 -0700
- To: Paul Houle <ontology2@gmail.com>
- Cc: Martin Hepp <martin.hepp@ebusiness-unibw.org>, Karl Dubost <karl+w3c@la-grange.net>, public-lod@w3.org, semantic-web@w3.org
- Message-ID: <AANLkTimF9pe_tjPmsj10hrdS4POtkw38+U_Eta+s+kPP@mail.gmail.com>
These things that bug you do so with good reason. I often call it semantic
infidelity. For an in-depth discussion of a closely related issue see:
Overloading OWL sameAs<http://ontologydesignpatterns.org/wiki/Community:Overloading_OWL_sameAs>

A summary is given below.

Michael

*Issue:* owl:sameAs is being used in the linked data community in a way that
is inconsistent with its semantics.

*Source*: Numerous; this issue has been discussed over and over on various
lists. The summary so far is mainly based on a discussion that was originally
about the proliferation of URIs and managing co-reference, and evolved into a
discussion about owl:sameAs *per se*.

   - W3C Semantic Web List<http://lists.w3.org/Archives/Public/semantic-web/>:
     Managing Co-reference (Was: A Semantic Elephant?)<http://lists.w3.org/Archives/Public/semantic-web/2008May/0126.html>
     May 2008
   - W3C Semantic Web List<http://lists.w3.org/Archives/Public/semantic-web/>:
     ISBNs, owl:sameAs, etc<http://lists.w3.org/Archives/Public/semantic-web/>
     December 2009

*Related Discussions:*

   - [linking open data] URI aliases and owl:sameAs was: Terminology
     Question<http://simile.mit.edu/mail/ReadMsg?listName=Linking%20Open%20Data&msgId=19328>
   - W3C public-lod sameAs proliferation (was Visualizing LOD Linkage)<http://www.mail-archive.com/public-lod@w3.org/msg00663.html>
     August 2008
   - W3C public-lod owl:sameAs links from OpenCyc to WordNet<http://lists.w3.org/Archives/Public/public-lod/2009Feb/0186.html>
     February 2009
   - W3C semantic-web-lifesci owl:sameAs and identity [was Re: blog: semantic
     dissonance in uniprot]<http://lists.w3.org/Archives/Public/public-semweb-lifesci/2009Mar/0169.html>
     March 2009
   - [tbc-users] counting and owl:sameAs<http://www.mail-archive.com/topbraid-composer-users@googlegroups.com/msg00994.html>
     April 2009
   - W3C public-lod how do I report bad sameAs links?
     (dbpedia <-> Cyc)<http://lists.w3.org/Archives/Public/public-lod/2009Jun/0443.html>
     June 2009
   - W3C public-lod sameas.org<http://lists.w3.org/Archives/Public/public-lod/2009Jun/0038.html>
     June 2009
   - W3C public-lod A "sameas" widget for Firefox<http://www.mail-archive.com/public-lod@w3.org/msg02554.html>
     June 2009
   - W3C public-lod owl:sameAs [recipe]<http://lists.w3.org/Archives/Public/public-lod/2009Jul/0306.html>
     July 2009
   - W3C public-lod SKOS, owl:sameAs and DBpedia<http://lists.w3.org/Archives/Public/public-lod/2010Mar/0215.html>
     March 2010

*Related Modeling Issues*:

   - Versioning and URIs<http://ontologydesignpatterns.org/wiki/Community:Versioning_and_URIs>
   - Proliferation of URIs, Managing Coreference<http://ontologydesignpatterns.org/wiki/Community:Proliferation_of_URIs%2C_Managing_Coreference>

*Examples:*

   - relating a foaf:Person instance to the person's home page.
   - relating a geographical region to a political entity. For example, the
     physical area that a city occupies to the city itself.
   - relating the DBpedia resource referring to a place to a GeoNames
     resource corresponding to that same place.

*Conclusions:*

There is a lot of confusion in the linked open data community about how
owl:sameAs should be used. It is being used in ways that are semantically
incorrect and can give incorrect inferences. A number of points and
suggestions came up.

   1. There is a frequent tendency to use sameAs to link resources that
      provide information about something to resources that represent the
      thing. E.g. relating a resource denoting a book to a resource that is
      the Amazon page for the book.
   2. There is a tradeoff between formal accuracy on the one hand and
      pragmatic usefulness on the other. It often turns out that treating
      things as the same gives the desired behavior. Rather than being
      harmful, the vagueness can be an advantage.
   3. It was proposed that a weaker similarity relationship be created to be
      used instead of sameAs when there is not true identity between the two
      resources. Some argued that there already are alternatives, e.g.
      skos:related and rdfs:seeAlso.
   4. Arguments were given pro and con as to whether the new relationship
      should have a formal semantics. One proposal creates a mechanism that
      removes it from the logic entirely. See: Managing URI Synonymity to
      Enable Consistent Reference on the Semantic Web<http://eprints.ecs.soton.ac.uk/15614/1/camera-ready.pdf>.
      If the formal semantics is important, should the similarity relation
         1. be a relation in the logical vocabulary of OWL, as sameAs is?
            -or-
         2. be just a relation in an ontology?
   5. Having too many ways to specify similarity might be confusing and
      hinder uptake of the technology.
   6. A suggestion was made to put owl:sameAs links in separate files so
      that they can easily be excluded.
   7. A suggestion was made that there be specific guidelines and practices
      for how owners of data reach agreement on what should be linked. See:
      Bernard Vatant suggested some good practice of mutual linking<http://blog.hubjects.com/2007/07/using-owlsameas-in-linked-data.html>

On Thu, Oct 7, 2010 at 8:56 AM, Paul Houle <ontology2@gmail.com> wrote:

> On Wed, Oct 6, 2010 at 5:09 PM, Martin Hepp <
> martin.hepp@ebusiness-unibw.org> wrote:
>
> I've got mixed feelings about "snippets" vs "fully embedded RDFa". For
> the most part I think systems that use snippets will be more maintainable,
> but I've seen cases where fully embedded RDFa fits very well into a system
> and there may be cases where the size of the HTML can be reduced by using it
> -- and HTML size is a big deal in the real world where loading time matters
> and we're increasingly targeting mobile devices.
>
> The RDFa issue that really bugs me is that a linked data URI can be
> read to signify a number of different things. Consider, for instance,
>
> http://dbpedia.org/resource/Rainbow_Bridge_(Tokyo)<http://dbpedia.org/resource/Rainbow_Bridge_%28Tokyo%29>
>
> (i) This is a string. It has a length. It uses a particular subset of
> available characters
> (ii) This is a URI. It has a scheme, it has a host, path, might have a
> # in it, query strings, all that; a number of assertions can be made
> about it as a URI
> (iii) This is a document. We can assert the "content-type" of this
> document (or at least one version we've negotiated), we can assert its
> charset, length in bytes, length in characters, particular subset of
> available characters used, number of triples asserted directly in the
> document, the number of triples we can infer by applying certain rules to
> this in connection with a certain knowledgebase, and on and on
> (iv) This is about a Wikipedia article (some Wikipedia articles don't map
> cleanly to a named entity)
> (v) This is about a named entity
>
> The more I think about it, the more it bugs me, and it's all the worse
> when you're using RDFa and you've got HTML documents.
>
> For instance, you could clearly see
>
> http://ookaboo.com/o/pictures/topic/28999/Beijing
>
> as a signifier for a city. Some people would make the assertion that
>
> dbpedia:Beijing owl:sameAs ookaboo:topic/28999/Beijing.
>
> and that's not entirely stupid. On the other hand, it's definitely true
> that
>
> ookaboo:topic/28999/Beijing is sioc:ImageGallery.
>
> Put something true together with a practice that's common and you get the
> absurd result that
>
> dbpedia:Beijing is sioc:ImageGallery.
>

--
Michael Uschold, PhD
LinkedIn: http://tr.im/limfu
Skype: UscholdM
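
The absurd inference in the quoted message can be reproduced with a small
sketch. It is only an illustration under stated assumptions: triples are plain
Python tuples of prefixed names, "is" in the message is written as rdf:type,
and same_as_closure is a hypothetical helper that applies just the symmetry
and substitution consequences of owl:sameAs, not a full OWL reasoner.

```python
SAME_AS = "owl:sameAs"
RDF_TYPE = "rdf:type"  # assumption: "is" in the message means rdf:type

# The two statements from the quoted example, as (subject, predicate, object).
SAME_AS_LINKS = {
    ("dbpedia:Beijing", SAME_AS, "ookaboo:topic/28999/Beijing"),
}
FACTS = {
    ("ookaboo:topic/28999/Beijing", RDF_TYPE, "sioc:ImageGallery"),
}


def same_as_closure(triples):
    """Return the triples plus what owl:sameAs identity lets us substitute.

    Hypothetical helper: for every (a owl:sameAs b) it adds the symmetric
    link and copies each statement about a over to b, repeating until
    nothing new appears.
    """
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            if p != SAME_AS:
                continue
            new.add((o, SAME_AS, s))  # sameAs is symmetric
            for s2, p2, o2 in inferred:
                if s2 == s:
                    new.add((o, p2, o2))  # substitute in subject position
                if o2 == s:
                    new.add((s2, p2, o))  # substitute in object position
        if new <= inferred:
            return inferred
        inferred |= new


bad = ("dbpedia:Beijing", RDF_TYPE, "sioc:ImageGallery")

# With the sameAs link loaded, the city is inferred to be an image gallery.
print(bad in same_as_closure(FACTS | SAME_AS_LINKS))  # prints True

# Leaving the link set out (as if it lived in a separate file) avoids it.
print(bad in same_as_closure(FACTS))  # prints False
```

Keeping SAME_AS_LINKS apart from FACTS mirrors suggestion 6 in the summary:
a consumer that does not trust the identity links can simply not load them.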
Received on Thursday, 7 October 2010 17:13:13 UTC