Re: overstock.com adds GoodRelations in RDFa to 900,000 item pages

From: Michael F Uschold <uschold@gmail.com>
Date: Thu, 7 Oct 2010 10:12:38 -0700
Message-ID: <AANLkTimF9pe_tjPmsj10hrdS4POtkw38+U_Eta+s+kPP@mail.gmail.com>
To: Paul Houle <ontology2@gmail.com>
Cc: Martin Hepp <martin.hepp@ebusiness-unibw.org>, Karl Dubost <karl+w3c@la-grange.net>, public-lod@w3.org, semantic-web@w3.org
These things that bug you do so with good reason.  I often call it semantic
infidelity. For an in-depth discussion of a closely related issue, see
Overloading OWL sameAs
<http://ontologydesignpatterns.org/wiki/Community:Overloading_OWL_sameAs>.
A summary is given below.

Michael

*Issue: *owl:sameAs is being used in the linked data community in a way that
is inconsistent with its semantics.

*Source*: Numerous; this issue has been discussed over and over on various
lists. The summary so far is mainly based on a discussion that was
originally about the proliferation of URIs and managing co-reference, and
that evolved into a discussion about owl:sameAs *per se*.

   - W3C Semantic Web List <http://lists.w3.org/Archives/Public/semantic-web/>:
   Managing Co-reference (Was: A Semantic Elephant?)
   <http://lists.w3.org/Archives/Public/semantic-web/2008May/0126.html>,
   May 2008
   - W3C Semantic Web List <http://lists.w3.org/Archives/Public/semantic-web/>:
   ISBNs, owl:sameAs, etc
   <http://lists.w3.org/Archives/Public/semantic-web/>, December 2009

*Related Discussions: *

   - [linking open data] URI aliases and owl:sameAs was: Terminology Question
   <http://simile.mit.edu/mail/ReadMsg?listName=Linking%20Open%20Data&msgId=19328>
   - W3C public-lod sameAs proliferation (was Visualizing LOD Linkage)
   <http://www.mail-archive.com/public-lod@w3.org/msg00663.html>, August 2008
   - W3C public-lod owl:sameAs links from OpenCyc to WordNet
   <http://lists.w3.org/Archives/Public/public-lod/2009Feb/0186.html>,
   February 2009
   - W3C semantic-web-lifesci owl:sameAs and identity [was Re: blog:
   semantic dissonance in uniprot]
   <http://lists.w3.org/Archives/Public/public-semweb-lifesci/2009Mar/0169.html>,
   March 2009
   - [tbc-users] counting and owl:sameAs
   <http://www.mail-archive.com/topbraid-composer-users@googlegroups.com/msg00994.html>,
   April 2009
   - W3C public-lod how do I report bad sameAs links? (dbpedia <-> Cyc)
   <http://lists.w3.org/Archives/Public/public-lod/2009Jun/0443.html>,
   June 2009
   - W3C public-lod sameas.org
   <http://lists.w3.org/Archives/Public/public-lod/2009Jun/0038.html>,
   June 2009
   - W3C public-lod A "sameas" widget for Firefox
   <http://www.mail-archive.com/public-lod@w3.org/msg02554.html>, June 2009
   - W3C public-lod owl:sameAs [recipe]
   <http://lists.w3.org/Archives/Public/public-lod/2009Jul/0306.html>,
   July 2009
   - W3C public-lod SKOS, owl:sameAs and DBpedia
   <http://lists.w3.org/Archives/Public/public-lod/2010Mar/0215.html>,
   March 2010

*Related Modeling Issues*:

   - Versioning and URIs
   <http://ontologydesignpatterns.org/wiki/Community:Versioning_and_URIs>
   - Proliferation of URIs, Managing Coreference
   <http://ontologydesignpatterns.org/wiki/Community:Proliferation_of_URIs%2C_Managing_Coreference>

*Examples:*

   - relating a foaf:Person instance to the person's home page.
   - relating a geographical region with a political entity. For example,
   the physical area that a city occupies with the city itself.
   - relating the DBpedia resource referring to a place to a GeoNames
   resource corresponding to that same place

*Conclusions:*

There is a lot of confusion about how owl:sameAs should be used in the
linked open data community. It is being used in ways that are semantically
incorrect and can give incorrect inferences. A number of points and
suggestions came up.

   1. There is a frequent tendency to use sameAs to link resources that
   provide information about something to resources that represent the thing.
   E.g. relating a resource denoting a book to a resource that is the Amazon
   page for the book.
   2. There is a tradeoff between formal accuracy on the one hand and
   pragmatic usefulness on the other hand. It often arises that treating things
   as the same has the desired behavior. Rather than being harmful, the
   vagueness can be an advantage.
   3. It was proposed that a weaker similarity relationship be created to be
   used instead of sameAs when there is not true identity between the two
   resources. Some argued that there already are alternatives, e.g.
   skos:related and rdfs:seeAlso.
   4. Arguments were given pro and con as to whether the new relationship
   should have a formal semantics. One proposal creates a mechanism that
   removes it from the logic entirely. See: Managing URI Synonymity to Enable
   Consistent Reference on the Semantic Web
   <http://eprints.ecs.soton.ac.uk/15614/1/camera-ready.pdf>.
   If the formal semantics is important, should the similarity relation
      1. be a relation in the logical vocabulary of OWL, as sameAs is? -or-
      2. be just a relation in an ontology?
   5. Having too many ways to specify similarity might be confusing and
   hinder uptake of the technology.
   6. A suggestion was made to have owl:sameAs links made in separate files
   so that they can easily be excluded.
   7. A suggestion was made that there be specific guidelines and practices
   between owners of data in how they reach agreement on what should be linked.
   See: Bernard Vatant's suggested good practice of mutual linking
   <http://blog.hubjects.com/2007/07/using-owlsameas-in-linked-data.html>.
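
To make suggestion 6 concrete, here is a minimal sketch of what "easily
excluded" could look like on the consumer side. It assumes triples are held
as plain (subject, predicate, object) tuples of prefixed-name strings; the
function name split_links and the ex: resources are hypothetical, chosen
only for illustration, and no particular RDF library is implied.

```python
SAME_AS = "owl:sameAs"


def split_links(triples):
    """Separate owl:sameAs links from all other statements.

    Returns (core, links) so that a publisher could write each list to its
    own file, or a consumer could load only the core statements.
    """
    core = [t for t in triples if t[1] != SAME_AS]
    links = [t for t in triples if t[1] == SAME_AS]
    return core, links


# Hypothetical data: a book, and a web page about that book.
data = [
    ("ex:book42", "dc:title", "Some Book"),
    ("ex:book42", "owl:sameAs", "ex:amazonPage42"),
    ("ex:amazonPage42", "rdf:type", "foaf:Document"),
]

core, links = split_links(data)
```

A consumer that distrusts the identity claims would reason over core alone
and treat links as hints to follow rather than as statements of identity.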



On Thu, Oct 7, 2010 at 8:56 AM, Paul Houle <ontology2@gmail.com> wrote:

> On Wed, Oct 6, 2010 at 5:09 PM, Martin Hepp <
> martin.hepp@ebusiness-unibw.org> wrote:
>     I've got mixed feelings about "snippets" vs "fully embedded RDFa".  For
> the most part I think systems that use snippets will be more maintainable,
> but I've seen cases where fully embedded RDFa fits very well into a system
> and there may be cases where the size of the HTML can be reduced by using it
> -- and HTML size is a big deal in the real world where loading time matters
> and we're increasingly targeting mobile devices.
>
>     The RDFa issue that really bugs me is that a linked data URI can be
> read to signify a number of different things.  Consider,  for instance,
>
> http://dbpedia.org/resource/Rainbow_Bridge_(Tokyo)
>
> (i) This is a string.  It has a length.  It uses a particular subset of
> available characters
> (ii) This is a URI.  It has a scheme,  it has a host,  path,  might have a
> # in it,  query strings,  all that;  a number of assertions can be made
> about it as a URI
> (iii) This is a document.  We can assert the "content-type" of this
> document (or at least one version we've negotiated),  we can assert its
> charset,  length in bytes,  length in characters,  particular subset of
> available characters used,  number of triples asserted directly in the
> document,  the number of triples we can infer by applying certain rules to
> this in connection with a certain knowledgebase,  and on and on
> (iv) This is about a wikipedia article (some wikipedia articles don't map
> cleanly to a named entity)
> (v) This is about a named entity
>
> The more I think about it,  the more it bugs me,  and it's all the worse
> when you're using RDFa and you've got HTML documents.
>
> For instance,  you could clearly see
>
> http://ookaboo.com/o/pictures/topic/28999/Beijing
>
> as a signifier for a city.  Some people would make the assertion that
>
> dbpedia:Beijing owl:sameAs ookaboo:topic/28999/Beijing.
>
> and that's not entirely stupid.  On the other hand,  it's definitely true
> that
>
> ookaboo:topic/28999/Beijing is sioc:ImageGallery.
>
> Put something true together with a practice that's common and you get the
> absurd result that
>
> dbpedia:Beijing is sioc:ImageGallery.
>
>
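
The Beijing example quoted above can be reproduced mechanically. The sketch
below is a deliberately simplified reading of owl:sameAs (resources declared
the same share every statement made about either of them as a subject), not
a full OWL reasoner. It assumes triples are plain tuples of prefixed-name
strings, writes "is sioc:ImageGallery" as an rdf:type statement, and uses a
hypothetical function name, infer_with_sameas.

```python
SAME_AS = "owl:sameAs"


def infer_with_sameas(triples):
    """Copy each statement to every subject linked by owl:sameAs.

    Only subjects are substituted here; object substitution is omitted to
    keep the sketch short.
    """
    parent = {}  # union-find forest over resource names

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Merge the two sides of every owl:sameAs link into one class.
    for s, p, o in triples:
        if p == SAME_AS:
            parent[find(s)] = find(o)

    # Collect the members of each equivalence class.
    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)

    inferred = set(triples)
    for s, p, o in triples:
        if p == SAME_AS:
            continue
        for alias in groups.get(find(s), {s}):
            inferred.add((alias, p, o))
    return inferred


facts = [
    ("dbpedia:Beijing", "owl:sameAs", "ookaboo:topic/28999/Beijing"),
    ("ookaboo:topic/28999/Beijing", "rdf:type", "sioc:ImageGallery"),
]

for triple in sorted(infer_with_sameas(facts)):
    print(triple)
```

Among the printed statements is the city typed as an image gallery, which is
exactly the absurd result: the reasoning step is valid, so the fault lies
with the sameAs link itself.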



-- 
Michael Uschold, PhD
   LinkedIn: http://tr.im/limfu
   Skype: UscholdM
Received on Thursday, 7 October 2010 17:13:12 UTC
