
Re: 15 Ways to Think About Data Quality (Just for a Start)

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Tue, 12 Apr 2011 10:54:31 -0400
Message-ID: <4DA467A7.2000709@openlinksw.com>
To: glenn mcdonald <glenn@furia.com>
CC: "public-lod@w3.org" <public-lod@w3.org>
On 4/12/11 10:45 AM, glenn mcdonald wrote:
>     http://lod.openlinksw.com/fct/rdfdesc/usage.vsp?g=http%3A%2F%2Fdbpedia.org%2Fresource%2FMichael_Jackson&tp=2
>     . That's how you discern it from OpenCyc since each dataset is
>     loaded into its own Named Graph.
> I follow that link and I see that the Michael Jackson entity has 503 
> references in dbpedia, 2 each in the opencyc and opencyc-readable 
> subsets (I *think* those are included in the 503, but I'm not sure) 
> and a couple others. I still don't see how to trace the provenance of 
> an individual triple.
>     Even easier: follow the link, then copy the value of @href from
>     "About: XYZ.." or just click on the About: XYZ hyperlink and
>     you'll find yourself in the OpenCyc data space :-)
> I'm sure you don't mean this seriously. The URI of the entity 
> definitely doesn't answer the question of where the triples that refer 
> to that URI come from...

Again, each Dataset loaded into the Virtuoso Database ends up with its 
own Named Graph IRI. The triples associated with a Named Graph IRI have 
"authority" parts in their URIs that indicate their origin.
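
As a rough illustration of reading the "authority" part off a URI, here is a minimal Python sketch using only the standard library. The helper name and the second sample URI are assumptions made up for this example; this is not how Virtuoso itself tracks origin.

```python
from urllib.parse import urlsplit


def authority(uri: str) -> str:
    """Return the authority (host[:port]) component of a URI."""
    return urlsplit(uri).netloc


# Sample URIs for illustration only; the second one is hypothetical.
samples = [
    "http://dbpedia.org/resource/Michael_Jackson",
    "http://sw.opencyc.org/concept/Example",
]

for uri in samples:
    # Print the authority next to the URI it was taken from.
    print(authority(uri), "<-", uri)
```

The authority only hints at who minted the URI; the Named Graph IRI is what records which loaded dataset a triple sits in.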

All we did was load the OpenCyc linkset (as proven by the inverse 
relation in the DBpedia owl:sameAs relation) into its own Named Graph.

The link: 
points to a page that shows the Named Graph IRIs holding triples 
associated with the Subject of a specific Description.
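
A comparable listing can be sketched as a SPARQL query that asks which Named Graphs hold triples about a given Subject. The Python below only builds the query text and a request URL; the function names and the example.org endpoint are assumptions for illustration, and the query is a guess at the kind of lookup involved, not the page's actual implementation.

```python
from urllib.parse import urlencode


def graphs_for_subject_query(subject_iri: str) -> str:
    """Build a SPARQL query listing each Named Graph that holds
    triples with the given subject, with a per-graph triple count."""
    if ">" in subject_iri or " " in subject_iri:
        raise ValueError("not a plain IRI: %r" % subject_iri)
    return (
        "SELECT ?g (COUNT(*) AS ?triples)\n"
        "WHERE { GRAPH ?g { <%s> ?p ?o } }\n"
        "GROUP BY ?g" % subject_iri
    )


def sparql_request_url(endpoint: str, query: str) -> str:
    """Encode the query as a SPARQL Protocol GET request URL."""
    return endpoint + "?" + urlencode({"query": query})


query = graphs_for_subject_query("http://dbpedia.org/resource/Michael_Jackson")
print(query)
# The endpoint below is a placeholder, not a real service.
print(sparql_request_url("http://example.org/sparql", query))
```

Binding ?p and ?o to a specific predicate and object narrows the same pattern to the graphs holding one individual triple.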

To cut a long story short, OpenCyc is already fixing its DBpedia linkset 
across TBox and ABox dimensions; this is an item that's been in progress 
for a while now. Seeing the data helps people understand the 
implications of the data. If you can't see the data there's nothing to 
fix, thus we end up in a subjective "fool's paradise".



Kingsley Idehen
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen
Received on Tuesday, 12 April 2011 14:54:54 UTC
