
Re: 15 Ways to Think About Data Quality (Just for a Start)

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Tue, 12 Apr 2011 14:15:24 -0400
Message-ID: <4DA496BC.8040104@openlinksw.com>
To: glenn mcdonald <glenn@furia.com>
CC: "public-lod@w3.org" <public-lod@w3.org>
On 4/12/11 1:52 PM, glenn mcdonald wrote:
>
>     You continue to imply that seeing subjectively imperfect data
>     projected via a data oriented tool is problematic re., your "total
>     data experience" world view.
>
>
> I continue to think it's hilarious that you consider it "subjectively 
> imperfect" that your dataset says Michael Jackson and Michael Rodrick 
> are the same person. What would constitute "objectively imperfect" to you?

The problem is this: it isn't my dataset. It's data loaded into an 
instance of Virtuoso.

>
> So yes, I think you should feel a little embarrassed about 
> broadcasting links to a demo in which the very first piece of data one 
> sees is obviously wrong.

To you the first piece of data is an owl:sameAs assertion. That's 100% 
fine for you, but that isn't true for everyone else. It just isn't.

> You've got billions of entities in dbpedia, and the technology doesn't 
> care which one you pick, so surely you could pick one where the errors 
> aren't as prominent.

No, DBpedia doesn't have billions of entities; that's just one dataset. 
The Virtuoso instance in question is a LOD cloud cache instance i.e., 
we've loaded the available datasets into the instance. From that I 
produce a variety of demos. Just as anyone else can since the endpoints 
are all public.

> The fact that you didn't, and don't seem to care, sends a message 
> about your attitude towards data.

Again, context infidelity. In due course you will understand my point. 
For now, we can go back and forth. Your characterization is 100% inaccurate.


-- 

Regards,

Kingsley Idehen	
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen
Received on Tuesday, 12 April 2011 18:15:47 UTC
