Re: 15 Ways to Think About Data Quality (Just for a Start)

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Tue, 12 Apr 2011 08:36:08 -0400
Message-ID: <4DA44738.5000409@openlinksw.com>
To: Norman Gray <norman@astro.gla.ac.uk>
CC: glenn mcdonald <gmcdonald@furia.com>, "public-lod@w3.org" <public-lod@w3.org>
On 4/12/11 3:49 AM, Norman Gray wrote:
> Glenn and all, greetings.
> On 2011 Apr 9, at 03:10, glenn mcdonald wrote:
>> I don't think data quality is an amorphous, aesthetic, hopelessly subjective
>> topic. Data "beauty" might be subjective, and the same data may have
>> different applicability to different tasks, but there are a lot of obvious
>> and straightforward ways of thinking about the quality of a dataset
>> independent of the particular preferences of individual beholders. Here are
>> just some of them:
> This is an excellent list.  I think only a minority of these qualities could be scored precisely, but I think all of them could be scored on some awful-to-excellent scale, so that while they may not be quite objective metrics, they're at least clearly debatable.
> Complete objectivity is probably impossible here -- inevitable in a world where the concept of 'Rome' means significantly different things to the local authority, the ancient historian, and the tourist board.  But 'solves my problem well' is a pretty good substitute.
> Best wishes,
> Norman

Great insight!

Glenn: this is why my demos are oriented towards enabling the beholder
to disambiguate his/her/its quest via filtering applied to entity types and
other properties. My entire focus is on this very point outlined by
Norman, i.e., dealing with it at massive scales. You cannot enforce
anything on the beholder of data. There are many scenarios where
subjectively bad data is extremely good data.



Kingsley Idehen
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen
Received on Tuesday, 12 April 2011 12:36:32 UTC