Re: How To Do Deal with the Subjective Issue of Data Quality?

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Thu, 07 Apr 2011 16:38:37 -0400
Message-ID: <4D9E20CD.3000204@openlinksw.com>
To: Jiří Procházka <ojirio@gmail.com>
CC: "public-lod@w3.org" <public-lod@w3.org>, "semantic-web@w3.org" <semantic-web@w3.org>, "dbpedia-discussion@lists.sourceforge.net" <dbpedia-discussion@lists.sourceforge.net>
On 4/7/11 3:06 PM, Jiří Procházka wrote:
> Hi,
> I think the different aspects of data quality should be specified by
> parties who are interested in them, published using proper ontologies
> and new human terms, and the ambiguous term "data quality" be used less
> and less.


And it shouldn't be used as a perennial distraction mechanism re. Linked 
Data. There is no such thing as perfect data.

> You quite hit the nail on the head with the recognition of the
> aesthetic nature of the term.
> Individual campaigns like some 5-star schemes are not necessarily bad,
> if they recognize their specificity to a particular purpose.
> Although I prefer a certificate-compliance-like scheme to the star
> scheme which kind of supports the (false) notion of the objective data
> quality.

Thus, when dealing with data-driven *anything*, the following separation
of powers remains:

1. Data Presentation
2. Data Representation
3. Data Access Protocol
4. Data Query Language
5. Data Model
6. Actual Data accessible from a Location.

The separations above are sometimes overlooked in the context of many 
Linked Data initiatives and demos. Inaccurate data at an Address doesn't 
render applications, services, or demos scoped to points 1-5 (above) 
useless. In fact, bad data can be very useful [1] :-)


1. http://jeffjonas.typepad.com/jeff_jonas/2006/12/it_turns_out_bo.html
-- It Turns Out Both Bad Data and a Teaspoon of Dirt May Be Good For You
(Jeff Jonas post).
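
To make the six-way separation listed earlier concrete, here is a rough Python sketch. It is only an illustration under stated assumptions: every name in it (DATA_AT_LOCATION, fetch, conforms, query, represent, present) and both sample records are hypothetical, and the in-memory list merely stands in for data behind a real access protocol. It shows that an inaccurate value at the location passes through the other five layers without breaking any of them, and that a consumer can apply its own "context lens" to flag it.

```python
import json

# 6. Actual data accessible from a location (hypothetical records;
#    the second value is deliberately inaccurate).
DATA_AT_LOCATION = [
    {"id": "ex:Paris", "population": 2100000},
    {"id": "ex:Berlin", "population": -5},
]

# 5. Data model: the shape a record is expected to have.
def conforms(record):
    return isinstance(record.get("id"), str) and isinstance(record.get("population"), int)

# 4. Data query language: select the records that match a predicate.
def query(records, predicate):
    return [r for r in records if conforms(r) and predicate(r)]

# 3. Data access protocol: an in-memory stand-in for something like HTTP.
def fetch(location):
    return list(location)

# 2. Data representation: serialize records for exchange.
def represent(records):
    return json.dumps(records, sort_keys=True)

# 1. Data presentation: render records for a human reader.
def present(records):
    return "\n".join(f"{r['id']}: {r['population']}" for r in records)

# Layers 1-5 work regardless of whether the values are accurate.
rows = query(fetch(DATA_AT_LOCATION), lambda r: True)
report = present(rows)
payload = represent(rows)

# One consumer's "context lens": negative populations are suspicious.
suspicious = query(fetch(DATA_AT_LOCATION), lambda r: r["population"] < 0)
```

Replacing DATA_AT_LOCATION with corrected records would change the output but none of the functions, which is the point of keeping the layers separate.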


> Best,
> Jiri
> On 04/07/2011 08:06 PM, Kingsley Idehen wrote:
>> All,
>> Apologies for cross-posting this repeatedly. I think I have a typo-free
>> heading for this topic.
>> Increasingly, the issue of data quality pops up as an impediment to
>> Linked Data value proposition comprehension and eventual exploitation.
>> The same issue even appears to emerge in conversations that relate to
>> "sense making" endeavors that benefit from things such as OWL reasoning
>> e.g., when resolving the multiple Identifiers with a common Referent via
>> owl:sameAs or exploitation of fuzzy rules based on
>> InverseFunctionalProperty relations.
>> Personally, I subscribe to the doctrine that "data quality" is like
>> "beauty": it lies strictly in the eyes of the beholder, i.e., it is a
>> function of said beholder's "context lenses".
>> I am posting primarily to open up a discussion thread for this important
>> topic.



Kingsley Idehen	
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen
Received on Thursday, 7 April 2011 20:39:00 UTC