Re: Quality Criteria for Linked Data sources

Dear Annika,

Great work, and a really nice fusion of the classic data quality criteria found in the database literature with Linked Data-specific aspects.

Three comments:

1. Your criteria seem to focus mainly on the publication of instance data and say little about the schema level. The overall goal of Linked Data is to publish data in a self-descriptive way [1], which means that you should not only set links at the instance level, but also at the schema level, relating terms from different vocabularies to each other. This applies especially when you use proprietary terms, which cannot always be avoided. So you may want to add to your list some criteria about providing definitions for proprietary vocabulary terms and about setting links between different vocabularies.

2. Your criteria in the category "content" are only a subset of the usual content-oriented criteria in the literature (for summaries, see for instance [2][3]). I guess you had reasons not to include all of them, but you may want to check against these lists again.

3. If you want to talk in your thesis about how well existing data sources on the Web comply with the quality criteria, the statistics on compliance with different publishing best practices in the State of the LOD Cloud document [4] could be a good starting point.
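
To make comment 1 a bit more concrete, here is a minimal sketch of what a schema-level indicator could look like. It is only an illustration under stated assumptions, not part of your criteria or of any existing tool: the namespace http://example.org/vocab#, the function name schema_level_report, and the choice of properties counted as "definition" (rdfs:label, rdfs:comment) or "vocabulary link" (rdfs:subClassOf, rdfs:subPropertyOf, owl:equivalentClass, owl:equivalentProperty) are all hypothetical. Triples are plain Python tuples of strings, so no RDF library is needed.

```python
# Hypothetical namespace of a publisher's own (proprietary) vocabulary.
OWN_NS = "http://example.org/vocab#"

RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
FOAF = "http://xmlns.com/foaf/0.1/"

# Assumption: these properties count as "providing a definition".
DEFINITION_PROPS = {RDFS + "label", RDFS + "comment"}
# Assumption: these properties count as "linking to another vocabulary".
MAPPING_PROPS = {
    RDFS + "subClassOf",
    RDFS + "subPropertyOf",
    OWL + "equivalentClass",
    OWL + "equivalentProperty",
}


def schema_level_report(triples, own_ns=OWN_NS):
    """For each term in own_ns that occurs as a subject, report whether it
    has a definition and whether it is linked to an external term."""
    report = {}
    for s, p, o in triples:
        if not s.startswith(own_ns):
            continue
        entry = report.setdefault(s, {"defined": False, "linked": False})
        if p in DEFINITION_PROPS:
            entry["defined"] = True
        # A mapping only counts if it points outside the own namespace.
        if p in MAPPING_PROPS and not o.startswith(own_ns):
            entry["linked"] = True
    return report


# Made-up example data: one term is defined and linked, one is only defined.
triples = [
    (OWN_NS + "Researcher", RDFS + "label", "Researcher"),
    (OWN_NS + "Researcher", RDFS + "subClassOf", FOAF + "Person"),
    (OWN_NS + "officeNumber", RDFS + "comment", "Room number of an office."),
]
report = schema_level_report(triples)
for term, flags in report.items():
    print(term, flags)
```

The sketch only sees terms that appear as subjects of schema triples; terms that are used in instance data but never described at all would need a separate check.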

Please also circulate a link to your thesis on this list once you have finished it. It looks like this is going to be an interesting read :-)

Cheers,

Chris

[1] http://www.w3.org/2001/tag/doc/selfDescribingDocuments.html
[2] http://portal.acm.org/citation.cfm?id=1791545
[3] http://www.diss.fu-berlin.de/diss/receive/FUDISS_thesis_000000002736
[4] http://www4.wiwiss.fu-berlin.de/lodcloud/state/



-----Original Message-----
From: public-lod-request@w3.org [mailto:public-lod-request@w3.org] On Behalf Of Annika Flemming
Sent: Wednesday, 15 December 2010 20:50
To: public-lod@w3.org
Subject: Quality Criteria for Linked Data sources

Hi,
I'm a student at the Humboldt University of Berlin and I'm currently writing my diploma thesis under the supervision of Olaf Hartig. The aim of my thesis is to draw up a set of criteria to assess the quality of Linked Data sources. My findings include eleven criteria grouped into four categories. Each criterion includes a set of so-called indicators. Each indicator constitutes a measurable aspect of a criterion and thus allows for the assessment of the quality of a data source w.r.t. the criteria.
I've written a summary of my findings, which can be accessed here:

http://sourceforge.net/apps/mediawiki/trdf/index.php?title=Quality_Criteria_for_Linked_Data_sources

To evaluate my findings, I decided to post this summary hoping to receive some feedback about the criteria and indicators I suggested. Moreover, I'd like to initiate a discussion about my findings, and about their applicability to a quality assessment of data sources.

Your comments might be included in my thesis, but I won't add any names.

A further summary will follow shortly, describing a formalism based on these criteria and its application to several data sources.

Thanks to everyone participating,
Annika

Received on Thursday, 16 December 2010 09:20:01 UTC