Re: Low Quality Data (was before Re: AW: ANN: LOD Cloud - Statistics and compliance with best practices)

From: Martin Hepp <martin.hepp@ebusiness-unibw.org>
Date: Mon, 25 Oct 2010 15:23:27 +0200
Cc: public-lod <public-lod@w3.org>, Chris Bizer <chris@bizer.de>
Message-Id: <DD101E67-397C-404F-9FAE-A4CF0947E0D6@ebusiness-unibw.org>
To: Juan Sequeda <juanfederico@gmail.com>

Hi Juan,

> Can somebody point me to papers or maybe give their definition of  
> low quality data when it comes to LOD. What is the criteria for data  
> to be considered low quality.

Besides Chris' PhD thesis, in particular pp. 10-35, I suggest standard
works on information quality (IQ) and data quality (DQ), e.g.

• Erhard Rahm and Hong Hai Do (2000): Data Cleaning: Problems and
Current Approaches, IEEE Data Eng. Bull. 23(4): 3-13
• Wayne W. Eckerson (2002): Data Quality and the Bottom Line, Report,  
The Data Warehousing Institute
• Wand, Y. and Wang, R. (1996): Anchoring Data Quality Dimensions in  
Ontological Foundations, Communications of the ACM, November 1996. pp.  
86–95.
• Wang, R., Kon, H. & Madnick, S. (1993): Data Quality Requirements
Analysis and Modelling, Ninth International Conference on Data
Engineering, Vienna, Austria.

You may also look at the recent work done by Christian Fuerber, a PhD  
student of mine, in particular:

Fürber, Christian and Hepp, Martin: Using SPARQL and SPIN for Data  
Quality Management on the Semantic Web, in: BIS 2010. Proceedings of  
the 13th International Conference on Business Information Systems, May  
3-5, 2010, Berlin, Germany, Springer LNBIP Vol. 47, pp. 35-46.
PDF: http://www.heppnetz.de/files/fuerber-hepp-sparql-spin-dqm.pdf

Fürber, Christian and Hepp, Martin: Using Semantic Web Resources for  
Data Quality Management, in: Proceedings of the 17th International  
Conference on Knowledge Engineering and Knowledge Management  
(EKAW2010), October 11-15, 2010, Lisbon, Portugal, Springer LNCS Vol.  
6317, pp. 211-225.
PDF: http://www.heppnetz.de/files/dataquality-semweb-ekaw2010.pdf


Martin
Received on Monday, 25 October 2010 13:24:07 UTC
