- From: Leigh Dodds <leigh.dodds@talis.com>
- Date: Fri, 22 Oct 2010 16:15:55 +0100
- To: Linking Open Data <public-lod@w3.org>
Hi,

The LOD cloud analysis [1] is a really great piece of work. I wanted to pick up on one aspect of the analysis for further discussion: whether data is published by the data owner or by a third party.

It seems to me that there are broadly three categories into which a dataset might fall:

* Primary -- published and maintained directly by the data owner, e.g. BBC
* Secondary -- published and maintained by a third party, e.g. by scraping, wrapping or otherwise converting a data source
* Tertiary -- published and maintained by a third party, usually a mirror or aggregation of primary/secondary sources. This might be a direct mirror, or involve some additional creativity, e.g. re-modelling some aspects of another dataset. Mirrors typically provide additional services, e.g. a SPARQL endpoint where the primary source doesn't provide one.

If we consider the different categories we can see that:

* Growth of the web of data is best served by encouraging more Primary sources. The current community can't scale to add more Secondary sources, so adoption is best driven by data owners.
* Sustainability and usage of Linked Data is best served by encouraging more Tertiary sources. Availability of useful, current aggregations of data, wrapped in services, will help drive more consumption.

What do others think?

Cheers,

L.

[1]. http://www4.wiwiss.fu-berlin.de/lodcloud/state/

--
Leigh Dodds
Programme Manager, Talis Platform
Talis
leigh.dodds@talis.com
http://www.talis.com
Received on Friday, 22 October 2010 15:16:34 UTC