- From: Aidan Hogan <aidan.hogan@deri.org>
- Date: Tue, 21 May 2013 20:14:16 +0100
- To: public-lod@w3.org
<snip>

On 18/05/2013 09:58, Leigh Dodds wrote:
> You don't say in your paper how you did the analysis. Did you use the
> metadata from the LOD group in datahub? At the time I had to do
> mine manually, but it wouldn't be hard to automate some of this now,
> perhaps to create an regularly updated set of indicators.
>
> One criteria that agents might apply when conducting "Follow Your
> Nose" consumption of Linked Data is the licensing of the target data,
> e.g. ignore links to datasets that are not licensed for your
> particular usage.

On a similar note, we also did a survey of some licensing issues in and around Linked Data as part of a larger contribution looking at how closely publishers of RDF follow various tips from the (now superseded but still relevant) "How to Publish Linked Data on the Web" guide [1]. Our analysis is published/available at [2,3].

For the paper, we looked at ~4 million RDF/XML documents crawled in May 2011, divided the data by pay-level domain, and looked at how well each domain followed the key guidelines in [1]. The goal was to see how well specific guidelines are followed and to comparatively rank the conformance of publishers using objective measures. We ended up looking at 188 domains that offered more than 1,000 quads.

Long story shortish, for one of the guidelines we looked specifically at licensing information about documents that is embedded in the documents themselves [p29,2]. This was tricky: we found a bunch of licensing properties in use [Table 19,2]. Considering as many of these properties as we could identify, we found that only 15% of the domains provided licensing information embedded in *at least one* local document. Averaging equally across the domains (which had different numbers of documents), about 3% of documents contained observable licensing information about themselves.

On the plus side, there was some use of the creative-commons vocabulary:

http://creativecommons.org/ns ...
though I think dct:rights/dct:license are more actively promoted.

Rather than registering the licensing information on the DataHub or the like (which AFAIK no longer supports a public SPARQL endpoint), it would be much better for (SemWeb) consumers if publishers directly embedded licensing metadata in the individual RDF documents themselves. There are already established vocabularies and (at least CC) license URIs in place for this.

Cheers/fwiw,
Aidan

[1] http://wifo5-03.informatik.uni-mannheim.de/bizer/pub/LinkedDataTutorial/
[2] http://sw.deri.org/~aidanh/docs/ldstudy12.pdf
[3] Aidan Hogan, Jürgen Umbrich, Andreas Harth, Richard Cyganiak, Axel Polleres and Stefan Decker. "An empirical survey of Linked Data conformance". In the Journal of Web Semantics 14: pp. 14–44, 2012.
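
As a rough illustration of the two figures quoted in the message (the share of domains with at least one self-licensed document, and the per-domain average of self-licensed documents), here is a minimal Python sketch. Everything in it is an assumption made for illustration and not the method used in [2]: the two-entry predicate set (the survey considered many more properties, see [Table 19,2]), the hypothetical `has_license` and `summarise` helpers, and the representation of a document as a list of `(subject, predicate, object)` string tuples.

```python
from collections import defaultdict

# Illustrative subset only: the Dublin Core terms behind dct:license and dct:rights.
# The survey in [2] considered a larger set of licensing properties.
LICENSE_PREDICATES = {
    "http://purl.org/dc/terms/license",
    "http://purl.org/dc/terms/rights",
}


def has_license(doc_uri, triples):
    """Return True if some triple states a licensing property about the document itself."""
    return any(s == doc_uri and p in LICENSE_PREDICATES for s, p, _ in triples)


def summarise(docs):
    """Summarise licensing coverage.

    docs: iterable of (domain, doc_uri, triples).
    Returns (share of domains with at least one self-licensed document,
             average over domains of each domain's self-licensed document ratio).
    """
    per_domain = defaultdict(list)
    for domain, doc_uri, triples in docs:
        per_domain[domain].append(has_license(doc_uri, triples))

    # One ratio per domain, so large and small domains weigh equally.
    ratios = [sum(flags) / len(flags) for flags in per_domain.values()]
    domains_with_any = sum(1 for r in ratios if r > 0) / len(ratios)
    macro_average = sum(ratios) / len(ratios)
    return domains_with_any, macro_average
```

Averaging the per-domain ratios, instead of pooling all documents, keeps one very large domain from dominating the result, which is presumably why the two numbers in the message can differ so much.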
Received on Tuesday, 21 May 2013 19:14:45 UTC