Re: There's No Money in Linked Data from Pascal Hitzler on 2013-06-01 (public-lod@w3.org from June 2013)

From: Pascal Hitzler <pascal.hitzler@wright.edu>
Date: Sat, 1 Jun 2013 10:45:20 -0400
To: Aidan Hogan <aidan.hogan@deri.org>
CC: <public-lod@w3.org>
Message-ID: <51AA0900.1040006@wright.edu>

Thanks, Aidan - sorry for missing your analysis.

In fact, the ensuing discussions have shown that there is even more
information out there; it's just extremely hard to find. That is another
reason why we need this discussion, and probably some concerted effort.

Pascal.

On 5/21/2013 3:14 PM, Aidan Hogan wrote:
> <snip>
> On 18/05/2013 09:58, Leigh Dodds wrote:
>> You don't say in your paper how you did the analysis. Did you use the
>> metadata from the LOD group in datahub? At the time I had to do
>> mine manually, but it wouldn't be hard to automate some of this now,
>> perhaps to create a regularly updated set of indicators.
>>
>> One criterion that agents might apply when conducting "Follow Your
>> Nose" consumption of Linked Data is the licensing of the target data,
>> e.g. ignore links to datasets that are not licensed for your
>> particular usage.
>
> On a similar note, we also did a survey of some licensing issues in and
> around Linked Data as part of a larger contribution looking at how
> closely publishers of RDF follow various tips from the (now superseded
> but still relevant) "How to Publish Linked Data on the Web" guide [1].
>
> Our analysis is published/available at [2,3]. For the paper, we looked
> at ~4 million RDF/XML documents crawled in May 2011, divided the data by
> pay-level domain and looked at how well each domain followed the key
> guidelines in [1] with the goal of seeing how well specific guidelines
> are followed, and looking to comparatively rank the conformance of
> publishers using objective measures. We ended up looking at 188 domains
> that offered more than 1,000 quads.
>
> Long story shortish, for one of the guidelines we looked specifically at
> licensing information for documents embedded in the documents themselves
> [p29,2]. This was tricky: we found a bunch of licensing properties in
> use [Table 19,2]. Considering as many of these properties as we could
> identify, we found that only 15% of the domains provided licensing
> information embedded in *at least one* local document. Averaging equally
> across the domains (which had different numbers of documents), about 3%
> of documents contained observable licensing information about themselves.
>
> On the plus side, there was some use of the creative-commons vocabulary:
>
>      http://creativecommons.org/ns
>
> ... though I think dct:rights/dct:license are more actively promoted.
>
>
>
> Rather than registering the licensing information on the DataHub or the
> like (which AFAIK no longer supports a public SPARQL endpoint), it would
> be much better for (SemWeb) consumers if publishers embedded licensing
> metadata directly in the individual RDF documents themselves. There
> are already established vocabularies and (at least CC) license URIs in
> place for this.
>
>
> Cheers/fwiw,
> Aidan
>
>
>
>
> [1]
> http://wifo5-03.informatik.uni-mannheim.de/bizer/pub/LinkedDataTutorial/
> [2] http://sw.deri.org/~aidanh/docs/ldstudy12.pdf
> [3] Aidan Hogan, Jürgen Umbrich, Andreas Harth, Richard Cyganiak, Axel
> Polleres and Stefan Decker. "An empirical survey of Linked Data
> conformance". In the Journal of Web Semantics 14: pp. 14–44, 2012.
>
>
>
>
>
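To make the two figures quoted above concrete (the share of domains with
licensing information in at least one document, and the per-document share
averaged equally across domains), here is a rough sketch in Python. It is
not the method from [2]: the function names, the triple representation and
the sample data are all hypothetical, and the predicate set only covers the
vocabularies named in this thread (dct:license, dct:rights, cc:license),
whereas Table 19 of the paper lists more. "About themselves" is assumed
here to mean that the document URI is the subject of the licensing triple.

```python
from collections import defaultdict

# Licensing predicates mentioned in the thread (assumption: the survey
# considered a longer list, see Table 19 of the paper).
LICENSE_PREDICATES = {
    "http://purl.org/dc/terms/license",
    "http://purl.org/dc/terms/rights",
    "http://creativecommons.org/ns#license",
}


def has_self_license(doc_uri, triples):
    """True if some triple attaches a licensing predicate to the document itself."""
    return any(s == doc_uri and p in LICENSE_PREDICATES for s, p, _o in triples)


def license_coverage(docs):
    """docs: iterable of (domain, doc_uri, triples), triples as (s, p, o) strings.

    Returns a pair:
      - share of domains with at least one self-licensed document
      - per-domain ratio of self-licensed documents, averaged equally
        over domains (so large domains do not dominate)
    """
    licensed = defaultdict(int)
    total = defaultdict(int)
    for domain, doc_uri, triples in docs:
        total[domain] += 1
        licensed[domain] += has_self_license(doc_uri, triples)
    if not total:
        return 0.0, 0.0
    ratios = [licensed[d] / total[d] for d in total]
    domains_with_any = sum(1 for d in total if licensed[d] > 0) / len(total)
    return domains_with_any, sum(ratios) / len(ratios)


# Hypothetical sample: two domains, three documents, one licensed.
sample = [
    ("a.example", "http://a.example/d1",
     [("http://a.example/d1", "http://purl.org/dc/terms/license",
       "http://creativecommons.org/licenses/by/3.0/")]),
    ("a.example", "http://a.example/d2", []),
    ("b.example", "http://b.example/d1", []),
]
print(license_coverage(sample))
```

Averaging the per-domain ratios, rather than pooling all documents, is what
keeps a single very large publisher from swamping the result.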

-- 
Prof. Dr. Pascal Hitzler
Kno.e.sis Center, Wright State University, Dayton, OH
pascal@pascal-hitzler.de   http://pascal-hitzler.de/
Semantic Web Textbook: http://www.semantic-web-book.org/
Semantic Web Journal: http://www.semantic-web-journal.net/
Received on Saturday, 1 June 2013 14:45:44 UTC