Re: The Public Domain (was Re: LOD Data Sets, Licensing, and AWS) from Kingsley Idehen on 2009-06-25 (public-lod@w3.org from June 2009)

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Wed, 24 Jun 2009 21:59:45 -0400
To: Ian Davis <lists@iandavis.com>
CC: Alan Ruttenberg <alanruttenberg@gmail.com>, Leigh Dodds <leigh.dodds@talis.com>, public-lod@w3.org
Message-ID: <4A42DA11.10107@openlinksw.com>
Ian Davis wrote:
> On Wed, Jun 24, 2009 at 9:56 PM, Kingsley Idehen 
> <kidehen@openlinksw.com> wrote:
>
>     The NYT, London Times, and others of this ilk, are more likely to
>     contribute their quality data to the LOD cloud if they know there
>     is a vehicle (e.g., a license scheme) that ensures their HTTP URIs
>     are protected, i.e., always accessible to user agents at the data
>     representation (HTML, XML, N3, RDF/XML, Turtle, etc.) level;
>     thereby ensuring citation and attribution requirements are honored.
>
>
> I agree with that, but it only covers a small portion of what is 
> needed. You fail to consider the situations where people publish data 
> about other people's URIs, as reviews or annotation.
I am not failing to consider them, far from it.
> The foaf:primaryTopic mechanism isn't strong enough if the publisher 
> requires full attribution for use of their data. If I use SPARQL to 
> extract a subset of reviews to display on my site then in all 
> likelihood I have lost that linkage with the publishing document.
Only if you choose to construct your result document using literal 
values, i.e., a SPARQL solution that has URIs filtered out. Anyway, if 
that's what you end up doing, then you still have <link/> and @rel at 
your disposal for identifying your data sources, worst case.
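
A minimal sketch of that worst-case fallback, assuming hypothetical result
rows that each kept the URI of the document they were published in. The
function name, the row shape, the example.org URIs, and the choice of
rel="via" are all illustrative assumptions, not something specified in this
thread:

```python
from html import escape


def source_links(rows, rel="via"):
    """Return one <link/> element per distinct source document URI.

    `rows` is an iterable of dicts with a "source" key (a hypothetical
    shape, e.g. a SPARQL solution in which the source URI was retained).
    """
    seen = []
    for row in rows:
        uri = row["source"]
        if uri not in seen:  # keep first-seen order, drop repeats
            seen.append(uri)
    return [
        '<link rel="%s" href="%s"/>'
        % (escape(rel, quote=True), escape(uri, quote=True))
        for uri in seen
    ]


# Hypothetical rows: literal review text plus the URI it came from.
rows = [
    {"text": "Great album", "source": "http://example.org/reviews/1"},
    {"text": "Too long", "source": "http://example.org/reviews/2"},
    {"text": "Loved it", "source": "http://example.org/reviews/1"},
]

for line in source_links(rows):
    print(line)
```

The resulting elements would go in the <head> of the page that displays
the extracted literals, so the link back to each publishing document
survives even when the visible content contains no URIs.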

>
>  
>
>     Attribution is the kind of thing one gives as the result of a
>     license requirement in exchange for permission to copy. In the
>     academic world for journal articles this doesn't come into play at
>     all, since there is no copying (in the usual case). Instead people
>     cite articles because the norms of their community demand it.
>
>     Yes, and the HTTP URI ultimately delivers the kind of mechanism I
>     believe most traditional media companies seek (as stated above).
>     They ultimately want people to use their data with low cost
>     citation and attribution intrinsic to the medium of value exchange.
>
>
> The BBC is a traditional media company. Its data is licensed only for 
> personal, non-commercial use: http://www.bbc.co.uk/terms/#3
I used the New York Times and the London Times for a specific reason: 
their business models differ from that of the BBC. They are traditional 
*commercial* media companies.
>  
>
>     btw - how are you dealing with this matter re. the
>     neurocommons.org linked data space? How do you ensure your
>     valuable work is fully credited as it bubbles up the value chain?
>
>
> I found this linked from the RDF Distribution page on neurocommons.org:
> http://svn.neurocommons.org/svn/trunk/product/bundles/frontend/nsparql/NOTICES.txt
>
> Everyone should read it right now to appreciate the complexity of 
> aggregating data from many sources when they all have idiosyncratic 
> requirements of attribution.
>
> Then read 
> http://sciencecommons.org/projects/publishing/open-access-data-protocol/ 
> to see how we should be approaching the licensing of data. It explains 
> in detail the motivations for things like CC-0 and PDDL which seek to 
> promote open access for all by removing restrictions:
>
> "Thus, to facilitate data integration and open access data sharing, 
> any implementation of this protocol MUST waive all rights necessary 
> for data extraction and re-use (including copyright, sui generis 
> database rights, claims of unfair competition, implied contracts, and 
> other legal rights), and MUST NOT apply any obligations on the user of 
> the data or database such as “copyleft” or “share alike”, or even the 
> legal requirement to provide attribution. Any implementation SHOULD 
> define a non-legally binding set of citation norms in clear, 
> lay-readable language."
>
> Science Commons have spent a lot of time and resources to come to this 
> conclusion, and they tried all kinds of alternatives such as 
> attribution and share alike licences (as did Talis). The final 
> consensus was that the public domain was the only mechanism that could 
> scale for the future. Without this kind of approach, aggregating, 
> querying and reusing the web of data will become impossibly complex. 
> This is a key motivation for Talis starting the Connected Commons 
> programme ( http://www.talis.com/platform/cc/ ). We want to see more 
> data that is unambiguously reusable because it has been placed in the 
> public domain using CC-0 or the Open Data Commons PDDL.
>
> So, I urge everyone publishing data onto the linked data web to 
> consider waiving all rights over it using one of the licenses above.
I don't think "waiving all rights" is a practical option for the likes 
of the New York Times or the Times of London, or for traditional 
commercial media companies in general.
> As Kingsley points out, you will always be attributed via the URIs you 
> mint.
This part I totally agree with :-)

>
> Ian
>
> PS. This was the subject of my keynote at code4lib 2009 "If you love 
> something, set it free", which you can view here 
> http://www.slideshare.net/iandavis/code4lib2009-keynote-1073812
>
>
The thing about "Free" is that we'll always end up having to 
disambiguate: "Free Speech" and "Free Beer". That's the sad nature of 
the overloaded "Free" moniker that belies the Open Source moniker.

-- 


Regards,

Kingsley Idehen	      Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO 
OpenLink Software     Web: http://www.openlinksw.com
Received on Thursday, 25 June 2009 02:00:30 UTC