The Public Domain (was Re: LOD Data Sets, Licensing, and AWS)

On Wed, Jun 24, 2009 at 9:56 PM, Kingsley Idehen <kidehen@openlinksw.com> wrote:

> The NYT, London Times, and others of this ilk, are more likely to
> contribute their quality data to the LOD cloud if they know there is a
> vehicle (e.g., a license scheme) that ensures their HTTP URIs are protected
> i.e., always accessible to user agents at the data representation (HTML,
> XML, N3, RDF/XML, Turtle etc..) level; thereby ensuring citation and
> attribution requirements are honored.


I agree with that, but it only covers a small portion of what is needed. You
fail to consider the situations where people publish data about other
people's URIs, such as reviews or annotations. The foaf:primaryTopic
mechanism isn't strong enough if the publisher requires full attribution for
use of their data. If I use SPARQL to extract a subset of reviews to display
on my site then in all likelihood I have lost the link back to the document
that published them.
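
To make the SPARQL point concrete, here is a minimal sketch in Python rather
than SPARQL itself. Everything in it is an assumption for illustration: the
example.org URIs, the review predicate and the two function names are
invented, and it presumes the store records which document each statement
came from (in SPARQL terms, one named graph per source document).

```python
# Hypothetical data: each statement is a quad of
# (subject, predicate, object, source_document). All URIs are invented.
REVIEW = "http://example.org/vocab/review"

QUADS = [
    ("http://example.org/film/1", REVIEW, "Great film",
     "http://reviews.example.com/doc/a"),
    ("http://example.org/film/1", REVIEW, "Too long",
     "http://critics.example.net/doc/b"),
    ("http://example.org/film/2", REVIEW, "A classic",
     "http://reviews.example.com/doc/a"),
]


def reviews_without_source(quads, subject):
    """Match the pattern against the merged data and return only the values.

    This mirrors a plain SELECT over a default graph: the source document
    is simply not part of the result.
    """
    return [o for (s, p, o, _doc) in quads if s == subject and p == REVIEW]


def reviews_with_source(quads, subject):
    """Return each value together with the document that asserted it.

    This mirrors wrapping the pattern in GRAPH ?doc { ... }.
    """
    return [(o, doc) for (s, p, o, doc) in quads
            if s == subject and p == REVIEW]
```

With the first function the reviews of film/1 come back as bare strings; only
the second tells me which publisher to credit. The second is possible only if
the aggregator kept the per-document graphs in the first place.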



> Attribution is the kind of thing one gives as the result of a license
> requirement in exchange for permission to copy. In the academic world for
> journal articles this doesn't come into play at all, since there is no
> copying (in the usual case). Instead people cite articles because the norms
> of their community demand it.
>
> Yes, and the HTTP URI ultimately delivers the kind of mechanism I believe most
> traditional media companies seek (as stated above). They ultimately want
> people to use their data with low cost citation and attribution intrinsic to
> the medium of value exchange.
>

The BBC is a traditional media company. Its data is licensed only for
personal, non-commercial use: http://www.bbc.co.uk/terms/#3


> btw - how are you dealing with this matter re. the neurocommons.org linked
> data space? How do you ensure your valuable work is fully credited as it
> bubbles up the value chain?
>

I found this linked from the RDF Distribution page on neurocommons.org:
http://svn.neurocommons.org/svn/trunk/product/bundles/frontend/nsparql/NOTICES.txt

Everyone should read it right now to appreciate the complexity of
aggregating data from many sources when they all have idiosyncratic
requirements of attribution.

Then read
http://sciencecommons.org/projects/publishing/open-access-data-protocol/ to
see how we should be approaching the licensing of data. It explains in
detail the motivations for things like CC-0 and PDDL which seek to promote
open access for all by removing restrictions:

"Thus, to facilitate data integration and open access data sharing, any
implementation of this protocol MUST waive all rights necessary for data
extraction and re-use (including copyright, sui generis database rights,
claims of unfair competition, implied contracts, and other legal rights),
and MUST NOT apply any obligations on the user of the data or database such
as “copyleft” or “share alike”, or even the legal requirement to provide
attribution. Any implementation SHOULD define a non-legally binding set of
citation norms in clear, lay-readable language."

Science Commons have spent a lot of time and resources to come to this
conclusion, and they tried all kinds of alternatives such as attribution and
share alike licences (as did Talis). The final consensus was that the public
domain was the only mechanism that could scale for the future. Without this
kind of approach, aggregating, querying and reusing the web of data will
become impossibly complex. This is a key motivation for Talis starting the
Connected Commons programme ( http://www.talis.com/platform/cc/ ). We want
to see more data that is unambiguously reusable because it has been placed
in the public domain using CC-0 or the Open Data Commons PDDL.

So, I urge everyone publishing data onto the linked data web to consider
waiving all rights over it using one of the licenses above. As Kingsley
points out, you will always be attributed via the URIs you mint.

Ian

PS. This was the subject of my keynote at code4lib 2009, "If you love
something, set it free", which you can view here:
http://www.slideshare.net/iandavis/code4lib2009-keynote-1073812

Received on Wednesday, 24 June 2009 23:40:58 UTC