Re: LOD Data Sets, Licensing, and AWS

From: Ian Davis <lists@iandavis.com>
Date: Tue, 23 Jun 2009 23:33:56 +0100
Message-ID: <ec8613a80906231533m1c7c9e75k2572384d9afa85cc@mail.gmail.com>
To: Kingsley Idehen <kidehen@openlinksw.com>
Cc: public-lod@w3.org
On Tue, Jun 23, 2009 at 11:11 PM, Kingsley Idehen <kidehen@openlinksw.com> wrote:

> Ian Davis wrote:
>> Hi all,
>> On Tue, Jun 23, 2009 at 9:36 PM, Kingsley Idehen <kidehen@openlinksw.com> wrote:
>>    All,
>>    As you may have noticed, AWS still haven't made the LOD cloud data
>>    sets -- that I submitted eons ago -- public. Basically, the
>>    hold-up comes down to discomfort with the lack of license clarity
>>    re. some of the data sets.
>>    Action items for all data set publishers:
>>    1. Integrate your data set licensing into your data set (for LOD I
>>    would expect CC-BY-SA to be the norm)
>> Please do not use CC-BY-SA for LOD - it is not an appropriate licence and
>> it is making the problem worse. That licence uses copyright which does not
>> hold for factual information.
>> Please use an Open Data Commons license or CC-0
>> http://www.opendatacommons.org/licenses/
>> http://wiki.creativecommons.org/CC0
>> If your dataset contains copyrighted material too (e.g. reviews) and you
>> hold the rights over that content then you should also apply a standard
>> copyright licence. So for completeness you need a licence for your data and
>> one for your content. If you use CC-0 you can apply it to both at the same
>> time. Obviously if you aren't the rightsholder (e.g. it is scraped
>> data/content from someone else) then you can't just slap any licence you
>> like on it - you have to abide by the original rightsholder's wishes.
>> Personally I would try and select a public domain waiver or dedication,
>> not one that requires attribution. The reason can be seen at
>> http://en.wikipedia.org/wiki/BSD_license#UC_Berkeley_advertising_clause
>> where stacking of attributions becomes a huge burden. Having datasets
>> require attribution will negate one of the linked data web's greatest
>> strengths: the simplicity of remixing and reusing data.
> Ian,
> Using licensing to ensure the data providers' URIs are always preserved
> delivers low cost and implicit attribution. This is what I believe CC-BY-SA
> delivers. There is nothing wrong with granular attribution if compliance is
> low cost. Personally, I think we are on the verge of an "Attribution
> Economy", and said economy will encourage contributions from a plethora of
> high quality data providers (esp. from the traditional media realm).

I don't think usage of a URI is enough for attribution because a URI is not
information bearing. Of course I could dereference it and perhaps obtain
some triples that use it, but that URI does not denote those triples or that
document. There will be dozens or hundreds of other documents that use the
same URI and the owners of those datasets would like attribution for their
work. For example, I can make some unique assertions about you that no-one
else has and I would like those attributed to me - using your URI would not
provide that attribution.

> Anyway, each data set provider should pick the license that works for them
> :-)

Yes I agree. The above paragraph was my personal preference, but I'd like to
convince others to think like me :)

