Re: LOD Data Sets, Licensing, and AWS

Ian Davis wrote:
> On Tue, Jun 23, 2009 at 11:11 PM, Kingsley Idehen wrote:
>     Ian Davis wrote:
>         Hi all,
>         On Tue, Jun 23, 2009 at 9:36 PM, Kingsley Idehen wrote:
>            All,
>            As you may have noticed, AWS still haven't made the LOD cloud
>            data sets -- that I submitted eons ago -- public. Basically,
>            the hold-up comes down to discomfort with the lack of license
>            clarity re. some of the data sets.
>            Action items for all data set publishers:
>            1. Integrate your data set licensing into your data set (for
>            LOD I would expect CC-BY-SA to be the norm)
>         Please do not use CC-BY-SA for LOD - it is not an appropriate
>         licence and it is making the problem worse. That licence uses
>         copyright which does not hold for factual information.
>         Please use an Open Data Commons license or CC-0.
>         If your dataset contains copyrighted material too (e.g.
>         reviews) and you hold the rights over that content then you
>         should also apply a standard copyright licence. So for
>         completeness you need a licence for your data and one for your
>         content. If you use CC-0 you can apply it to both at the same
>         time. Obviously if you aren't the rightsholder (e.g. it is
>         scraped data/content from someone else) then you can't just
>         slap any licence you like on it - you have to abide by the
>         original rightsholder's wishes.
>         Personally I would try and select a public domain waiver or
>         dedication, not one that requires attribution. The reason can
>         be seen at
>         where stacking of attributions becomes a huge burden. Having
>         datasets require attribution will negate one of the linked
>         data web's greatest strengths: the simplicity of remixing and
>         reusing data.
>     Ian,
>     Using licensing to ensure the data provider's URIs are always
>     preserved delivers low cost and implicit attribution. This is what
>     I believe CC-BY-SA delivers. There is nothing wrong with granular
>     attribution if compliance is low cost. Personally, I think we are
>     on the verge of an "Attribution Economy", and said economy will
>     encourage contributions from a plethora of high quality data
>     providers (esp. from the traditional media realm).
> I don't think usage of a URI is enough for attribution because a URI 
> is not information bearing.
> Of course I could dereference it and perhaps obtain some triples that 
> use it, but that URI does not denote those triples or that document.
An HTTP URI (as used in the Linked Data meme) carries implicit attribution 
power: it binds the thing it identifies to its metadata, so it is very 
much data bearing. This is what makes this URI type so potent when 
dealing with data publishing and data access.
> There will be dozens or hundreds of other documents that use the same 
> URI and the owners of those datasets would like attribution for their 
> work. For example, I can make some unique assertions about you that 
> no-one else has and I would like those attributed to me - using your 
> URI would not provide that attribution.

But your URIs convey your point of view. The important thing here is 
that there is a route back to your data space; the place from which your 
point of view originates.

If the pathways to the origins of data are obscured we are recreating 
yesterday's economy (imho), one in which original creators of work are 
easily dislocated by middlemen. An economy in which incentives for data 
publishing are minimal for those who have invested time and money in 
quality data curation and maintenance.

>     Anyway, each data set provider should pick the license that works
>     for them :-)
> Yes I agree. The above paragraph was my personal preference, but I'd 
> like to convince others to think like me :)

Ditto :-)
> Ian



Kingsley Idehen
President & CEO
OpenLink Software

Received on Wednesday, 24 June 2009 01:01:15 UTC