Re: LOD Data Sets, Licensing, and AWS

Leigh Dodds wrote:
> Hi,
> 2009/6/24 Kingsley Idehen <>:
>> When you publish said data as Linked Data you will be using an HTTP URI, and
>> in doing so there is implicit attribution.
>> If you retain the URIs of the source, or make explicit claims (e.g.,
>> dc:source) that expose the original data sources then everything is fine,
>> nobody along the value chain gets dislocated.
> Yes, with respect to linking back to the originating *dataset* I
> basically agree with you. I'd read your original comments as
> suggesting that simply reusing the core data was sufficient, and I
> think we're agreeing that the source (i.e. the void dataset) needs to
> be acknowledged.
> However this simply provides a means for citing sources, there are
> other aspects to attribution that also need to be addressed, e.g. how
> it's actually surfaced to a user. E.g. what properties are included in
> the Void description of a dataset that might be included in a user
> interface. There's also the protocol level issues, e.g. how do we
> include links from SPARQL results?
If you cite sources using HTTP URIs then the Linked Data meme's implicit
association of Entity ID and Entity Metadata kicks in, i.e., you have a
persistent pointer to the origins of any piece of data exposed by a
given Data Space (which is the same thing as a *Dataspace*).
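
As a rough illustration of the explicit-claim route (dc:source), here is
a minimal Python sketch that emits one N-Triples statement pointing a
republished resource back at its origin. The example.org / example.com
URIs and the helper name are hypothetical, and using the Dublin Core
Elements 1.1 namespace for dc:source is an assumption about which
vocabulary is meant.

```python
# Assumed namespace for the dc: prefix (Dublin Core Elements 1.1).
DC_SOURCE = "http://purl.org/dc/elements/1.1/source"


def source_triple(derived_uri: str, origin_uri: str) -> str:
    """Return one N-Triples line stating that derived_uri has origin_uri as its source."""
    return f"<{derived_uri}> <{DC_SOURCE}> <{origin_uri}> ."


# Hypothetical URIs, for illustration only.
print(source_triple("http://example.org/copy/item/1",
                    "http://example.com/data/item/1"))
```

Anyone dereferencing the republished resource can then follow the
dc:source object URI back to the original publisher.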

>> Ted Nelson: referred to the above in different terms as: Transcopyright.
> AIUI Transcopyright is a default licensing scheme for content (and
> presumably data) that encourages a share-alike behaviour rather than
> the current default "all rights reserved" copyright situation we have
> now. So related but not exactly the same.

My comments are still fundamentally about my preference for CC-BY-SA.  
Hence the transcopyright reference :-)

I want Linked Data to have its GPL equivalent; a license scheme that:

1.  protects the rights of data contributors;
2.  is easy to express;
3.  is easy to adhere to;
4.  is easy to enforce.

As I stated during one of the Semtech 2009 sessions, HTTP URIs provide a
closed loop re. the above. When you visit my data space you leave your
fingerprints in my HTTP logs. I can follow the log back to your
resources to see if you are conforming to my terms. I can compare the
data in your resource against mine and sniff out whether you are
attributing your data sources (what you got from me) correctly.
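
The closed loop described above could be sketched roughly as follows in
Python. This is an illustration under stated assumptions, not a
description of any existing tool: the server is assumed to write
Common/Combined Log Format lines, the consumer's published data is
assumed to be N-Triples text that has already been retrieved, and the
function names, host address, and URIs are all hypothetical.

```python
import re

# Matches the leading fields of a Common/Combined Log Format line:
# host ident user [time] "METHOD path protocol" status ...
LOG_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3})'
)


def fetched_paths(log_lines, host):
    """Collect the paths a given client host fetched successfully (status 200)."""
    paths = set()
    for line in log_lines:
        m = LOG_RE.match(line)
        if m and m.group("host") == host and m.group("status") == "200":
            paths.add(m.group("path"))
    return paths


def unattributed(paths, base_uri, consumer_ntriples):
    """Return fetched paths whose full URI never appears as <URI> in the consumer's data."""
    return sorted(p for p in paths
                  if f"<{base_uri}{p}>" not in consumer_ntriples)
```

A plain substring test is a crude stand-in for real RDF parsing, but it
shows the shape of the check: what was fetched from my data space,
minus what is cited back to it.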

If all the major media companies grok the above, there will be far less 
resistance to publishing linked data since they will actually have 
better comprehension of its inherent virtues and positive impact on 
their bottom line.

HTTP URIs are potent mediums of value exchange, and media companies
will come to understand that their crown jewels (think "data wine")
simply need to be served up in cyberspace via HTTP URIs instead of
"paper cups" or electronic renditions of "paper cups" (e.g. the
newspaper industry).
>> He also used the term: Transclusion, to describe what we commonly refer to
>> as: mashups (Web 2.0 code hacks) and meshups (Linked Data remixes), today.
> He'd probably argue differently, I've seen him speak and he's an
> interesting character! :). But yes, the essence is the same.
It's always important to apply temporal context to Ted's comments. The
guy grokked today's Linked Data meme circa 1965 (or slightly earlier).
He always espoused intertwingularity and hyper-orthogonality of data
based on the inherent irregularity of data structures across Data
Spaces / Dataspaces. Xanadu and ZigZag are profound visionary insights
that are doable today via applications of Linked Data and RDFa.

> L.



Kingsley Idehen	      Weblog:
President & CEO 
OpenLink Software     Web:

Received on Wednesday, 24 June 2009 15:06:40 UTC