Re: New LOD ESW wikipage about Data Licensing from Kingsley Idehen on 2010-09-28 (public-lod@w3.org from September 2010)

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Mon, 27 Sep 2010 21:44:24 -0400
To: Vasiliy Faronov <vfaronov@gmail.com>
CC: Marc Wick <marc@geonames.org>, public-lod@w3.org
Message-ID: <4CA14878.90606@openlinksw.com>

On 9/27/10 4:37 PM, Vasiliy Faronov wrote:
> Hi Marc,
>
>> It is the right of the data provider to determine what the data may be
>> used for.
> To an extent allowed by the copyright / database right law.
>
>> I would estimate for most datasets on the lod diagram you would not be
>> allowed to do what you describe. Most have some 'by' restriction so
>> you would at least have to give credit to the providers of the dataset
>> if you want to use it. Luckily this is a restriction pretty easy to
>> comply with.
> If your estimate is correct, then basically Google would be illegal,
> I figure. They gather data from the Web (even store it, which my
> hypothetical app doesn't), process it, and display it to the user in
> some form. They don't give any special credit, save for the links
> themselves.
>

Yep! And in these links lies "attribution". I think Google is a 
fantastic example of a strangely overlooked aspect of the LINK in Linked 
Data.

URI abstraction provides powerful branding, imprint, and attribution all 
in one. Original source URIs keep data providers in the value chain 
forever. This is why literal attribution is no good; it's also why 
protection should focus primarily on preserving data source imprints.

Example (I've given this in the past). The following forms of 
attribution aren't equivalent:

1. A Web page that provides information about London and displays a 
"Powered by DBpedia" notice
2. http://dbpedia.org/resource/London -- an Entity Name associated with a 
rich Structured Linked Data Source that provides access to data about 
London.
> RSS aggregators would be illegal, for the same reasons.

Yep!
> I believe this issue needs clarification. Licensing is a serious issue,
> but I don't think we can make it a "must have" for proper LD serving.
Correct!

Google has indexed and served information on the Web forever, ditto all 
the other search engines. The Web Resources in question happen to be 
HTML (semantically challenged at the data end but strong on the display 
and presentation side of things). In the case of the Blogosphere, we are 
looking at the same thing with XML based Web Resources (semantically 
challenged re. data aspect, but strong re. semantics for content 
structure that enabled separation from formatting etc).

> Imagine a "normal" web developer cautiously trying to add a bit of RDFa
> to their company's web site. Now we come along and tell them that they
> must also indicate a license. But they don't have a special license for
> their web content, as they have never needed it. Their reaction? They
> just abandon LD altogether.

A more important question: why did the organization in question publish 
content to the Web? If it wasn't meant to be accessed, then why bother?

The real issue publishers have is this: others taking their content and 
re-purposing it under new URIs without any reference to the actual 
origins of the content (and the data it carries).

If Google, Yahoo!, Bing, and all the others can aggregate and provide 
search services that cough up Web Resource URIs, the same applies to 
Linked Data.

The most important point (as I see it) is this: don't import data from a 
source and then rebrand it as yours by changing the URIs. That's simply 
wrong! Always refer back to your sources. That's basic practice, 
established in the real world for a very long time.
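
To make the point concrete, here is a minimal sketch of an import step 
that keeps the source URIs intact and records where the data came from. 
It assumes triples are plain Python tuples; the function name 
import_triples and the http://example.org/ document URI are hypothetical, 
and dcterms:source is just one reasonable choice of linking property.

```python
# Dublin Core term for "a related resource from which this one is derived".
DCTERMS_SOURCE = "http://purl.org/dc/terms/source"


def import_triples(triples, local_doc_uri):
    """Copy triples unchanged and link the local document to each source URI.

    triples: iterable of (subject, predicate, object) tuples.
    local_doc_uri: hypothetical URI of the document that republishes them.
    """
    triples = list(triples)
    # Keep every original triple as-is: the source URIs are not rewritten.
    imported = list(triples)
    # Add one back-reference per distinct source subject.
    for subject in sorted({s for s, _, _ in triples}):
        imported.append((local_doc_uri, DCTERMS_SOURCE, subject))
    return imported


source = [
    (
        "http://dbpedia.org/resource/London",
        "http://www.w3.org/2000/01/rdf-schema#label",
        "London",
    ),
]

for triple in import_triples(source, "http://example.org/data/cities"):
    print(triple)
```

The DBpedia URI survives the import untouched, so anyone who dereferences 
it lands back at the original source.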

-- 

Regards,

Kingsley Idehen
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen
Received on Tuesday, 28 September 2010 01:44:55 UTC