Re: attaching multiple licenses from Keith Alexander on 2009-12-09 (public-lod@w3.org from December 2009)

From: Keith Alexander <keithalexander@keithalexander.co.uk>
Date: Wed, 09 Dec 2009 11:44:46 -0000
To: "Toby Inkster" <tai@g5n.co.uk>, "Georgi Kobilarov" <georgi.kobilarov@gmx.de>
Cc: public-lod@w3.org
Message-ID: <op.u4n5kwho63ayaz@keith-alexanders-macbook.local>

On Mon, 07 Dec 2009 06:40:28 -0000, Toby Inkster <tai@g5n.co.uk> wrote:

>
> A better solution might be to publish your data in a format that can
> make use of multiple graphs. e.g. in N3:
>

> Unfortunately, most of the data formats with native support for named
> graphs do not have very good support in consuming software. But you can

Could one option be to use Toby and Kjetil's Named Graphs in RDFa (“RDFa
Quads”)[1]? Then at least the DOM (and perhaps the browser's rendering of
it) can make explicit what data comes from where, and how it is licensed,  
while regular RDFa triple parsers can still get at the triples unimpeded.  
You could attach a rights statement to the document explaining what you've  
done, and how subsequent consumers should take notice of the relevant  
licenses as appropriate.
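
To make the idea concrete, here is a rough sketch (not the RDFa Quads syntax itself, just the underlying model) of data where each statement carries a graph name, each graph has its own license, and a graph-unaware consumer still gets plain triples. Every URI, value and function name below is made up for illustration.

```python
# Hypothetical quads: (subject, predicate, object, graph).
# Every URI and value here is invented for illustration.
QUADS = [
    ("ex:thing", "ex:label", "Thing", "ex:graphA"),
    ("ex:thing", "ex:seeAlso", "ex:other", "ex:graphB"),
]

# Hypothetical license attached to each named graph.
GRAPH_LICENSE = {
    "ex:graphA": "ex:licenseA",
    "ex:graphB": "ex:licenseB",
}


def as_triples(quads):
    """What a graph-unaware consumer sees: the graph name is dropped."""
    return [(s, p, o) for s, p, o, _graph in quads]


def licenses_of(quads, triple):
    """Licenses of every graph that asserts the given triple."""
    return {GRAPH_LICENSE[g] for s, p, o, g in quads if (s, p, o) == triple}
```

A license-aware consumer would call something like licenses_of() before republishing a statement, while an ordinary triple parser only ever sees the output of as_triples().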

It doesn't make it easy for hypothetical use cases of license-sensitive   
consumption & republication, but I suppose the answer to this thorny  
problem is social (get data publishers to forgo placing these  
license-hurdles on their would-be data-consumers?) rather than technical?

It just seems that, if you were to produce some kind of dynamic mesh-up
application that consumed data from arbitrary sources, it would be quite
hard to design it to do the "right" thing with regard to
licenses/waivers/copyright.

Lots of linked data is published with no explicit rights statement - do  
they all need one? Do I need one for my foaf file?
Lots of data is published with licenses not applicable to data (only to
'creative works') - should the license be ignored, or the publisher's
intentions be respected?
If the intentions are to be respected, what (as Georgi is asking) are the
acceptable ways of republishing data merged from different sources with
different rights requirements?
I'm not sure that Tom's solution of not duplicating triples in the RDF view
is wholly sufficient, because the licenses still ask for attribution and
license reproduction, which still apply however you republish the data,
don't they? Also, as has already been said, the merging can involve useful
data cleaning/smushing/etc work that deserves republication.
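
As a minimal sketch of why de-duplicating triples does not make the obligations go away, assume each source is just a name, a license and a set of triples (the source names, license URIs and function names here are hypothetical). Merging can collapse identical triples, but the list of sources you owe attribution to stays the same.

```python
from collections import defaultdict


def merge_sources(sources):
    """Merge triples from several sources, remembering where each came from.

    `sources` maps a source name to a dict with "license" and "triples".
    Returns a dict mapping each distinct triple to the set of source names.
    """
    origins = defaultdict(set)
    for name, source in sources.items():
        for triple in source["triples"]:
            origins[triple].add(name)
    return dict(origins)


def obligations(origins, sources):
    """List the (source, license) pairs that the merged data draws on."""
    used = set().union(*origins.values()) if origins else set()
    return sorted((name, sources[name]["license"]) for name in used)


# Hypothetical input: two sources that overlap on one triple.
SOURCES = {
    "sourceA": {
        "license": "ex:attributionLicense",
        "triples": {("ex:a", "ex:p", "ex:b"), ("ex:a", "ex:q", "1")},
    },
    "sourceB": {
        "license": "ex:shareAlikeLicense",
        "triples": {("ex:a", "ex:p", "ex:b")},
    },
}

merged = merge_sources(SOURCES)
```

Here merged holds two distinct triples, yet obligations(merged, SOURCES) still names both sources and both licenses. Whether listing them this way would actually satisfy any given license is exactly the open question.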

Do licenses requiring attribution give any guidance about how that
attribution should be given in a mash-up/mesh-up scenario?

How SHOULD attribution be given? I have seen Kingsley argue a few times
that simply reusing a dbpedia URI is sufficient attribution to dbpedia.
While this is appealing, is this the intention of the data publishers (if
you asked wikipedia and/or dbpedia and/or freebase, would they agree)?
Moreover, doesn't this mean you HAVE to mint your own URIs if you want
attribution for data you are giving about something? (I.e., if I publish
some additional facts about dbpedia:Berlin and you republish them, the
dbpedia:Berlin URI does not give me any attribution.)
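
A toy illustration of that last point, assuming "attribution by URI" means crediting whoever owns the namespace of each URI in a triple. The prefix table, the keith: prefix, the ex:note predicate and the function name are all invented for the example.

```python
# Hypothetical mapping from URI prefix to the party that owns the namespace.
NAMESPACE_OWNER = {
    "dbpedia:": "dbpedia",
    "keith:": "Keith",
}


def credited_by_uri(triple):
    """Parties credited if reusing a URI counts as attribution to its owner."""
    return {
        owner
        for prefix, owner in NAMESPACE_OWNER.items()
        for term in triple
        if term.startswith(prefix)
    }


# A fact published by Keith about a dbpedia URI: only dbpedia is credited.
fact = ("dbpedia:Berlin", "ex:note", "some additional fact")

# The same fact restated with a URI Keith minted himself.
minted = ("keith:Berlin", "ex:note", "some additional fact")
```

Under this reading, credited_by_uri(fact) contains only dbpedia, and I only show up once the statement uses a URI in my own namespace, as in minted.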

All this confusion about how linked data can be used is really strange  
given that RDF is explicitly designed as a model for distributed data that  
can easily be merged. It wouldn't seem totally crazy to think the default  
assumption would be that if someone publishes RDF, they want people to  
reuse it and merge it with other RDF...

Anyway, I think that any publishers of linked data who want to place some
restrictions on how their data is reused should provide clearer
guidelines on the technical solutions to fulfilling those requirements.
Maybe if they thought about it and realised how confusing it all is,
they'd decide they didn't really want to restrict reuse that much in the
first place.

Keith

[1] http://buzzword.org.uk/2009/rdfa4/spec

Received on Wednesday, 9 December 2009 11:58:00 UTC