
Re: Comments (with use case)

From: Phil Archer <phila@w3.org>
Date: Thu, 5 Jan 2017 11:22:23 +0000
To: Hugh Glaser <hugh@glasers.org>, public-dwbp-comments@w3.org, POE Comment list <public-poe-comments@w3.org>
Message-ID: <20be81cb-6b4b-87d4-89db-935c78bf3d22@w3.org>
+ POE WG

Thanks again for posting this, Hugh.

Now that I've actually read and thought about what you've said here, I 
think DWBP covers the topic as far as it will. BP4 [1] simply stresses 
the importance of including licence info. In the NYT example, yes, 
different licence info applies to different subsets, but the high level 
advice in the DWBP doc still obtains.

*However*

I think your use case is very pertinent to the Permissions and 
Obligations Expression WG (hence adding the additional list). Their use 
case doc includes the kind of thing you're after - I hope. See 
http://w3c.github.io/poe/ucr/#POE.UC.06 and the following one (which is 
from the news industry).

As ever, I imagine that this will all come down to identifiers, subsets 
and, where relevant, named graphs, but the point about provenance and 
licensing being closely linked is well made.
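
To make the named-graph idea concrete, here is a minimal sketch, not anything taken from the DWBP or POE documents. It assumes the data is available as simple line-based N-Triples, and the two graph IRIs under example.org are hypothetical. It puts the owl:sameAs statements in one named graph and everything else in another, so that each graph IRI can carry its own licence and provenance statements.

```python
OWL_SAME_AS = "<http://www.w3.org/2002/07/owl#sameAs>"

# Hypothetical graph names, used here for illustration only.
LINKS_GRAPH = "http://example.org/graph/links"
CORE_GRAPH = "http://example.org/graph/core"


def to_quads(ntriples_lines):
    """Turn N-Triples lines into N-Quads lines, one named graph per subset.

    owl:sameAs statements go to LINKS_GRAPH, all others to CORE_GRAPH.
    Assumes one complete statement per line; blank lines and comment
    lines are skipped.
    """
    quads = []
    for raw in ntriples_lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Subject and predicate contain no whitespace, so two splits are
        # enough; the object (possibly a literal with spaces) stays intact.
        parts = line.split(None, 2)
        if len(parts) < 3 or not line.endswith("."):
            raise ValueError(f"not a complete N-Triples statement: {line!r}")
        graph = LINKS_GRAPH if parts[1] == OWL_SAME_AS else CORE_GRAPH
        # Drop the terminating "." and re-add it after the graph label.
        statement = line[:-1].rstrip()
        quads.append(f"{statement} <{graph}> .")
    return quads


sample = [
    "<http://example.org/a> <http://www.w3.org/2002/07/owl#sameAs> <http://example.com/b> .",
    '<http://example.org/a> <http://example.org/name> "A. B." .',
]
for quad in to_quads(sample):
    print(quad)
```

A harvester that only wants the linkage could then take the links graph on its own, and a permissive licence could be stated against that graph IRI while the core graph keeps its more restrictive terms.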

Cheers

Phil

[1] https://www.w3.org/TR/dwbp/#DataLicense

On 29/12/2016 11:32, Hugh Glaser wrote:
> As suggested by Phil Archer (when I posted this to public-LOD), I am reposting here.
> (I read Hadley's Facebook post to mean that only W3C members could comment now.)
>
> Repost:
> https://www.w3.org/TR/dwbp/
>
> Hi.
> I have just seen a reference to this on Facebook, posted by Hadley - many thanks.
>
> I guess it is all too late (sorry!), but thought I would raise one issue, in case someone here feels they can take it up.
> And it is sort of interesting for this list.
>
> As far as I can see (really sorry if I have missed it), there is no suggestion of splitting datasets for licence purposes.
> There is a bit on it in BP18 for different users and use cases.
>
> The use case I am thinking about is the NYT (New York Times) LD release, all those years ago.
> There was a bunch of data they had made into LD, and wanted to make it public; they also wanted to make the links that they had established to other datasets public.
> So they gathered it all together, and put it in one dataset, with the appropriate licence, etc.
> This would conform (if they did some more) with the Best Practices here.
>
> However, this is probably not the best thing for them.
> The basic dataset that they wanted to publish came with a bunch of licence restrictions - it is in some sense their treasure map, and they don't want to lose control of it.
> The linkage, on the other hand, is exactly the stuff they want people to take away and do whatever they like with - after all, it is the very information that people need to find their data in the dataset; in SEO terms, it is driving traffic to their site.
>
> (In my case, in very practical terms, I want to be able to harvest the owl:sameAs triples and put them in sameAs.org, safe in the knowledge that I am not violating any conditions.
> And, I think, the NYT very much wants me to do that, so that their dataset gets found.)
> In addition, in a related issue about splitting datasets, the provenance of the linkage is actually usually quite different from the provenance of the dataset. It may be that the linkage is the result of an intern spending the summer doing some work, whereas the rest of the dataset is in fact the result of decades of work (as was the case of the NYT).
>
> DBpedia very helpfully splits out this sort of data - not for licence reasons, I think (at least at the moment, although it might be the case that there should be different licences), but for convenience, with a very large dataset.
>
> An additional use case:
> Many of the libraries of the world are making their catalogue subject data available. They have also established links between their catalogue and other catalogues. Using these links, I was able to build http://sameas.org/store/kelle/ , which enables the closures of quite a few of the catalogue equivalences.
> The libraries were all very happy to give me this linkage information - had this information been bundled up with the catalogue data, the process of allowing me free use would have been much more problematic, and indeed I might not have got any data.
>
> So, is there any scope for comments somewhere about this?
> I think it would be great if the idea of providing linkage with a separate licence (even if it is in the same physical distribution of the dataset) could be included.
>
> Best, and season's greetings to you all.
>
> Hugh
>

-- 


Phil Archer
Data Strategist, W3C
http://www.w3.org/

http://philarcher.org
+44 (0)7887 767755
@philarcher1
Received on Thursday, 5 January 2017 11:22:33 UTC
