Re: Recommendations: specificity from Antoine Isaac on 2011-03-30 (public-lld@w3.org from March 2011)

From: Antoine Isaac <aisaac@few.vu.nl>
Date: Wed, 30 Mar 2011 15:51:40 +0200
To: public-lld <public-lld@w3.org>
Message-ID: <4D93356C.7070309@few.vu.nl>
>> We need to persuade publishers of vocabularies in our sector that the advent of LD brings
>> with it a responsibility to re-publish in LD format, so that their users can get LD value from
>> all the investment those users have made in using that vocabulary. This doesn't need to
>> mean that the publisher loses all control over their investment.
>
> Somewhat oddly, I suspect that one could make an argument that minting URIs and publishing Linked Data is actually a way of protecting the publishers' investment (because, otherwise, publishers will see their vocabularies begin to surface at multiple other places on the web, some of which will likely gain some traction).
>
> Not sure how easy it will be to persuade people of this though ;-)


That's interesting, Andy, and I think it fits with Richard's reminder that data publishers don't need to publish all the data behind their URIs. A recommendation for them (in relation to the "a discussion about open data and rights" header on the wiki page) could be to work out how to separate their simpler data, which can be fully open, from their more complex data, which could be protected.

The balance will be more difficult to determine than it was when e.g. Getty took the decision to make AAT pages freely accessible on the web, as in the new situation we have data on both sides of the line: open data on the one hand and protected data on the other. But it seems an important issue to tackle if they want to protect their investment, as you say :-)

Antoine


>
> Andy
>
> --
> Andy Powell
> Research Programme Director
> Eduserv
> t: 01225 474319
> m: 07989 476710
> twitter: @andypowe11
> blog: efoundations.typepad.com
>
> www.eduserv.org.uk
>
>
> -----Original Message-----
> From: public-lld-request@w3.org [mailto:public-lld-request@w3.org] On Behalf Of Richard Light
> Sent: 30 March 2011 11:21
> To: Karen Coyle
> Cc: public-lld
> Subject: Re: Recommendations: specificity
>
> In message <20110329100006.202654whce01ua6u@kcoyle.net>, Karen Coyle <kcoyle@kcoyle.net> writes
>>
>> LCSH is done, but Dewey is only available on a limited basis because
>> there are contractual constraints. Aside from that, though, one of the
>> issues that I see here is that many of these vocabularies are "owned"
>> by single institutions and therefore we are dependent on those
>> institutions to issue them in RDF. Out of some frustration about this,
>> both Ross Singer and I have independently done some work on MARC
>> vocabularies. And look at what has happened with FRBR, which was not
>> provided by its "creator" body until many years after others had done
>> so. This is not a rant against those institutions but a real problem
>> that we need to deal with. Can we find a way to "communalize" more of
>> these vocabularies so that they can be converted in a more agile manner?
>
> I agree it's a hard problem.  In my sector, the Getty vocabularies (AAT, ULAN, TGN) are another case in point.  You have an economic model where users have been willing to pay to use a curated vocabulary in their data, and the publishers of that vocabulary protected their investment in it by limiting access.  All perfectly reasonable.
>
> Now these users want to publish their data as LD. Either they publish data from these vocabularies as strings (losing any LD benefit) or invent their own URL pattern (thereby creating a mini-silo).
>
> We need to persuade publishers of vocabularies in our sector that the advent of LD brings with it a responsibility to re-publish in LD format, so that their users can get LD value from all the investment those users have made in using that vocabulary.
>
> This doesn't need to mean that the publisher loses all control over their investment.  If each term/concept in a vocabulary is published as a separate "slash URL", it is unlikely that the whole vocabulary would be pirated from its LD representation.  Also, the RDF which is published doesn't need to contain every detail which is offered to paying customers: the key requirement is to have a published URL for each concept.
>
> There is a related issue to consider once vocabularies are available as LD, which is whether to use URLs which reflect the term in the vocabulary, or to base them on the term's identifier within the vocabulary, i.e.:
>
> http://mygetty.org/aat/300011666
> or
> http://mygetty.org/aat/Alberene_stone
>
> I have found that there is a strong instinctive preference for "human-friendly" URLs, and of course this is what will typically be in users' data. However, it is arguable that they will actually be better served by the "meaningless" identifier (so long as it can easily be dereferenced, and the human-friendly info retrieved as required).
> Geonames is a good example where we manage to get along without "meaningful" URLs.  Another point for the report, maybe?
>
> Richard
> --
> Richard Light
>
>
Received on Wednesday, 30 March 2011 13:50:43 UTC
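
The two URL patterns Richard contrasts, and his point that the published RDF need not carry every detail, can be illustrated with a rough Python sketch. Everything here is an assumption made for illustration: the base URI is the placeholder from the thread, the in-memory VOCAB dictionary and its field names are hypothetical, and the sketch assumes (as the thread's example suggests) that 300011666 and "Alberene stone" denote the same concept.

```python
from urllib.parse import quote

# Placeholder base URI taken from the example URLs in the thread.
BASE = "http://mygetty.org/aat/"

# Hypothetical in-memory vocabulary: identifier -> full record.
# Field names and the note text are illustrative, not real AAT data.
VOCAB = {
    "300011666": {
        "prefLabel": "Alberene stone",
        "scopeNote": "(detail reserved for paying customers)",
    },
}


def id_uri(concept_id: str) -> str:
    """Mint the opaque, identifier-based URI for a concept."""
    return BASE + concept_id


def label_uri(label: str) -> str:
    """Mint the human-friendly URI derived from the concept's label."""
    return BASE + quote(label.replace(" ", "_"))


def public_description(concept_id: str) -> dict:
    """Return only what is openly published: the URI and the label."""
    record = VOCAB[concept_id]
    return {"uri": id_uri(concept_id), "prefLabel": record["prefLabel"]}


if __name__ == "__main__":
    print(id_uri("300011666"))
    print(label_uri(VOCAB["300011666"]["prefLabel"]))
    print(public_description("300011666"))
```

One reason to prefer the identifier-based form in a sketch like this: if the preferred label is later corrected or changed, label_uri() yields a different URI, whereas id_uri() stays the same and the human-friendly label can still be retrieved by dereferencing it.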