Re: LLD@CKAN from Antoine Isaac on 2011-05-04 (public-lld@w3.org from May 2011)

From: Antoine Isaac <aisaac@few.vu.nl>
Date: Wed, 04 May 2011 11:03:10 +0200
To: public-lld <public-lld@w3.org>
Message-ID: <4DC1164E.9080507@few.vu.nl>
Hi Tom,


> On Thu, Apr 28, 2011 at 4:33 AM, Antoine Isaac <aisaac@few.vu.nl> wrote:
>
>> More and more datasets relevant to our community are being released as
>> linked-data. For example, Ed has recently published a great blog post which
>> points at a couple of important, if not crucial ones [1].
>>
>> As some of you know already, our group has started a CKAN group on library
>> linked data [2]. The hope is to help anyone find relevant datasets and
>> accompanying descriptions, which may help them find their way into the
>> growing LLD space.
>>
>> We found CKAN to be a really simple tool to use--see the instructions for
>> "contributing" a dataset at [3]. It's also really fit for the purpose: the
>> description of any LLD dataset on CKAN can be pulled for updating the LOD cloud
>> [4] (if tagged with "lod"), or for any more specific purpose [5].
>>
>> We will try to highlight the value of such a dataset inventory in our
>> deliverables. But it is crucial that the community also starts using it now,
>> for the benefit of all--newcomers and experts alike. We can't do all the
>> work, especially if our group ceases to exist as such soon.
>>
>> And remember, no one needs to "own" a dataset to publish a description for
>> it! CKAN is really open to anyone.
>
> CKAN is a great tool.  As I understand it, CKAN "groups" are curated,
> as opposed to "tags" which anyone can add.
>
> What makes a data set at CKAN an LLD data set?  Is it the source (ie
> it comes from a brick & mortar library), the vocabulary used, the
> percentage of "library" data as opposed to other data (thus
> disqualifying Freebase or DBpedia) or some other criteria?


Well, the selection process is based on the intuition of the group's administrators [1]. Note that anyone can become an administrator by asking the current ones, as described at [2].

Also, we don't have specific criteria for ruling datasets out. A dataset must be very relevant for brick-and-mortar libraries, but that does not mean that it has to be hosted by them, for example. In particular, I don't see why we would exclude Freebase as it stands now:
- it has a huge amount of book descriptions in it (perhaps more than many "pure" library linked data services would have)
- these descriptions come from a library environment
- if Owen uses it as an example of bibliographic data, there's no reason we would not :-)

So I've just added it. It would be better, though, if CKAN allowed us to create some "group-specific annotations" that would explain why a dataset is relevant for LLD. The stats in your mail at [3] would have been perfect!


> One of the things that it seems like will become apparent over time is
> that there is really very little pure "library" data. For example, the
> vast majority of books aren't written by some abstract writerly
> author, but by an astronomer or politician or cook or architect who
> also writes on occasion.
>
> Libraries need to be looking outward for this type of information to
> link to, not inwards.


We are well aware of this. There have been very long threads about the issue on this list over the past months, and a lot of discussion on whether native library authority data should be modeled as real-world resources or via "indirections" such as SKOS concepts, and on how to do it.
But I think there was general agreement that any solution should be open and should try to link/interoperate with other bodies of knowledge on the things referred to in library catalogues.

Best,

Antoine

[1] http://ckan.net/group/lld
[2] http://www.w3.org/wiki/TaskForces/CommunityProjects/LinkedLibraryData/Datasets/CKANmetainformation
[3] http://lists.w3.org/Archives/Public/public-lld/2011May/0018.html
Received on Wednesday, 4 May 2011 09:01:18 UTC