Vocabularies and datasets in LLD deliverables -- a proposal and call from Antoine Isaac on 2011-03-02 (public-xg-lld@w3.org from March 2011)

From: Antoine Isaac <aisaac@few.vu.nl>
Date: Wed, 02 Mar 2011 16:33:00 +0100
To: public-xg-lld <public-xg-lld@w3.org>
Message-ID: <4D6E632C.2000101@few.vu.nl>

Hi everyone,

On this:
ACTION: Antoine and jeff_ to make a proposal to the group about vocabularies and datasets [recorded in http://www.w3.org/2005/Incubator/lld/minutes/2011/02/24-lld-minutes.html#action08]

Here's what we gather from the discussion we had (freely pasting stuff from the minutes) on the vocabularies and datasets.

One big issue is that datasets are too volatile to have a stable reference list in the final report.
Yet there is value in having some representative sample, in terms of both contributing (or consuming) institutions and use cases.
Also, the group has been involved in setting up a CKAN-based infrastructure for gathering available datasets, which will still be relevant after the group's life--at least that's what we hope for and should work for. Ideally, it should be easy for the creators of any future LOD cloud--and therefore for the readers of such a picture--to find the "library zone" there.
Any inventory would greatly help promote re-use of vocabularies and connection to datasets. These are of course key points of LD, and I think the LLD group should make that case for the library domain. In particular, we already have quite a few metadata element sets in our list; re-using such vocabularies is a core priority, so a snapshot of them should definitely bring a meaningful quick win. And producing a snapshot of the CKAN list for datasets with very brief comments should be straightforward anyway.

Trying to address all this, our proposal would be:

==========
1. a section for the final report that:
- starts with a reference to the LOD cloud, saying that the section and the side deliverable will help a (library) reader facing that picture to answer questions like "what's in there for me?" and "how may I contribute?". This holds both now (we have expertise on stuff the cloud does not capture, especially metadata element sets) and later: the cloud evolves and we can only produce a snapshot, but we do provide useful pointers.
- tries to gather representative vocs and datasets, starting from use cases. As [1] was created from the use cases, I suppose that it would be a sub-set of the elements there, as far as vocs are concerned.
- identifies gaps, starting from use cases. If not redundant with the "problems and limitations" section, there could be a discussion of the organizational issues around LLD vocabulary and dataset publication and management (e.g., alignment), as penciled in [2] and the various topic lists created since.
- briefly presents the work in progress *at the time of the report* that can fill the gaps, to tell people what to expect in the near future.

This section would naturally lead to recommendations, but these would be part of another section of the report.

2. a separate "LLD Vocabularies and datasets" deliverable which:
- presents an organized and commented (one line per item) snapshot of the contents of [1] and [2]--still linking to use cases when applicable.
- introduces the CKAN LLD group [3] as a way to continue the vocabulary gathering effort for anyone interested, both information consumers and information publishers. In particular, after reading this section the publisher of a new dataset should be convinced that it's easy and worthwhile to advertise it on CKAN.

All this should be a bit flexible of course. If, during writing, we realize that some items from the section take too much space--especially the representative vocabularies and/or datasets--then we might move them to the side deliverable.

Also, a side piece of work to this is to make sure our LLD group is *really* connected to the official LOD cloud. Perhaps one of us could liaise with Richard and Anja to be sure they have all they need from the LLD group datasets, and ask them to produce a specific LLD sub-cloud. If this can be done just by pushing a button, it would be worth it. Perhaps the code they use to produce the cloud is even freely available for us to use...

==========

We hope this makes sense to you all.
Also, we would like to *call for contributors to this work*. We have volunteered to set it up, but are willing to share it, especially where our experience is less obvious. We especially feel that Ross, William, Bernard and Marcia would be ideal candidates!
We are even willing to step down as current "owners", if someone wants to get more involved :-)

Best,

Antoine and Jeff

[1] http://www.w3.org/2005/Incubator/lld/wiki/Vocabularies
[2] http://www.w3.org/2005/Incubator/lld/wiki/Vocabularies#Vocabulary_discussion_in_Pittsburgh
[3] http://ckan.net/group/lld

Received on Wednesday, 2 March 2011 15:32:35 UTC