Re: review of Vocabularies and Datasets Section from Antoine Isaac on 2011-06-09 (public-xg-lld@w3.org from June 2011)

From: Antoine Isaac <aisaac@few.vu.nl>
Date: Thu, 09 Jun 2011 19:35:02 +0200
To: public-lld@w3.org, public-xg-lld <public-xg-lld@w3.org>
Message-ID: <4DF10446.60907@few.vu.nl>

Hi Ed,

[Sorry for the delay. LOD-LAM has kept us quite busy over the past days...]

Thanks a lot for the review, for the very kind words as well as the problem spotting. Your review is *very* useful!

On the first issue, Value Vocabulary vs. Datasets: that's indeed a hard one, and I'm still fighting against the original option I preferred, which was just to consider (and thus call) value vocabularies "reference datasets", whose main differentiating criterion would be that they are re-used more widely than the rest of datasets, and are designed accordingly...


Anyway, acknowledging the non-exclusivity is a first answer. In fact it was in the original definitions (in the "confusions"), but we felt these were a bit long...
We have thus added a small note in the section:
http://www.w3.org/2005/Incubator/lld/wiki/index.php?title=Draft_Vocabularies_Datasets_Section&diff=5054&oldid=5053
Note that the section refers to the side deliverable, which refers to [1]. A curious reader could investigate this. The reference to the side deliverable is crucial, as it has examples of objects which are in different categories. We hope that is enough!


On re-using the distinction between A-Box and T-Box: this is a tempting option, but we wonder whether it would be beneficial in the end.
First, there is the inherent complexity of talking about A-Box and T-Box, as observed by Karen.
Second, there's still the issue of non-exclusivity: a first move is to consider metadata element sets to fall in the scope of T-Boxes, and value vocabularies and other datasets in the scope of A-Boxes. We ourselves find this line reasonable, but we're afraid many people will find it debatable when they realize that some of our beasts are both value vocabularies and metadata element sets.
Not that the formal theory does not take that into account, on the contrary--OWL 2's punning is dedicated to cases like that. It just brings us into depths that we may really not want to dive into...

Cheers,

The editors.

[1] http://www.w3.org/2001/sw/wiki/Library_terminology_informally_explained#Definitions



> I took an action [1] to do a quick-ish review of the wiki drafts of the
> Vocabularies and Datasets Section bound for the final report, as well
> as the separate Vocabulary and Dataset deliverable.
>
> In general I think these two documents are really excellent, and are
> ready to be circulated more widely for comments. Indeed, if you are
> reading this and have some comments on the documents I think now would
> be a good time. The comprehensive overview of the vocabularies that
> draws on our case studies is very impressive--a lot of work must have
> gone into it. And the way that you summarized with the Observations
> section is very well done as well.
>
> While I understand the distinction between Element Set, Value
> Vocabulary and Dataset, I was a bit confused because both the Value
> Vocabulary and Dataset examples use authors:
>
> """
> VIAF defines authorities
> """
>
> and:
>
> """
> the same dataset may contain records for authors as first-class
> entities that are linked from their book, described with elements like
> "name" from FOAF
> """
>
> Is it the case that something like VIAF is both a value vocabulary and
> a dataset? Is it worth adding a sentence about how the categories are
> not mutually exclusive? Or perhaps we should not talk about Datasets
> at all? Also, did we decide not to ground our definition in terms of
> TBOX and ABOX?
>
> In the Linking section, does it make sense to mention VIAF as a good
> example of a library project that creates links between library
> resources? I think the cultural heritage sector needs to be encouraged
> to share more information (in the form of articles, blog posts, etc)
> about linking strategies, such as what OCLC have used to link VIAF
> resources to Wikipedia, or Open Library's efforts to link to
> worldcat.org. Also I think this section would be a good place to
> highlight services such as Google Refine's Reconciliation Service [4]
> and the LOD2's Silk Framework. It would be good if the section
> emphasized the need for our community to gain experience using them,
> sharing linking results, and building more tools that are suited to
> our environment.
>
> I also have a few comments about the separate Vocabulary and Datasets
> deliverable:
>
> I see Crossref's DOI mentioned in the auto-generated graph, but should
> we mention CrossRef's DOI service explicitly? [5]. It is a big
> development for linked data for scholarly research.
>
> Another recent development is that the Archipel project (mentioned in
> the report) have published a PREMIS vocabulary [6] which is
> significant for the digital preservation community. I don't know if
> this will lead to something more formal from the PREMIS folks
> themselves, but it is a good sign of things to come.
>
> Should we include the LOCAH project's RDF vocabulary for archival
> information [7]? I know that LOCAH are mentioned in the EAD section,
> but Pete Johnston (one of the key folks behind Dublin Core) & co have
> spent a bit of time thinking about how to model archival data in RDF.
> Also, Aaron Rubinstein has a lightweight vocabulary for expressing
> Archival information which he calls Arch [8].
>
> Really nice work!
> //Ed
>
> [1] http://www.w3.org/2005/Incubator/lld/minutes/2011/05/19-lld-minutes.html#action08
> [2] http://www.w3.org/2005/Incubator/lld/wiki/Draft_Vocabularies_Datasets_Section
> [3] http://www.w3.org/2005/Incubator/lld/wiki/Vocabulary_and_Dataset
> [4] http://code.google.com/p/google-refine/wiki/ReconciliationServiceApi
> [5] http://lod2.eu/Project/Silk.html
> [5] http://www.crossref.org/CrossTech/2011/04/content_negotiation_for_crossr.html
> [6] http://multimedialab.elis.ugent.be/ontologies/PREMIS2.0/v1.0/premis.owl
> [7] http://data.archiveshub.ac.uk/def/
> [8] http://purl.org/archival/vocab/arch
>
Received on Thursday, 9 June 2011 17:32:38 UTC