Re: Comments on DCAT: vocabulary scope

Hello,

Thanks for the feedback, and sorry it took me so long to reply.
Based on the feedback I received from you and others, I have updated the document with a definition of a DCAT dataset and a brief description of how APIs (and services) can be described in DCAT.

Some further replies inline...

On 7 Apr 2013, at 21:34, Andrea Perego <andrea.perego@jrc.ec.europa.eu> wrote:

> I would like to ask the GLD WG for a clarification concerning the actual scope of DCAT.
> 
> Basically, the question concerns whether DCAT can be used to describe 
> 1. catalogues of a specific type of information resources, namely, datasets, or
> 2. catalogues of any type of information resources (e.g., datasets, documents, data models, vocabularies, thesauri, code lists, audio and video files, software, services).
> 
DCAT is meant to describe datasets. DCAT's definition of dataset is:
"a collection of data, published or curated by a single agent, and available for access or download in one or more formats"

Given this definition, most of the examples in the second item can be considered datasets. This includes data models, vocabularies, thesauri, and code lists, but not software, audio files, and the like.

> I have always thought that the right option was the former one, and this seems to be confirmed by the definitions and terminology used in the DCAT spec. However, I had some concerns when I realised that ADMS has recently been defined as a DCAT profile for semantic assets [1]. Based on this, I wondered whether the DCAT notion of "dataset" was broader than the "common" one.
> 

Semantic assets in ADMS are a particular type of dataset. ADMS is defined as a DCAT profile in order to provide more specific properties that apply only to semantic assets, and to impose further requirements on properties and their ranges that are not part of DCAT.
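
To illustrate the "particular type of dataset" relationship, here is a minimal sketch in Python that treats it as a subclass link and checks that a semantic asset also counts as a dcat:Dataset. The class name adms:SemanticAsset, the resource ex:my-code-list, and the helper functions are assumptions made for illustration only; the exact class names should be taken from the ADMS draft [1].

```python
# Triples are plain (subject, predicate, object) tuples of prefixed names.
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

triples = {
    # Assumption: the ADMS profile declares its asset class as a subclass of dcat:Dataset.
    ("adms:SemanticAsset", SUBCLASS_OF, "dcat:Dataset"),
    # Hypothetical resource: a code list published as a semantic asset.
    ("ex:my-code-list", RDF_TYPE, "adms:SemanticAsset"),
}


def superclasses(cls, triples):
    """Return every class reachable from cls via rdfs:subClassOf."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for s, p, o in triples:
            if s == current and p == SUBCLASS_OF and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen


def is_instance_of(resource, cls, triples):
    """True if resource is typed as cls directly or through a subclass."""
    direct_types = {o for s, p, o in triples if s == resource and p == RDF_TYPE}
    return any(t == cls or cls in superclasses(t, triples) for t in direct_types)


print(is_instance_of("ex:my-code-list", "dcat:Dataset", triples))  # prints True
```

Under these assumptions, anything described with the profile remains usable by a consumer that only understands plain DCAT.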

> My question is also about how DCAT can be actually used in existing catalogues, e.g., those providing access to government resources. Although the majority of them are just about datasets, several examples are available of gov portals of other types of information resources (e.g., software re-usable by Public Administrations) or even of heterogeneous types of information resources. An example of the latter is the INSPIRE Geoportal [2], which provides a single access point for geospatial datasets, dataset series, and services of EU Member States.
> 

DCAT cannot describe software per se. However, APIs that provide access to data can be described in DCAT (as instances of dcat:Distribution). I added an explicit paragraph to the vocabulary overview stating this:
https://dvcs.w3.org/hg/gld/raw-file/default/dcat/index.html#overview
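
As a rough illustration of describing an API as a dcat:Distribution, the following Python sketch emits a small Turtle description. All ex: names, the title, and the example.org URL are hypothetical placeholders, and the helper function is not part of any library; only dcat:Dataset, dcat:Distribution, dcat:distribution, dcat:accessURL and dct:title are taken from the DCAT and Dublin Core vocabularies.

```python
# Namespace prefixes; the ex: namespace is a placeholder, not a real catalogue.
PREFIXES = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
    "ex": "http://example.org/",
}


def describe_api_distribution(dataset, distribution, title, access_url):
    """Return Turtle text that describes an API as a distribution of a dataset."""
    lines = [f"@prefix {prefix}: <{uri}> ." for prefix, uri in PREFIXES.items()]
    lines += [
        "",
        f"{dataset} a dcat:Dataset ;",
        f"    dcat:distribution {distribution} .",
        "",
        f"{distribution} a dcat:Distribution ;",
        f'    dct:title "{title}" ;',
        f"    dcat:accessURL <{access_url}> .",
    ]
    return "\n".join(lines)


turtle = describe_api_distribution(
    dataset="ex:stations",                  # hypothetical dataset
    distribution="ex:stations-api",         # hypothetical API distribution
    title="Stations query API",
    access_url="http://example.org/api/stations",
)
print(turtle)
```

The dataset itself stays a dcat:Dataset; the API is just one of the ways to access it, alongside any downloadable files.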

> In both the cases above, if DCAT is just about datasets, it could not be used to describe the catalogue, the totality of the resources it gives access to, and the corresponding distributions. Actually, in INSPIRE, DCAT could be used to describe datasets, maybe also dataset series, and their distributions, but neither services nor their distributions, and not the catalogue itself.
> 
> As a consequence, specific vocabularies should be defined to denote catalogues and distributions of resource types different from datasets. And this would not help interoperability.
> 
> So, I wonder whether the GLD WG would consider making dcat:Catalog and dcat:Distribution more generic, namely, a catalog / distribution of any type of information resources, and not just of datasets.
> 

IMHO, this is out of the scope of DCAT. While this means that DCAT can't describe software and some other types of information resources, I believe that trying to address the very general notion of an information resource would compromise DCAT's focused scope, minimality, and ease of use.

Best regards,
Fadi

> If this is already foreseen, i.e., if DCAT is for catalogues of any type of information resources, I would suggest making this explicit in the spec. It would be also useful to have an additional class, denoting information resources available in a catalogue, and to define dcat:Dataset a subclass of it.
> 
> Thanks in advance.
> 
> Andrea
> 
> ----
> [1]https://dvcs.w3.org/hg/gld/raw-file/default/adms/index.html
> [2]http://inspire-geoportal.ec.europa.eu/
> 
> -- 
> Andrea Perego, Ph.D.
> European Commission DG JRC
> Institute for Environment & Sustainability
> Unit H06 - Digital Earth & Reference Data
> Via E. Fermi, 2749 - TP 262
> 21027 Ispra VA, Italy
> 
> DE+RD Unit: http://ies.jrc.ec.europa.eu/DE
> 
> ----
> The views expressed are purely those of the writer and may
> not in any circumstances be regarded as stating an official
> position of the European Commission.

Received on Tuesday, 25 June 2013 08:23:38 UTC