RE: W3C Data Exchange Working Group: Invitation to review draft revision of DCAT

To ease tracking, note that the comments raised by Clemens led to the creation of two issues in the GitHub repository of the Dataset Exchange Working Group:
 
Data catalogues should not treat services as first class citizens:
https://github.com/w3c/dxwg/issues/530
 
Profiles and distributions:
https://github.com/w3c/dxwg/issues/531
 
 
From: Clemens Portele <portele@interactive-instruments.de> 
Sent: Wednesday, October 24, 2018 5:18 AM
To: public-sdwig@w3.org
Subject: Fwd: W3C Data Exchange Working Group: Invitation to review draft revision of DCAT
 
Forgot to cc the SDWIG...



Begin forwarded message:
 
From: Clemens Portele <portele@interactive-instruments.de>
Subject: Re: W3C Data Exchange Working Group: Invitation to review draft revision of DCAT
Date: 24. October 2018 at 04:59:46 CEST
To: <public-dxwg-comments@w3.org>
Resent-From: <public-dxwg-comments@w3.org>
 
Dear DXWG,
 
Thank you for your invitation to comment. I want to express concerns about the direction of DCAT with respect to distributions and services. The rationale for making services first-class citizens in data catalogs is not clear to me. In this area, the model of DCAT 1.0 is, in my view, clearer and more useful from an end-user perspective. The items registered in a *data* catalog should be restricted to data (that is, datasets, unless you register individual items). As a user I go to a data catalog looking for datasets, not services; once I have found some candidate datasets, I want to understand how to access the data, whether through a file download, web pages, or an API. This relationship is also described in the Data on the Web Best Practices (https://www.w3.org/TR/dwbp/#context).
 
DCAT 1.0 does not provide sufficient detail or guidance for documenting the different kinds of distributions. A downloadable file is straightforward, but an API could benefit from more information than DCAT 1.0 supports. From the UCR document I expected that the revision of DCAT would improve Distribution instead of adding Services as additional, separate resources to the mix. The separation between Distribution and Service looks blurred to me, with overlap and redundancies in their properties. Please reconsider this design decision.
 
In my view, the current direction is repeating some of the mistakes that the spatial data community made in their standards.
 
I have a few additional questions that are related to this and also related to the topic of profiles:
 
Say I have a dataset of buildings that is made accessible according to two different profiles (e.g. two different XML schemas or two different JSON schemas). The two profiles use different vocabularies and there are differences in the content. However, both representations are sourced from the same data. To me this would be a single dataset. However, this is not that clear in DCAT 1.0, and one could also take the view that these are two different datasets with separate dataset metadata. I know of cases where this has been represented as two datasets in catalogs. The new DCAT draft adds language about a dataset as "a single conceptual entity", which seems to support the view that there is a single dataset in this case. Could guidance be included in the revision to support more consistent implementations, maybe just an example for such a case?
 
Assuming this would be considered one dataset: if both profiles were served through the same API (or service) and profile negotiation were used, would this be one distribution (since it is a single API) or two distributions (one per profile, but with the same accessURL)?
 
Currently you can only specify the media type of a distribution. Considering the work on profiles and profile negotiation in the DXWG, wouldn’t it make sense to be able to specify in DCAT the profile(s) that a distribution supports?
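
To make the two readings concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the class names, the `conforms_to` field, and all example.org URLs are hypothetical and are not terms taken from DCAT 1.0 or the draft revision.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

# Hypothetical identifiers, invented for this sketch only.
API_URL = "https://example.org/api/buildings"
PROFILE_A = "https://example.org/profiles/buildings-a"
PROFILE_B = "https://example.org/profiles/buildings-b"


@dataclass
class Distribution:
    access_url: str
    media_type: str
    # Assumption: a distribution may list the profiles it supports.
    conforms_to: List[str] = field(default_factory=list)


@dataclass
class Dataset:
    title: str
    distributions: List[Distribution] = field(default_factory=list)


# Reading 1: one distribution (a single API) that supports both profiles.
one_distribution = Dataset(
    "Buildings",
    [Distribution(API_URL, "application/xml", [PROFILE_A, PROFILE_B])],
)

# Reading 2: two distributions, one per profile, sharing the same access URL.
two_distributions = Dataset(
    "Buildings",
    [
        Distribution(API_URL, "application/xml", [PROFILE_A]),
        Distribution(API_URL, "application/xml", [PROFILE_B]),
    ],
)


def profiles_by_access_url(dataset: Dataset) -> Dict[str, Set[str]]:
    """Group the supported profiles of a dataset by access URL."""
    grouped: Dict[str, Set[str]] = {}
    for dist in dataset.distributions:
        grouped.setdefault(dist.access_url, set()).update(dist.conforms_to)
    return grouped
```

In this sketch both readings describe a single dataset and yield the same set of profiles per access URL; they differ only in how many distribution records a catalog would hold, which is exactly the point where guidance would help.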
 
Thank you for your consideration,
best regards,
Clemens
 
-- 
Clemens Portele
portele@interactive-instruments.de
+49 228 9141073 (office)
+49 151 15298497 (mobile)
cportele (skype)

interactive instruments Gesellschaft für Software-Entwicklung mbH
Trierer Str. 70-72, 53115 Bonn, Germany
Geschäftsführer: Remigius Koblenzer, Clemens Portele
Amtsgericht Bonn, HRB 3872
 
 

Received on Monday, 5 November 2018 15:20:14 UTC