Re: [sdw] Evaluate DCAT changes in terms of impacts to best practices from Michael Gordon via GitHub on 2018-10-24 (public-sdwig@w3.org from October 2018)

From: Michael Gordon via GitHub <sysbot+gh@w3.org>
Date: Wed, 24 Oct 2018 11:57:41 +0000
To: public-sdwig@w3.org
Message-ID: <issue_comment.created-432626129-1540382260-sysbot+gh@w3.org>

Comments from @cportele to public DXWG mailing list:

"I want to express concerns about the direction of DCAT with respect to distributions and services. The rationale for making services first class citizens in data catalogs is not clear to me. The model of DCAT 1.0 is in this area, in my view, clearer and more useful from an end user perspective. The items registered in a *data* catalog should be restricted to data (that is: datasets - unless you register individual items). As a user I go to a data catalog looking for datasets, not services; once I have found some candidate datasets then I want to understand how to access the data, whether that is a file download, through webpages or an API. This relationship is described in the Data on the Web Best Practices (https://www.w3.org/TR/dwbp/#context), too.

DCAT 1.0 does not provide sufficient detail or guidance for documenting the different kinds of distributions. A downloadable file is straightforward, but an API could benefit from more information than what DCAT 1.0 supports. From the UCR document I expected that the revision of DCAT would improve Distribution instead of adding Services as additional, separate resources in the mix. The separation between Distribution and Service looks blurred to me, with overlap and redundancies in their properties. Please reconsider this design decision.

In my view, the current direction is repeating some of the mistakes that the spatial data community made in their standards.

I have a few additional questions that are related to this and also related to the topic of profiles:

Say, I have a dataset of buildings and it is made accessible according to two different profiles (e.g. two different XML schemas or two different JSON schemas). The two profiles use different vocabularies and there are differences in the content. However, both representations are sourced from the same data. To me this would be a single dataset. However, this is not that clear in DCAT 1.0 and one could also take the view that these are two different datasets - with separate dataset metadata. At least I know cases where this has been represented as two datasets in catalogs. The new DCAT draft adds language about dataset as "a single conceptual entity" which seems to support the view that there is a single dataset in this case. Could guidance be included in the revision to support more consistent implementations, maybe just an example for such a case?

Assuming this would be considered one dataset: if both profiles were served through the same API (or service) and profile negotiation were used, would this be one distribution (since it is a single API) or two distributions (one per profile, but with the same accessURL)?

Currently you can only specify the media type of a distribution. Considering the work on profiles and profile negotiation in the DXWG, wouldn’t it make sense to be able to specify the profile(s) that a distribution supports in DCAT?"
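
To make the two readings of the buildings example above concrete, here is a minimal Python sketch. It is only an illustration, not DCAT syntax: the class names, the `profiles` field, and the example.org URLs are all assumptions made up for this sketch. In particular, `profiles` stands in for the "profile(s) that a distribution supports" property that the comment asks for and that DCAT 1.0 does not define.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Distribution:
    access_url: str   # mirrors dcat:accessURL
    media_type: str   # mirrors dcat:mediaType
    # Hypothetical field, not part of DCAT 1.0: profiles this distribution supports.
    profiles: List[str] = field(default_factory=list)


@dataclass
class Dataset:
    title: str
    distributions: List[Distribution] = field(default_factory=list)


# Placeholder identifiers, invented for the example.
API = "https://example.org/api/buildings"
PROFILE_A = "https://example.org/profiles/a"
PROFILE_B = "https://example.org/profiles/b"

# Reading 1: one distribution (a single API) that supports both profiles.
one_distribution = Dataset(
    "Buildings",
    [Distribution(API, "application/xml", [PROFILE_A, PROFILE_B])],
)

# Reading 2: two distributions, one per profile, sharing the same access URL.
two_distributions = Dataset(
    "Buildings",
    [
        Distribution(API, "application/xml", [PROFILE_A]),
        Distribution(API, "application/xml", [PROFILE_B]),
    ],
)


def profiles_by_access_url(dataset: Dataset) -> Dict[str, Set[str]]:
    """Collect the set of supported profiles for each access URL."""
    result: Dict[str, Set[str]] = {}
    for dist in dataset.distributions:
        result.setdefault(dist.access_url, set()).update(dist.profiles)
    return result
```

Under these assumptions both readings describe a single dataset and carry the same information per access URL; they differ only in how many distribution records a catalog has to hold, which is the kind of inconsistency that guidance or an example in the revision could settle.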

--
GitHub Notification of comment by MichaelGordon
Please view or discuss this issue at https://github.com/w3c/sdw/issues/1084#issuecomment-432626129 using your GitHub account

Received on Wednesday, 24 October 2018 11:57:43 UTC