Re: [dxwg] Question: how to catalog relational database data in DCAT? (#1240)

Thanks for bringing this use case to our attention, @ds-merck. Below is a preliminary answer, which other WG members may complement.

I think there are two different aspects here: one is how to use DCAT to document a database, the other is how to publish it. The latter is not strictly in scope for DCAT; it is covered by existing guidelines and best practices, such as the Data on the Web Best Practices (https://www.w3.org/TR/dwbp/), which I think may provide some useful hints for your use case.

About DCAT, the issue is the appropriate use of `dcat:accessURL`. This is probably not made explicit enough in DCAT, but `dcat:accessURL` is supposed to point to a URL which can be followed to land in a "place" from which users can access the data. Typically, this means HTTP URLs, but other protocols are not excluded a priori, provided that they can be used as actionable links landing somewhere - e.g., the URL of an FTP folder.
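
As a rough illustration of the "actionable link" idea, here is a minimal Python sketch that checks whether a candidate value for `dcat:accessURL` uses a scheme a generic client can follow. The set of accepted schemes, the function name `is_actionable_link`, and the example URLs are all assumptions made for this sketch; DCAT itself does not define such a list.

```python
from urllib.parse import urlparse

# Assumption for this sketch: schemes a generic Web client can follow.
# DCAT does not prescribe this list; adjust it to your own context.
ACTIONABLE_SCHEMES = {"http", "https", "ftp"}


def is_actionable_link(url: str) -> bool:
    """Return True if the URL has a followable scheme and names a host."""
    parts = urlparse(url)
    return parts.scheme in ACTIONABLE_SCHEMES and bool(parts.netloc)


if __name__ == "__main__":
    # Hypothetical candidates for dcat:accessURL.
    for candidate in (
        "https://example.org/db/how-to-access",
        "ftp://example.org/dumps/",
        "jdbc:postgresql://db.example.org/sales",
    ):
        print(candidate, is_actionable_link(candidate))
```

Under these assumptions, a database connection string would not pass the check, which suggests pointing the access URL at a Web page that explains how to connect instead.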

In terms of how to address this in practice, a number of options are available, falling under the scope of data publication best practices. I outline some of them below, just as examples, also describing how to use DCAT in each case:

1. Irrespective of whether the database is publicly accessible or not, the access URL can point to a Web page providing instructions on how to access the database. This may include instructions to access the whole database or also single tables (e.g., by supplying the relevant queries).
2. Additionally, a dump of the database (and/or specific tables) could be made available for download. The link to the database dump (or its subsets) can be included in the Web page above, and also specified with property `dcat:downloadURL` (https://www.w3.org/TR/vocab-dcat-2/#Property:distribution_download_url). On this option, see DWBP17 (https://www.w3.org/TR/dwbp/#BulkAccess).
3. Finally, the database (and/or specific tables) can be made accessible via a (Web) API, which can be described as a `dcat:DataService` (https://www.w3.org/TR/vocab-dcat-2/#Class:Data_Service), and linked from the distribution with property `dcat:accessService` (https://www.w3.org/TR/vocab-dcat-2/#Property:distribution_access_service). On this option, see DWBP23 (https://www.w3.org/TR/dwbp/#useanAPI).
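
To make the three options more concrete, here is a minimal sketch in Python that assembles a JSON-LD description of a database as a `dcat:Dataset` with one distribution per option. Everything under `https://example.org`, the function name `describe_database`, and the title are hypothetical placeholders; this is just one possible way to arrange the properties mentioned above, not a normative pattern.

```python
import json

# Namespace IRIs for the prefixes used in the JSON-LD context.
CONTEXT = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
}


def describe_database(base: str) -> dict:
    """Build a JSON-LD dataset description with one distribution per option."""
    # Hypothetical URLs: an instructions page, a dump file, and an API endpoint.
    page = {"@id": f"{base}/db/how-to-access"}
    dump = {"@id": f"{base}/db/dumps/full.sql.gz"}
    endpoint = {"@id": f"{base}/db/api"}

    # Option 3: the API, described as a dcat:DataService.
    service = {
        "@id": f"{base}/db/service",
        "@type": "dcat:DataService",
        "dcat:endpointURL": endpoint,
    }

    distributions = [
        # Option 1: access URL pointing to a page with access instructions.
        {"@type": "dcat:Distribution", "dcat:accessURL": page},
        # Option 2: same page, plus a direct link to the database dump.
        {
            "@type": "dcat:Distribution",
            "dcat:accessURL": page,
            "dcat:downloadURL": dump,
        },
        # Option 3: distribution linked to the service that gives access.
        {
            "@type": "dcat:Distribution",
            "dcat:accessURL": endpoint,
            "dcat:accessService": service,
        },
    ]

    return {
        "@context": CONTEXT,
        "@id": f"{base}/db",
        "@type": "dcat:Dataset",
        "dct:title": "Example relational database",
        "dcat:distribution": distributions,
    }


if __name__ == "__main__":
    print(json.dumps(describe_database("https://example.org"), indent=2))
```

The same structure could be repeated per table, with each table described as its own dataset, if single tables need to be catalogued separately.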

BTW, about your requirement to describe how to give access to subsets of your database, possibly in a machine-actionable way: this issue is being discussed in the DCAT context in relation to data available from a service / API - i.e., point (3) above.

Does this answer your question, @ds-merck ? 

-- 
GitHub Notification of comment by andrea-perego
Please view or discuss this issue at https://github.com/w3c/dxwg/issues/1240#issuecomment-643827278 using your GitHub account

Received on Sunday, 14 June 2020 21:54:36 UTC