[dxwg] How to express available formats for a dcat:Dataservice (#1055)

oystein-asnes has just created a new issue for https://github.com/w3c/dxwg:

== How to express available formats for a dcat:Dataservice ==
Sorry for coming late to this discussion, at the risk of missing obvious solutions and asking ignorant questions. We (the Norwegian DCAT-AP-NO working group) are looking forward to being able to describe APIs in a DCAT catalog, and the introduction of the dcat:DataService class seems to address that need neatly. Our biggest concern now is that users of a catalog of dcat:DataServices will expect to get hints about the formats available from the cataloged APIs.

Based on user needs, format information is so important to the users of the data catalog that we made it one of the elements shown on the listing page, see: https://fellesdatakatalog.brreg.no/apis

In DCAT 2 this need seems to be neglected, leaving publishers with these three options:

1. Repeat dcat:Distribution for each format, populating (or polluting) the catalog with several dcat:Distributions for each dcat:DataService, with dct:format / dcat:mediaType (and possibly dcat:accessURL) as the only difference.

2. Exclude information on available formats from the dcat:Catalog, forcing users to leave the data portal and follow either dcterms:conformsTo or dcat:endpointDescription (if the dcat:endpointDescription is a URL) to find out which formats are offered.

3. Use dct:description or dcat:endpointDescription (for dcat:DataService) and provide format information as text only.

Option 1)
We think this is tedious for providers/publishers and not very user friendly for the catalog end users. For dcat:DataServices that offer content negotiation, it makes even less sense (to us), since the dcat:accessURL will be identical for each dcat:Distribution. This approach also implies that every dcat:DataService has at least one dcat:Distribution.
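
To make the repetition concrete, here is a minimal sketch in plain Python that models option 1 as (subject, predicate, object) tuples. The `ex:` identifiers, the access URL and the list of media types are hypothetical, and media types are written as plain strings for brevity rather than as the IRIs a real catalog would use.

```python
# Hypothetical service, URL and formats, for illustration only.
SERVICE = "ex:currencyApi"
ACCESS_URL = "https://example.org/api"
MEDIA_TYPES = ["application/json", "application/xml", "text/csv"]


def option1_triples(service, access_url, media_types):
    """Model option 1: one dcat:Distribution per format."""
    triples = []
    for i, media_type in enumerate(media_types, start=1):
        dist = f"ex:distribution{i}"  # hypothetical identifier
        triples.append((dist, "rdf:type", "dcat:Distribution"))
        triples.append((dist, "dcat:accessService", service))
        triples.append((dist, "dcat:accessURL", access_url))
        triples.append((dist, "dcat:mediaType", media_type))
    return triples


triples = option1_triples(SERVICE, ACCESS_URL, MEDIA_TYPES)

# With content negotiation every distribution shares the same access URL,
# so the media type is the only thing that varies between them.
distinct_access_urls = {o for _, p, o in triples if p == "dcat:accessURL"}
print(len(triples), "triples,", len(distinct_access_urls), "distinct access URL")
```

With three formats, this produces three distributions (twelve triples) that differ only in their dcat:mediaType value.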

Option 2)
We are aware that information on formats is in the dcat:endpointDescription, but rarely in a form that makes sense in an RDF/linked data environment. This leaves users with no filtering options, and leaves the catalog provider with no easy way to enrich the catalog itself with format information for dcat:DataServices by harvesting it from dcat:endpointDescriptions.

Option 3)
Option 3 (possibly combined with option 2) may be sufficient for some, but it is not a very machine-readable approach. Filtering options will also be limited.

**User stories:**
As a portal end user I would like to know in which formats a given data service provides data, so that I can get information on available formats without leaving the catalog, and can search and filter the datasets / data services based on available formats.

As a catalog provider I would like to express available formats for a dcat:DataService without having to repeat new dcat:Distributions for each available format.

As a dcat:DataService provider I would like to provide information on available formats for APIs that do not distribute any datasets (e.g. a currency-conversion service or a CSV-to-DCAT transformation service) and therefore have no relation to any dcat:Distribution.

**Proposal:**
Add a “dcat:availableFormats” property to dcat:DataService, allowing publishers to list all data serializations the service offers in a machine-readable way.
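
As a rough illustration of what the proposal could enable, the sketch below models a small catalog as (subject, predicate, object) tuples in plain Python. Note that dcat:availableFormats is only the property proposed here, not an existing DCAT term; the `ex:` service identifiers and the use of media type strings as values are assumptions made for the example.

```python
# dcat:availableFormats is the property proposed in this issue, not part of DCAT.
PROPOSED_PROPERTY = "dcat:availableFormats"

# Hypothetical catalog content, for illustration only.
catalog = [
    ("ex:currencyApi", "rdf:type", "dcat:DataService"),
    ("ex:currencyApi", PROPOSED_PROPERTY, "application/json"),
    ("ex:currencyApi", PROPOSED_PROPERTY, "application/xml"),
    ("ex:csvToDcat", "rdf:type", "dcat:DataService"),
    ("ex:csvToDcat", PROPOSED_PROPERTY, "text/turtle"),
    ("ex:csvToDcat", PROPOSED_PROPERTY, "application/json"),
]


def formats_of(triples, service):
    """List the formats a service declares, without any dcat:Distribution."""
    return sorted(
        o for s, p, o in triples if s == service and p == PROPOSED_PROPERTY
    )


def services_offering(triples, media_type):
    """Filter services by format, as in the portal end user story."""
    return sorted(
        {s for s, p, o in triples if p == PROPOSED_PROPERTY and o == media_type}
    )


print(formats_of(catalog, "ex:currencyApi"))
print(services_offering(catalog, "application/json"))
```

Each service carries its formats directly, so a portal could show or filter on them without following links out of the catalog and without one dcat:Distribution per format.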

PS: A humble thank you to all who have contributed to this work.

(Posted on behalf of the DCAT-AP-NO working group in Norway)


Please view or discuss this issue at https://github.com/w3c/dxwg/issues/1055 using your GitHub account

Received on Friday, 6 September 2019 07:36:18 UTC