[dxwg] Distinguishing between stand alone and tightly coupled data services (#1434) from matthiaspalmer via GitHub on 2021-12-08 (public-dxwg-wg@w3.org from December 2021)

From: matthiaspalmer via GitHub <sysbot+gh@w3.org>
Date: Wed, 08 Dec 2021 19:57:24 +0000
To: public-dxwg-wg@w3.org
Message-ID: <issues.opened-1074779893-1638993443-sysbot+gh@w3.org>

matthiaspalmer has just created a new issue for https://github.com/w3c/dxwg:

== Distinguishing between stand alone and tightly coupled data services ==
Should there be a way to indicate to data portals that certain data services are tightly coupled to datasets?
The value would be to allow data portals to filter out such tightly coupled data services from searches as they provide little value when viewed independently.

I can imagine at least three cases of what the relations to datasets look like:
A. One to one - There is a tight coupling between a dataset and a data service. The dataset has a distribution that points to the data service via dcat:accessService. Inversely, the data service may point back to the dataset via dcat:servesDataset, but not to any other dataset.

B. One to many - One data service can serve data for several datasets, e.g. depending on the parameters provided when accessing the service, different amounts of data are returned, and these data belong to different datasets. This is expressed by several datasets having distributions that point to the same data service. Inversely, the data service may point back to all the datasets it serves via dcat:servesDataset.

C. Independent - A data service has no connection to a dataset. There can be several reasons for this, e.g. it is a data transformation service (think of a currency converter).
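
As a rough illustration of the three cases, here is a minimal sketch that classifies data services by counting the datasets each one is connected to. It assumes the catalog is available as plain (subject, predicate, object) tuples with prefixed names; the function name `classify_services` and the `ex:` resources are hypothetical and only serve the example.

```python
from collections import defaultdict

# Hypothetical example data: prefixed names stand in for full IRIs.
TRIPLES = [
    ("ex:ds1", "dcat:distribution", "ex:dist1"),
    ("ex:dist1", "dcat:accessService", "ex:api1"),
    ("ex:api1", "dcat:servesDataset", "ex:ds1"),
    ("ex:ds2", "dcat:distribution", "ex:dist2"),
    ("ex:dist2", "dcat:accessService", "ex:api2"),
    ("ex:ds3", "dcat:distribution", "ex:dist3"),
    ("ex:dist3", "dcat:accessService", "ex:api2"),
]
SERVICES = ["ex:api1", "ex:api2", "ex:converter"]


def classify_services(triples, services):
    """Return {service: 'A' | 'B' | 'C'} based on how many datasets it serves."""
    # Map each distribution to the dataset that owns it.
    dist_to_dataset = {o: s for s, p, o in triples if p == "dcat:distribution"}

    served = defaultdict(set)
    for s, p, o in triples:
        if p == "dcat:accessService" and s in dist_to_dataset:
            # dataset -> distribution -> data service
            served[o].add(dist_to_dataset[s])
        elif p == "dcat:servesDataset":
            # optional inverse link: data service -> dataset
            served[s].add(o)

    result = {}
    for service in services:
        count = len(served.get(service, ()))
        result[service] = "C" if count == 0 else "A" if count == 1 else "B"
    return result


print(classify_services(TRIPLES, SERVICES))
```

Here `ex:api1` falls under A, `ex:api2` under B, and `ex:converter` under C. Counting datasets alone is a simplification: a portal may want additional evidence before treating a one-to-one service as tightly coupled.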

Different data portals may and will of course handle this differently. In the [Swedish Data portal](https://www.dataportal.se/en) it has been deemed that data services that fall under scenario A provide no extra value when shown in the search. This is due to the fact that they are very similar to the connected dataset and would in many cases look like duplicates. Note that the Swedish data portal has chosen to have a common search against "Data and APIs", as most people won't know or care about the difference from a search perspective; they just want to find the data they are looking for, independently of how it can be accessed. On the other hand, data services corresponding to scenarios B and C are described in more detail and are shown as search results independently.

To accomplish this, Swedish data providers have been encouraged to do one of two things to mark a data service as tightly coupled:

1. Exclude the dcat:service relation from the catalog, or
2. Don't provide a dcterms:publisher

Note that a data service that is not pointed to directly from the catalog can still be considered to be part of the catalog based on either being reachable via a distribution or by being part of a certain RDF graph (shipped in the same file).
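
A portal-side check for the two markers listed above could look roughly like the following sketch. It is not the Swedish portal's actual implementation; it assumes the catalog is held as a set of (subject, predicate, object) tuples, and the function names and `ex:` resources are hypothetical.

```python
# Hypothetical example data: prefixed names stand in for full IRIs.
CATALOG = "ex:catalog"
TRIPLES = {
    ("ex:catalog", "dcat:service", "ex:api2"),
    ("ex:api2", "dcterms:publisher", "ex:agency"),
    ("ex:catalog", "dcat:service", "ex:api3"),    # linked, but no publisher
    ("ex:api1", "dcterms:publisher", "ex:agency"),  # publisher, but not linked
}


def is_tightly_coupled(service, catalog, triples):
    """True if either marker is present: no dcat:service link or no publisher."""
    linked = (catalog, "dcat:service", service) in triples
    has_publisher = any(
        s == service and p == "dcterms:publisher" for s, p, _ in triples
    )
    return not linked or not has_publisher


def searchable_services(services, catalog, triples):
    """Keep only the data services that should appear as independent search hits."""
    return [s for s in services if not is_tightly_coupled(s, catalog, triples)]


print(searchable_services(["ex:api1", "ex:api2", "ex:api3"], CATALOG, TRIPLES))
```

Only `ex:api2` remains, since `ex:api1` lacks the dcat:service link and `ex:api3` lacks a dcterms:publisher.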

I think it would be useful to have a usage note about this. If that is not deemed suitable for DCAT 3, I think it would at least be good to make sure that nothing in the specification hinders profile developers from expressing this without being in conflict with it (I think it is ok as the specification is written today, but more eyes on this would be appreciated).

Please view or discuss this issue at https://github.com/w3c/dxwg/issues/1434 using your GitHub account

--
Sent via github-notify-ml as configured in https://github.com/w3c/github-notify-ml-config

Received on Wednesday, 8 December 2021 19:57:26 UTC