Re: [dxwg] authenticity and integrity of dcat files and associated datasets (#1526)

As an addendum, I would like to point out the effect of persistent URIs (PURIs).

Suppose I have found a dataset in a portal, e.g. data.europa.eu (`https://data.europa.eu/data/datasets/https-katalog-riksarkivet-se-store-1-resource-106?locale=en`), and it has the PURI `https://katalog.riksarkivet.se/store/1/resource/106`. Then dereferencing the PURI returns the original dataset description.

So if one does not trust the harvesting portal, one can use this mechanism to find the source portal and the original metadata.
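
To make the dereferencing step concrete, here is a minimal sketch in Python. It assumes the source portal answers HTTP content negotiation for RDF media types, which I have not verified for this particular server; the function names and the `Accept` value are my own illustration, not something DCAT prescribes.

```python
from urllib.request import Request, urlopen

# Assumption: the source portal honours content negotiation for RDF media types.
RDF_ACCEPT = "text/turtle, application/rdf+xml;q=0.9, application/ld+json;q=0.8"


def build_puri_request(puri: str) -> Request:
    """Prepare a GET request that asks the PURI for an RDF representation."""
    return Request(puri, headers={"Accept": RDF_ACCEPT})


def fetch_original_description(puri: str, timeout: float = 10.0) -> tuple[str, bytes]:
    """Dereference the PURI and return (content type, raw body).

    Redirects are followed by urlopen, so the body comes from wherever
    the source portal finally serves the description.
    """
    with urlopen(build_puri_request(puri), timeout=timeout) as response:
        return response.headers.get("Content-Type", ""), response.read()
```

Calling `fetch_original_description("https://katalog.riksarkivet.se/store/1/resource/106")` would then give the description as published by the source, which can be compared with what the harvesting portal shows.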

Now, one could argue that one does not trust the response of the HTTP dereferencing, but in the end that comes down to not trusting the source itself.
The use of persistent dereferenceable identifiers is actually a simple yet powerful method to guarantee that the data is trustworthy.

Note that DCAT is about metadata descriptions. It describes the rules for using the data that the metadata describes. Thus, the issue of trust is actually far more complex than having the "original metadata descriptions". Suppose one finds data via a DCAT metadata description in a super secure data ecosystem and uses it for a data processing task that infringes legislation. Then the security of the DCAT ecosystem is not an argument that the data processing was allowed. This is because DCAT does not provide the data, but the means to access the data. Thus the trust and legal responsibility are transferred to the data provider/data processor (in GDPR terminology) that provides access to the real data.


Also, I want to note that DCAT does not mean sharing the data in RDF format. I hope we agree as a community that DCAT can be implemented in many technical ways, as long as the semantics are preserved.
I agree, though, that RDF is the most natural implementation and that, for conformance reasons, it should be possible to unambiguously transform the implemented data structure into an RDF structure.
(This holds in my opinion for all domain vocabularies and application profiles.)
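
As an illustration of such an unambiguous transformation, here is a small Python sketch that turns a non-RDF record into N-Triples. The record layout (`id`, `title`, `description`) and the field-to-property mapping are hypothetical; a real application profile would define its own mapping.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DCAT_DATASET = "http://www.w3.org/ns/dcat#Dataset"
DCT = "http://purl.org/dc/terms/"

# Hypothetical mapping from local field names to DCTERMS properties.
FIELD_TO_PROPERTY = {
    "title": DCT + "title",
    "description": DCT + "description",
}


def literal(value: str) -> str:
    """Serialise a plain string as an N-Triples literal (basic escaping only)."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def record_to_ntriples(record: dict) -> list[str]:
    """Map one local dataset record to N-Triples lines, one triple per line."""
    subject = f"<{record['id']}>"
    triples = [f"{subject} <{RDF_TYPE}> <{DCAT_DATASET}> ."]
    for field, prop in FIELD_TO_PROPERTY.items():
        if field in record:
            triples.append(f"{subject} <{prop}> {literal(record[field])} .")
    return triples
```

The point is only that each local field maps to exactly one property, so the same record always yields the same triples regardless of how it is stored internally.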

PS: on the checksum: it is about the file the Distribution points to, not about the metadata (the `dcat:Distribution`).
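
A minimal sketch of what that check covers, in Python: the bytes of the downloaded file are hashed and compared with the published checksum value. The function name and the default algorithm are assumptions for illustration; the algorithm to use is whatever the Distribution's checksum declares.

```python
import hashlib


def file_matches_checksum(data: bytes, expected_hex: str, algorithm: str = "sha256") -> bool:
    """Compare the digest of the downloaded file bytes with the published value.

    This verifies the file the Distribution points to; it says nothing
    about whether the metadata record itself was altered.
    """
    digest = hashlib.new(algorithm, data).hexdigest()
    return digest == expected_hex.strip().lower()
```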
  



-- 
GitHub Notification of comment by bertvannuffelen
Please view or discuss this issue at https://github.com/w3c/dxwg/issues/1526#issuecomment-1625644317 using your GitHub account


-- 
Sent via github-notify-ml as configured in https://github.com/w3c/github-notify-ml-config

Received on Friday, 7 July 2023 16:15:31 UTC