[dxwg] How to Catalog Data Duplication Settings per Dataset (#1589)

Jonessmj has just created a new issue for https://github.com/w3c/dxwg:

== How to Catalog Data Duplication Settings per Dataset ==
Dear DCAT team,

I have a question about how to use DCAT to capture metadata describing how a dataset should be copied. These instructions/configurations are defined per dataset. The scenario: a dataset in a data catalog could be copied to an AWS Redshift cluster. It hasn't been copied yet, but when certain application-level events occur, a service will copy the data to one or more AWS Redshift clusters. Before that happens, the owner of the dataset specifies the default DIST and SORT configurations to be used for the duplicated dataset.

These parameters/configurations are set per source dataset, and the duplicated dataset doesn't exist yet when they are defined. I was therefore thinking they should be a property of the source dataset, but I'm not sure which DCAT terms or DCAT extensions I should use. Alternatively, should these settings be a first-class entity of their own with a PROV relationship to the source dataset?
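
To make the two options concrete, here is a rough sketch in Python of the triples each one might produce. It is only an illustration under assumptions: the `https://example.org/ns#` namespace and the names `defaultDistKey`, `defaultSortKey`, and `RedshiftCopySettings` are hypothetical and not DCAT terms, the column names are made up, and `prov:wasInfluencedBy` stands in for whichever PROV relationship would actually be appropriate.

```python
# Real W3C namespaces.
DCAT = "http://www.w3.org/ns/dcat#"
PROV = "http://www.w3.org/ns/prov#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
# Hypothetical extension vocabulary (assumption, not part of DCAT).
EX = "https://example.org/ns#"


def option_a(dataset_iri, dist_key, sort_keys):
    """Option A: the copy settings are properties of the source dataset."""
    triples = [
        (dataset_iri, RDF_TYPE, DCAT + "Dataset"),
        (dataset_iri, EX + "defaultDistKey", dist_key),
    ]
    triples += [(dataset_iri, EX + "defaultSortKey", key) for key in sort_keys]
    return triples


def option_b(dataset_iri, settings_iri, dist_key, sort_keys):
    """Option B: the copy settings are their own entity, linked to the dataset."""
    triples = [
        (dataset_iri, RDF_TYPE, DCAT + "Dataset"),
        (settings_iri, RDF_TYPE, PROV + "Entity"),
        (settings_iri, RDF_TYPE, EX + "RedshiftCopySettings"),
        # Placeholder PROV link; the right property is exactly what is being asked.
        (settings_iri, PROV + "wasInfluencedBy", dataset_iri),
        (settings_iri, EX + "defaultDistKey", dist_key),
    ]
    triples += [(settings_iri, EX + "defaultSortKey", key) for key in sort_keys]
    return triples


def to_ntriples(triples):
    """Serialize to N-Triples-like lines (no literal escaping; sketch only)."""
    lines = []
    for subject, predicate, obj in triples:
        is_iri = obj.startswith(("http://", "https://"))
        rendered = f"<{obj}>" if is_iri else f'"{obj}"'
        lines.append(f"<{subject}> <{predicate}> {rendered} .")
    return "\n".join(lines)


if __name__ == "__main__":
    dataset = "https://example.org/dataset/orders"
    settings = "https://example.org/dataset/orders/redshift-settings"
    print(to_ntriples(option_a(dataset, "customer_id", ["created_at"])))
    print()
    print(to_ntriples(option_b(dataset, settings, "customer_id", ["created_at"])))
```

In option A the settings live and die with the dataset record. In option B they can be versioned or reused independently, at the cost of needing a dedicated class and a link back to the source dataset.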

Please view or discuss this issue at https://github.com/w3c/dxwg/issues/1589 using your GitHub account


-- 
Sent via github-notify-ml as configured in https://github.com/w3c/github-notify-ml-config

Received on Wednesday, 7 February 2024 13:59:42 UTC