Re: DPCat: Data Processing Catalogue using DPV and DCAT(-AP)

Hi,

I am interested in this motivation:

>
> In particular, the choice of DCAT-AP was made to present a mechanism for sharing ROPA related information using an EU-advocated standard and to promote the possibility of reusing existing data portal infrastructures for compliance-related purposes - such as requirements for ROPA between controllers, processors, and DPAs.
>

Reading what a ROPA is about, I would expect more PROV-O reuse than DCAT. And if DCAT(-AP) is the targeted reference model, then I do not see how aggregating a ROPA catalogue with an Open Data Portal (like data.europa.eu) would make sense out of the box.
Can you elaborate on whether this is the intent of that motivation?

Suppose the objective is to create a DCAT(-AP) profile. Then it must be clear that the class ROPA is a subclass of DCAT-AP Dataset, not only as a formal statement but also in the semantics of its definition and usage notes. It should thus be valid to replace the definition of ROPA with the definition of DCAT-AP Dataset and still get an understandable story.
That brings me to the aggregation process I mentioned above: a dataset in DCAT-AP is intended as a collection of data that is being shared (either as a file distribution or via a data service). DCAT is thus the metadata about the data that is being shared, not the data itself.

So if a ROPA is metadata about a GDPR context for a dataset, then why not add a single class ROPARegistration, with a relationship hasROPAentry from a dcat:Dataset?

Dataset -> hasROPARegistry -> ROPARegistration.
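
A minimal sketch of this shape, written as plain Python tuples rather than with a real RDF library. Everything in the ex: namespace (including ex:hasROPARegistry and ex:ROPARegistration) and the literal values are hypothetical placeholders and not part of DCAT, DPV, or DPCat:

```python
# Triples as plain (subject, predicate, object) tuples; no RDF library needed.
# All ex: names and literal values are hypothetical, for illustration only.
DATASET = "ex:dataset-1"
ROPA = "ex:ropa-registration-1"

triples = {
    # Ordinary DCAT metadata stays on the dataset.
    (DATASET, "rdf:type", "dcat:Dataset"),
    (DATASET, "dct:title", "Example title"),
    # A single link from the dataset to its separate ROPA registration.
    (DATASET, "ex:hasROPARegistry", ROPA),
    # GDPR-related properties hang off the registration, not the dataset.
    (ROPA, "rdf:type", "ex:ROPARegistration"),
    (ROPA, "dpv:hasPurpose", "ex:SomePurpose"),
}


def properties_of(subject):
    """Return the set of predicates used on the given subject."""
    return {p for s, p, o in triples if s == subject}
```

In this sketch the dataset carries only one extra property, the link to the registration, while the GDPR-related statements are attached to the registration entity.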

With the way the profile is being built, the ROPA registration information for a dataset such as
http://data.europa.eu/88u/dataset/planning-applications5<https://data.europa.eu/data/datasets/http:/data.europa.eu/88u/dataset/planning-applications5?locale=en>

is added with a lot of properties on top of the existing metadata.

This approach can create registration management issues. The metadata (e.g. the title) of a dataset evolves over time, but I assume that the registration of a ROPA is subject to very strict procedures, so that a title change might be very costly. Thus, overlaying the same dataset entity with additional profile interpretations creates possible data management issues. If that is the case, I personally would not put identification metadata in the ROPA, but keep it outside, associated with the dataset, and enforce the use of Persistent Identifiers for the datasets to make the ROPA registration strict. One could even consider reversing the relationship: ROPARegistration -> Dataset.
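
A small sketch of that separation, assuming hypothetical class names, field names, and identifiers (none of them come from DPCat or DCAT-AP). The registration holds only the dataset's persistent identifier, so the dataset's descriptive metadata can change without touching the registration:

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    pid: str    # persistent identifier, expected never to change
    title: str  # descriptive metadata, free to evolve


@dataclass(frozen=True)
class ROPARegistration:
    registration_id: str
    dataset_pid: str  # reversed link: ROPARegistration -> Dataset


# Hypothetical identifiers, for illustration only.
dataset = Dataset(pid="ex:dataset-1", title="Old title")
registration = ROPARegistration("ex:ropa-1", dataset.pid)

# The dataset title evolves; the frozen registration record is unaffected.
dataset.title = "New title"
```

Making the registration immutable here is only one way to model the assumed strict registration procedure.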

It seems as if DPCat assumes that a ROPA is a dataset in data.europa.eu. But I am not so sure; to me it looks like a different thing, with a different register structure and different expectations than the datasets in data.europa.eu. Related but different. (*)

As an additional suggestion: to help my understanding, the definition of the class ROPA could be improved. It is self-referencing; in short it states that a ROPA is a ROPA, which does not help me. A clear definition helps a lot, certainly for the critical classes and relationships.

(*) Note that having a catalogue structure is not always a good reason to choose DCAT as the vocabulary. Governments have registers for everything: cars, people, organisations, etc. If we applied the DCAT catalogue structure to all of these, the only vocabulary needed would be DCAT. But it would not be very helpful.


kr,

Bert
________________________________
From: Harshvardhan J. Pandit <harshvardhan.pandit@adaptcentre.ie>
Sent: Tuesday 19 April 2022 13:02
To: public-dpvcg@w3.org <public-dpvcg@w3.org>
CC: Paul Ryan <paul.ryan76@mail.dcu.ie>; Georg Philip Krog <georg@signatu.com>
Subject: DPCat: Data Processing Catalogue using DPV and DCAT(-AP)

Hi.
Paul, Rob, and I are sharing our work on 'DPCat', a
specification for ROPA governance using DPV and DCAT(-AP). It is a
machine-readable and interoperable specification for sharing
ROPA information, based on requirements collected from the ROPA
templates of all EU DPAs. DCAT is used to "package" the information and
DPV is used to represent GDPR-related information.

Specification: https://w3id.org/dpcat

A draft paper explaining the motivation and creation of this along with
use-cases and application over EDPS ROPA documents:
https://doi.org/10.5281/zenodo.6448787

Learning experiences for DPVCG: It is unclear how the DPV concepts
should be used in a consistent and interoperable manner. For this, we
should discuss identifying use-cases and creating suggested "shapes"
that specify how information should be structured for that use-case.

Regards,
--
---
Harshvardhan J. Pandit, Ph.D
Research Fellow
ADAPT Centre, Trinity College Dublin
https://harshp.com/

Received on Tuesday, 19 April 2022 14:01:40 UTC