Re: DPCat: Data Processing Catalogue using DPV and DCAT(-AP)

Hi. I understand your argument about ROPA and catalogues in general, and 
hopefully the replies below are sufficient to clarify the usefulness of 
combining the two.

On 19/04/2022 15:01, Bert Van Nuffelen wrote:
> I am interested in this motivation:
> 
>  >
>  > In particular, the choice of DCAT-AP was made to present a mechanism 
> for sharing ROPA related information using an EU-advocated standard and 
> to promote the possibility of reusing existing data portal 
> infrastructures for compliance-related purposes - such as requirements 
> for ROPA between controllers, processors, and DPAs.
>  >
> 
> When reading what is ROPA about I would it expect more Prov-O reuse 
> rather than DCAT. And if DCAT(-AP) is targeted as reference model then I 
> do not see the aggregation of a ROPA catalogue and an Open Data Portal 
> (like data.europa.eu) would make sense out of the box.
> Can you elaborate if this is the intend of that motivation?

Here, the ROPA (the document) is a catalog of information. DCAT is 
therefore a suitable fit, and that in turn makes DCAT-aware tools and 
services usable for GDPR compliance documentation, which is what the 
ROPA is intended to be. PROV may be relevant if one were to represent 
the processing operations as Activities, but that is not what a ROPA is 
supposed to be, so it wasn't used here.

> 
> Suppose the objective is to create a DCAT(-AP) profile, then it must be 
> clear that the class ROPA is a  subclass of DCAT-AP Dataset, not just 
> only in formal statement, but also in the definitional and usage note 
> semantics. Thus it should be valid to replace the definition of ROPA 
> with the definition of DCAT-AP Dataset and still create somehow an 
> understandable story.
> That brings me to the aggregation process I mentioned above: A dataset 
> in DCAT-AP is intented as a collection of data that is being shared 
> (either in the form of a file distribution either via a data service). 
> DCAT is thus the metadata about the data that is being shared, not the 
> data itself.
> 
> So if a ROPA is metadata about a GDPR context for a dataset then why not 
> adding a single class ROPARegistration with a relationship for a 
> dcat:dataset hasROPAentry?

A ROPA is just one part of the vast body of compliance-related 
documentation, and can have corresponding documents or processes 
internal to the organisation. This is what it means to say a ROPA is a 
dcat:Dataset (or, more correctly, a dcat:Resource). An organisation may 
want to have spreadsheets, PDFs, plain text files, or whatever else as 
the actual data.

The ROPA in DPCat is a 'view' or 'metadata' layer that provides an 
interoperable abstraction over all these implementation details. This 
permits the organisation to keep its information in whatever structure, 
format, and location it wishes, while the ROPA-related fields provide 
data governance capabilities for tools and services that operate on the 
ROPA through those metadata fields.

If there were no such (GDPR-related) fields, then the ROPA as a catalog 
would just be a thin wrapper around a document, detailing who published 
it and when. That isn't of much use to anyone. The actual value of 
including these fields is that they enable a lot of smart queries and 
data management tasks to be carried out by DPOs, managers, auditors, 
investigators, etc.
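
As a rough illustration of the kind of query meant here, the following 
is a minimal sketch in plain Python. It is not DPCat's actual vocabulary 
or any existing tool: the RopaRecord class, its field names, the sample 
entries, and the records_involving helper are all hypothetical, chosen 
only to show how metadata fields allow filtering without opening the 
underlying spreadsheet, PDF, or text file.

```python
from dataclasses import dataclass, field


@dataclass
class RopaRecord:
    """Hypothetical metadata record describing one ROPA entry."""
    title: str
    publisher: str
    location: str     # where the underlying document lives
    file_format: str  # e.g. spreadsheet, PDF, plain text
    purposes: set = field(default_factory=set)
    data_categories: set = field(default_factory=set)
    recipients: set = field(default_factory=set)


# Made-up sample catalogue; the actual data stays in its original files.
CATALOG = [
    RopaRecord("Payroll processing", "ExampleCorp", "hr/payroll.xlsx",
               "spreadsheet", {"payroll"}, {"financial", "identity"},
               {"tax authority"}),
    RopaRecord("Newsletter", "ExampleCorp", "marketing/newsletter.pdf",
               "PDF", {"marketing"}, {"contact"}, {"mail provider"}),
    RopaRecord("Occupational health", "ExampleCorp", "hr/health.txt",
               "plain text", {"legal obligation"}, {"health", "identity"}),
]


def records_involving(records, category):
    """Return titles of records whose metadata lists the given data category."""
    return [r.title for r in records if category in r.data_categories]


# An auditor's question: which processing records touch identity data?
identity_records = records_involving(CATALOG, "identity")
```

The query only reads the metadata fields; the format and location of 
each underlying document do not matter to it.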

> As an additional suggestion, for me to better understand, the definition 
> of the class ROPA could be improved. It is self-referencing, in short it 
> states: a ROPA is a ROPA. That does not aid me.  Certainly for the 
> critical classes and relationships this aids a lot.

Thank you, I've made note of this and will change it in the next update.

Regards,
-- 
---
Harshvardhan J. Pandit, Ph.D
Research Fellow
ADAPT Centre, Trinity College Dublin
https://harshp.com/

Received on Tuesday, 19 April 2022 14:19:35 UTC