- From: Harshvardhan J. Pandit <harshvardhan.pandit@adaptcentre.ie>
- Date: Wed, 5 Jul 2023 08:35:10 +0100
- To: "public-dpvcg@w3.org" <public-dpvcg@w3.org>
- Cc: "beatriz.gesteves" <beatriz.gesteves@upm.es>
Hi.

This email sets out the impact of changing core concepts and relations within the DPV, currently being discussed in the context of integrating the Data Governance Act (DGA). Given below are 4 options for how DPV can represent non-personal data, along with their impacts and adoption considerations. Please indicate your preference or objections by replying to this email or at https://github.com/w3c/dpv/issues/99. Decisions will only be taken in the meeting calls.

---

# Summary of Discussion

1) Current structure of DPV - The 'core' concepts in DPV all relate to personal data. For example, Purpose is defined as the purpose for processing of personal data, Processing is defined as the operation on personal data, and PersonalDataHandling is defined as the 'handling' or process regarding processing of personal data. Through this, DPV can express information about how personal data is being used within a use-case.

2) Limitations - Since all concepts are defined in terms of personal data, they cannot be used where non-personal data is involved. For example, Technical Measures such as encryption are applicable to non-personal data, but the DPV concept is defined for its use over personal data. Similarly, Processing and Purpose are generic terms that apply to both personal and non-personal data, but in DPV we have defined them as being only about personal data.

3) DGA's scope involves both Personal and Non-Personal Data - Put simply, the Act sets up portals where datasets can be found and reused. If the data is personal, then GDPR applies and mechanisms such as consent and pseudonymisation can be involved. If the data is non-personal, licenses and copyright can be involved. A commonality between both is describing the purposes of processing that data, e.g. what the consent or license permits or limits, or how the data must be processed, e.g. storage conditions such as location or temporal limitation, or technical measures such as access control and encryption.
4) Required changes in DPV for DGA - The personal data related concepts are well established within DPV and not much needs to change other than considering some new types of entities and measures. The non-personal data concepts are completely absent. To be able to model the DGA (and other initiatives like it), DPV would need to have concepts that can address both personal and non-personal data. This represents a significant expansion of scope for DPVCG.

---

Question 1: Should the scope of DPV be made broader to encompass personal as well as non-personal data, with the focus remaining on responsible use of personal data?

- This decision must be determined by the group. The only advantage of including non-personal data that we have discussed so far is related to DGA.
- We have had people who are interested in doing this work, and so far I have not registered any objections.

---

Question 2: Assuming the answer to Q1 is Yes, what options are available to add non-personal data concepts and what are their implications?

Option 1: we change the core properties of DPV to represent both personal and non-personal data, i.e. Purpose becomes "purpose for processing of 'data'" and Processing becomes "operations on 'data'" rather than 'personal data'. Personal Data will have parent 'Data' and sibling 'NonPersonalData' concepts. Legal Basis will be distinguished as 'Legal Basis for Personal Data' and 'Legal Basis for Non-Personal Data'. The relations, e.g. hasPurpose, will also change accordingly.

- the implication is that anything using these concepts will be impacted, e.g. existing adopters and use-cases will see their work being changed
- to enable choice and control over such major changes, the version number should be increased to 2 and a separate namespace/URI used, e.g. w3id.org/dpv/v2
- this is the best choice in terms of simplicity of information modelling as it keeps the total number of concepts lower by reusing the same concept (e.g. Purpose) for personal and non-personal data.
- where necessary, existing concepts will be split into variants for Personal and Non-Personal Data, e.g. Legal Basis as above, and Technical and Organisational Measures where relevant - e.g. encryption is applicable to both but anonymisation only applies to personal data

Option 2: we do not change anything in the current set of concepts, and instead create a separate set of concepts for non-personal data - similar to an extension. E.g. non-pd:Purpose would be the purpose for processing non-personal data, non-pd:LegalBasis would be the legal basis for non-personal data, and so on.

- this option does not impact any existing adoption or use-case for DPV as no concepts in DPV are being changed, and hence there is no change in namespace/URI
- this is not a good design choice in terms of information modelling as it duplicates the concepts for each of personal and non-personal data
- however, this can be justified by the above reason of not impacting existing users, as well as by there being significant differences in the concepts that warrant defining them separately
- this is not 'attractive' to use because the concepts are separated into two sets, which means users cannot just say 'Purpose' but will have to specify whether it is from the 'Personal' or 'Non-Personal' vocabulary.
- this also means each concept may need to be duplicated across personal and non-personal variants, e.g. encryption will have to be defined twice.

Option 3: we do not change anything, and discard the proposal.

Option 4: redefine DPV as a "Generic Data Processing Vocabulary" which is about any data, so that there is no continuity in terms of concepts. This means we redesign DPV from scratch and make any changes as necessary - which is effectively Option 1 without the implied changes for existing users. A new namespace/URI is required, e.g. w3id.org/gdpv. The drawback is that 'DPV' will no longer be maintained and all users will need to move to the new vocabulary.
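The modelling difference between Options 1 and 2 above can be illustrated with a rough Python sketch. This is only an illustration under stated assumptions: the `Concept` class and the helper functions are hypothetical, the three shared concept names are just examples taken from this email, and the namespace strings are placeholders (the `non-pd` namespace in particular is invented here, and the v2 form is assumed from the suggested w3id.org/dpv/v2).

```python
from dataclasses import dataclass
from typing import Optional

# All namespaces below are placeholders for illustration, not published IRIs.
DPV_V2 = "https://w3id.org/dpv/v2#"     # Option 1: assumed form of the suggested v2 namespace
DPV = "https://w3id.org/dpv#"           # Option 2: existing concepts left untouched (assumed form)
NON_PD = "https://example.org/non-pd#"  # Option 2: hypothetical namespace for the 'non-pd' prefix


@dataclass(frozen=True)
class Concept:
    """A hypothetical, minimal stand-in for a vocabulary concept."""
    iri: str
    parent: Optional["Concept"] = None

    def is_a(self, other: "Concept") -> bool:
        """Return True if `other` is this concept or one of its ancestors."""
        node = self
        while node is not None:
            if node == other:
                return True
            node = node.parent
        return False


# Option 1: 'Data' becomes the parent of two sibling concepts.
data = Concept(DPV_V2 + "Data")
personal_data = Concept(DPV_V2 + "PersonalData", parent=data)
non_personal_data = Concept(DPV_V2 + "NonPersonalData", parent=data)

# Example concepts that the email describes as applying to both kinds of data.
SHARED = ["Purpose", "Processing", "Encryption"]


def option1_terms(names):
    """Option 1: one definition per shared concept, reused for any data."""
    return sorted(DPV_V2 + name for name in names)


def option2_terms(names):
    """Option 2: each shared concept is defined once per vocabulary."""
    return sorted(ns + name for ns in (DPV, NON_PD) for name in names)


if __name__ == "__main__":
    print(personal_data.is_a(data))
    print(len(option1_terms(SHARED)), "terms under Option 1")
    print(len(option2_terms(SHARED)), "terms under Option 2")
```

Under these assumptions, Option 1 yields one term per shared concept, while Option 2 yields one per concept per vocabulary, which is the duplication noted in the bullets for Option 2.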
---

My thoughts:

My answer to Q1 regarding whether we include non-personal data is - yes, but we only do Option 1 for the top-concepts and do not create a comprehensive vocabulary for non-personal data. This is because I am in the EU and am interested in DGA, but am not interested in non-personal data aspects such as contracts and licensing. However, I see value in allowing DPV to be expanded so that others can use it and build on it for this, while keeping the scope of DPVCG limited to personal data.

Separately, I am also thinking about changing how the vocabulary is currently structured and named in the Github repo, e.g. instead of folders named /dpv-gdpr, /dpv-dga, etc. we have sensible structuring as: /loc/eu/gdpr, /loc/eu/dga, /loc/eu/ie and so on. Similarly, dpv-pd becomes just pd, dpv-legal and dpv-tech become legal and tech, and so on. This is not connected to the above, but since we are discussing changes to DPV, I am mentioning it in the same context.

Regards,
--
---
Harshvardhan J. Pandit, Ph.D
Assistant Professor
ADAPT Centre, Dublin City University
https://harshp.com/
Received on Wednesday, 5 July 2023 07:35:18 UTC