Re: Categorisation of Personal Data

Hi Niklas, 

This is a very nice intro to the complexity of crafting data categories.

I do have some input from experience with the Consent Receipt Specification at Kantara.  

This spec was written as a collaboration that started with a presentation at the W3C conference "Do Not Track and Beyond" in 2012. It was used to lobby for strong consent laws in the GDPR against massive industry opposition. The Consent Receipt Spec uses the ISO 29100 lexicon, but is really based on the International Security, Trust and Privacy Alliance (ISTPA) Operational Analysis of Privacy Principles <https://www.dropbox.com/s/tlheav5mj6o3kko/ISTPAAnalysisofPrivacyPrinciplesV2.pdf?dl=0> (the bible I used for writing specs). These guidelines eventually became the ISO 29100 framework, which then led to the GDPR; together they are a body of work that has developed over two decades.

When we wrote the Consent Receipt specification, we needed to create a first set of categories in order to create a first consent receipt. These categories were appended to the specification in 2014.

The first was a list that we used that turned out to be a mixture of categories and purpose descriptions (it took a while to realise that).

The second set came from EnterPrivacy, which was a massive improvement on our list.

The third version, the one we use at OpenConsent right now, is a baseline version that we have been using in use cases. I have also started variations that are context- or domain-specific for different clients, e.g. healthcare, evidential (security services), and finance, but have not progressed much, in the hope that this is a problem this work group would take a stab at.

The categories we are using currently are useful for explaining things to people but not useful for aggregate analysis. 

In the end there is a shortcut with some value: a personal data category could be defined in a code of practice that defines the set of categories and terms for a group of entities in one domain. But this has limited value in aggregate outside of that domain.


I posted three references on the wiki <https://www.w3.org/community/dpvcg/wiki/Categories_from_OpenConsent,_Enter_Privacy_&_Kantara_Consent_Receipt> and linked these items in a shared folder:

1. MVCR v0.7 - Appendix Data Categories (2014) <https://www.dropbox.com/s/mje8cdfv7jy1q0d/KI-CISWG-Editorial-MVCR-V0_7-20150907.doc?dl=0>
2. Personal Data Categories (2015) <https://www.dropbox.com/s/f8moink9fhubdnc/Categories-of-personal-information_2015-EnterPrivacy.pdf?dl=0>
3. OpenConsent: Basic Categories (2017) <https://www.dropbox.com/s/azd72e4nc6rs1mk/OpenConsent%20-CATEGORIESOFPERSONALINFORMATION-161118-1431.pdf?dl=0>



Mark








> On 16 Nov 2018, at 12:31, Niklas Kirchner <niklas.kirchner@wu.ac.at> wrote:
> 
> Dear All,
> 
> in the next couple of weeks we intend to work extensively on the categorisation of data, in part in working sessions here in Vienna since Harsh is coming from Ireland to visit the WU and Elmar is around at the TU Vienna. As some of the work will be done F2F we will keep you updated and would like to invite everyone to get involved in the process.
> 
> A first step would be to discuss and maybe identify the crucial challenge for a standard categorisation of personal data to work. According to the GDPR Art. 4 No. 1, personal data is [A] “any information relating to an identified or identifiable natural person” or [B] “different pieces of information, which collected together can lead to the identification of a particular person.”
> 
> This definition already indicates a problem of fragmentation as well as identification that we would like to address and eventually solve. One way, as is somewhat standard, is to tie the categorisation to the purposes or contexts in which the personal data appears, such as finance, health or judicial data. On the one hand, this seems justified since, for example, health data is a subclass of sensitive data according to the GDPR and requires the explicit consent of the individual in question. On the other hand, data elements may appear in various contexts and thus are not easily pinned down as personal just because of this context. It seems that processing and statistical augmentation play a big role as well when considering profiling, scoring but also behavioural data. Another difficulty arises with the public vs. open character of some of the personal data.
> 
> Since Mark already mentioned that he struggled with finding a satisfying categorisation on his own, it would be very helpful to receive such unfinished attempts to get a better grasp of the challenges and requirements. Looking forward to hearing from you!
> 
> Best,
> 
> Niklas
> 
> -- 
> Niklas Kirchner
> Institute for Information Systems and New Media
> Vienna University of Economics and Business
> 
> Email: niklas.kirchner@wu.ac.at
> 
> 

Received on Friday, 16 November 2018 14:51:40 UTC