- From: Rob van Eijk <rob@blaeu.com>
- Date: Thu, 30 May 2013 21:21:37 +0200
- To: Shane Wiley <wileys@yahoo-inc.com>
- Cc: <public-tracking@w3.org>

Hi Shane,

The text is work in progress. I agree with you that avoiding the term
"anonymization" is better. "Anonymization" slipped in due to an earlier
revision of the text.

Rob.

Shane Wiley wrote on 2013-05-29 18:49:
> Rob,
>
> I'm not very supportive of the ISO definition in this regard but
> let's leave that definition alone for the moment.
>
> I follow your description fully until you introduce a net new term in
> the description: "anonymization". Why do this? If you avoid this and
> use your definition of "identifiability", our definitions are far
> closer (which I believe is a logical conclusion based on your original
> definition of identifiability).
>
> Thoughts?
>
> -----
> <unchanged>
> Identifiability:
> Linkable is not the same as identifiable. To determine whether data
> is identifiable, account should be taken of all the means which can
> reasonably be used by the privacy stakeholder holding the data, or by
> any other party, to identify that natural person.
> </unchanged>
>
> <updated>
> De-identification:
> De-identification is a process towards removing identifiability.
>
> De-identified data:
> De-identified data is data that is not reasonably identifiable to a
> natural person.
> </updated>
> -----
>
> If these changes are accepted, then we'd slightly modify the body of
> the proposed text:
>
> -----
> <Moved to Normative Text - Updated>
> The RED state data may contain data unaltered from initial
> collection. In order to go from the RED state to the YELLOW state,
> directly identifiable information MUST be removed to move the data to a
> de-identified state. YELLOW state MAY contain information indirectly
> linked to an individual, computer or device, but in and of itself is not
> identifiable. GREEN state data is de-identified and unlinked data and
> MUST NOT contain identifiable information. Any risk for
> re-identification of fully de-identified data MUST be regularly
> assessed and mitigated through Privacy Risk Management.
> </Normative>
> -----
>
> - Shane
>
> -----Original Message-----
> From: Rob van Eijk [mailto:rob@blaeu.com]
> Sent: Wednesday, May 29, 2013 9:28 AM
> To: public-tracking@w3.org
> Subject: RE: ACTION-406: Propose a new set of names around yellow
> state
>
>
> Following up on the call today, here are the notes that I put forward
> for the minutes. (too much to paste into irc).
>
> PII:
> This standard refers to the ISO 29100 (privacy framework) definition
> of personally identifiable information (PII):
> any information that (a) can be used to identify the PII principal to
> whom such information relates, or (b) is or might be directly or
> indirectly linked to a PII principal.
> NOTE To determine whether a PII principal is identifiable, account
> should be taken of all the means which can reasonably be used by the
> privacy stakeholder holding the data, or by any other party, to
> identify that natural person.
>
> Linkability:
> Linkability is about the ability to add new data to previously
> collected data.
>
> Identifiability:
> Linkable is not the same as identifiable. To determine whether data
> is identifiable, account should be taken of all the means which can
> reasonably be used by the privacy stakeholder holding the data, or by
> any other party, to identify that natural person.
>
> De-identification:
> De-identification is a process towards anonymization.
>
> De-identified data:
> De-identified data is data that is not linked or reasonably linkable
> to an individual or to a particular computer or device.
>
> To accomplish de-identification, I propose a 3-step model.
> The mental model contains 3 types of data: red, yellow and green data.
>
> The RED state data may contain (a) and (b). In order to go from the
> red state to the yellow state, directly identifiable information MUST be
> removed, e.g. an email address or a phone number.
> The YELLOW state data is partly de-identified, and MAY contain
> information indirectly linked to an individual, computer or device,
> e.g. a partly de-identified but still linkable unique identifier,
> such as a hashed pseudonym.
> The GREEN state data is fully de-identified data and SHOULD NOT
> contain personally identifiable information (PII). Any risk for
> re-identification of fully de-identified data MUST be regularly
> assessed and mitigated through Privacy Risk Management.
>
> In order to move from red to yellow, or from yellow to green, one
> needs to process the data. There are multiple ways to do that:
>
> 1. One example is based on concatenating a random number to the
> unique ID. This results in a lookup table of unique ID <-> random
> number.
> Getting from yellow to green is breaking the link (un-linkability) by
> throwing away the unique ID. No new data can be linked to the
> un-linkable data in the green.
>
> 2. Another example is based on rotating hashes. Getting from red to
> yellow is applying the hash. Getting from yellow to green is breaking
> the link (un-linkability) by throwing away the salt. No new red data
> can be linked to the un-linkable data in the green.
>
> In terms of unlinkability versus de-identification it remains
> important to separate the two concepts:
> - de-identification helps in the event of a data breach, when a
> dataset is out on the street due to e.g. a data breach. It is a way to
> address the reasonable requirements of an adequate level of
> protection.
> - an adequate level of protection is completely different from
> unlinkability. Unlinkability is connected to the notion of personally
> identifiable information.
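
The two example processes in the quoted notes (a lookup table of random numbers, and rotating salted hashes) can be sketched roughly as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical, and HMAC-SHA256 is swapped in as the "salted hash" since the notes do not name a specific hash construction.

```python
import hashlib
import hmac
import secrets


class LookupTablePseudonymizer:
    """Example 1 (hypothetical helper): unique ID <-> random number table."""

    def __init__(self):
        self._table = {}  # RED unique ID -> random YELLOW pseudonym

    def pseudonym(self, unique_id: str) -> str:
        # RED -> YELLOW: the same ID always maps to the same random value,
        # so new data can still be linked while the table exists.
        if unique_id not in self._table:
            self._table[unique_id] = secrets.token_hex(16)
        return self._table[unique_id]

    def discard_table(self) -> None:
        # YELLOW -> GREEN: throwing away the unique IDs breaks the link.
        self._table.clear()


class RotatingHashPseudonymizer:
    """Example 2 (hypothetical helper): salted hash with a rotating salt."""

    def __init__(self):
        self._salt = secrets.token_bytes(32)

    def pseudonym(self, unique_id: str) -> str:
        # RED -> YELLOW: apply the keyed hash (HMAC-SHA256 is an assumption).
        return hmac.new(
            self._salt, unique_id.encode("utf-8"), hashlib.sha256
        ).hexdigest()

    def rotate(self) -> None:
        # YELLOW -> GREEN for previously issued pseudonyms: the old salt is
        # thrown away, so no new RED data can be mapped onto them.
        self._salt = secrets.token_bytes(32)
```

In both sketches, data stored under a pseudonym before `discard_table()` or `rotate()` is called keeps its pseudonym, but nothing collected afterwards can be joined to it. Whether that alone is enough for the GREEN state still depends on the re-identification risk assessment the notes call for.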
Received on Thursday, 30 May 2013 19:22:18 UTC