- From: Roy T. Fielding <fielding@gbiv.com>
- Date: Thu, 17 Jul 2014 14:09:22 -0700
- To: TOUBIANA Vincent <vtoubiana@cnil.fr>
- Cc: "Justin Brookman" <jbrookman@cdt.org>, <public-tracking@w3.org>
- Message-Id: <55AB5613-EAA0-40DD-8747-31E29BD5C792@gbiv.com>
On Jul 16, 2014, at 7:44 AM, TOUBIANA Vincent wrote:

> Hi Justin,
>
> I’d like to propose a definition of de-identification which is closer to the concept of anonymization defined in the Article 29 Opinion (http://ec.europa.eu/justice/data-protection/article-29/documentation/opinion-recommendation/files/2014/wp216_en.pdf).
>
> A data-set is de-identified when it is no longer possible to:
> - isolate some or all records which correspond to a device in the dataset,
> - link, at least, two records concerning the same device,
> - deduce, with significant probability, the value of an attribute from the values of a set of other attributes.
>
> The third criteria may -- in some cases -- go beyond de-identification but the first two are, in my opinion, required to limit re-identification risks.

No. A set of log entries for a single request might consist of ten to twenty records with the same request-id, which is no more an indication of tracking than having a single very large record with the same information. The mechanism of records has no relevance to the actual privacy concern, which is that the data can be linked to a particular user. How many records that involves, or how many deductions are needed, is superfluous.

In any case, I have no idea what the third bullet means, and I am pretty sure that I would not consider it de-identified if the data set included a single record with my name and home address.

....Roy
Received on Thursday, 17 July 2014 21:09:46 UTC