Re: Deidentification (ISSUE-188)

On Jul 16, 2014, at 7:44 AM, TOUBIANA Vincent wrote:

> Hi Justin,
>  
> I’d like to propose a definition of de-identification which is closer to the concept of anonymization defined in the Article 29 Opinion (http://ec.europa.eu/justice/data-protection/article-29/documentation/opinion-recommendation/files/2014/wp216_en.pdf).
>  
> A dataset is de-identified when it is no longer possible to:
> - isolate some or all records which correspond to a device in the dataset,
> - link at least two records concerning the same device,
> - deduce, with significant probability, the value of an attribute from the values of a set of other attributes.
>  
> The third criterion may -- in some cases -- go beyond de-identification but the first two are, in my opinion, required to limit re-identification risks.

No.  A set of log entries for a single request might consist of ten
to twenty records with the same request-id, which is no more an indication
of tracking than having a single very large record with the same information.
The mechanism of records has no relevance to the actual privacy concern,
which is that the data can be linked to a particular user.  How many
records that involves, or how many deductions are needed, is superfluous.
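
To make the request-id point concrete, here is a minimal sketch.  The
field names and values are hypothetical, not taken from any real log
format.  It shows that several entries sharing one request-id can be
collapsed into one wide record and expanded back again without loss,
so the ability to link those entries adds nothing by itself.

```python
from collections import defaultdict

# Hypothetical log entries for a single request (illustrative fields only).
log_entries = [
    {"request_id": "r-1", "field": "uri", "value": "/index.html"},
    {"request_id": "r-1", "field": "status", "value": 200},
    {"request_id": "r-1", "field": "user_agent", "value": "ExampleBrowser/1.0"},
]


def merge_by_request(entries):
    """Collapse entries sharing a request_id into one wide record per request."""
    merged = defaultdict(dict)
    for entry in entries:
        merged[entry["request_id"]][entry["field"]] = entry["value"]
    return dict(merged)


def split_into_entries(merged):
    """Expand wide records back into one entry per field."""
    return [
        {"request_id": request_id, "field": field, "value": value}
        for request_id, fields in merged.items()
        for field, value in fields.items()
    ]


wide = merge_by_request(log_entries)
round_trip = split_into_entries(wide)
```

Here `round_trip` equals `log_entries`: the two layouts hold the same
information, and neither one says whether the data can be tied to a
particular user.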

In any case, I have no idea what the third bullet means, and I am pretty
sure that I would not consider it de-identified if the data set included
a single record with my name and home address.

....Roy

Received on Thursday, 17 July 2014 21:09:46 UTC