Re: anonymous or no?

On Aug 12, 2010, at 18:02, Renato Iannella wrote:

> On 12 Aug 2010, at 04:49, David Singer wrote:
>> But now, as we know, people are getting very good at re-identification.  Clearly I don't like it if someone says "I'm 95% sure that the guy who bought these five books, is that Dave Singer who attends the W3C".  I'd like to say "not only must my records be anonymized, but re-identification should not occur either".
>> But this flies directly in the face of a very long-established principle, that the analysis and drawing of conclusions from public data is a legitimate, indeed even intended, usage of that public data.  And setting that rule would also drive re-identification "underground" -- people would still do it, they just wouldn't publish the results, which is *worse*.
> The analysis of "public data" would (should?) never get to the stage of identifying individuals.

If we write this as a rule, we'll just see people saying "the person who bought these pharmaceuticals is someone who lives in Australia and does standards work at the W3C and has a particular interest in rights expression languages", but we didn't IDENTIFY him, oh no.  Where is the line?

> In your use case, they may come up with "People who buy HTML books are more than likely to be associated with a Web Standards group".
> If they did - and could identify - you as an individual, then the original "anonymisation" was flawed, and they are liable to the original agreement....

But the people doing the analysis and identification might be doing it based on public (anonymized) data and be under no agreement at all.  Is the company that released (correctly) anonymized data liable for the ingenuity of others in using that data?

David Singer
Multimedia and Software Standards, Apple Inc.

Received on Friday, 13 August 2010 15:58:25 UTC