On Oct 28, 2011, at 7:06 AM, Amy Colando (LCA) wrote:
> Thanks Jonathan. Isn’t it the case that these browser features would still have to be related to some sort of identifier – whether on client side or server side – in order for the information to be identifiable? And therefore we can stick with pseudonymous, passively collected data?
>
> IOW, if all I have is an aggregated laundry list of browser features that are used by multiple users, where there is no way to say that a set of browser features belongs to a particular record because of the way the identifiers have been removed or the log files scrubbed, then is there a way to relate to a specific identifier? I think your example below requires the log files to associate the browser features to a particular record or browser, but wanted to make sure I am thinking about this correctly.
I think the point that Jonathan is making here is that, no, you don't need to associate that list of browser details with a particular identifier in order for the information to be pseudonymously identifiable: a sufficiently distinctive combination of browser details can itself act as the identifier. If you record the same browser details (or even a subset of those details) in future logs, then you or some malicious entity could match records on those details and reconstruct a pseudonymous history.
You might be interested in http://panopticlick.eff.org/ and the follow-up paper from Peter Eckersley.
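To make the linking concrete, here is a rough sketch in Python of how log records that carry no cookie or user ID could still be grouped by their browser details. Everything in it is an assumption for illustration: the feature names (`user_agent`, `fonts`, `timezone`), the record layout, and the helper names `fingerprint` and `link_visits` are hypothetical and not taken from any real logging system or from Panopticlick itself.

```python
import hashlib
from collections import defaultdict


def fingerprint(features):
    """Derive a stable key from a dict of browser details (hypothetical fields)."""
    # Sort the keys so the same configuration always yields the same string,
    # regardless of the order in which the details were logged.
    canonical = "|".join(f"{key}={features[key]}" for key in sorted(features))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def link_visits(log_records):
    """Group visited URLs by fingerprint; no explicit identifier is consulted."""
    histories = defaultdict(list)
    for record in log_records:
        histories[fingerprint(record["features"])].append(record["url"])
    return dict(histories)


# Made-up log entries: none of them contains a cookie or user ID.
logs = [
    {"url": "/news",
     "features": {"user_agent": "UA-1", "fonts": "Arial,Times", "timezone": "-480"}},
    {"url": "/health",
     "features": {"timezone": "-480", "user_agent": "UA-1", "fonts": "Arial,Times"}},
    {"url": "/sports",
     "features": {"user_agent": "UA-2", "fonts": "Helvetica", "timezone": "60"}},
]

histories = link_visits(logs)
for key, urls in histories.items():
    print(key, urls)
```

In this toy data the first two entries share a configuration, so they end up in the same history even though nothing in the log names a user. How well this works in practice depends on how distinctive the recorded details are, which is what the Panopticlick measurements are about.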
(Apologies if I'm misunderstanding you here, Amy; if you're talking about aggregated records or scrubbing log files to remove records associated with a particular browser configuration, then the data may in fact not be easily re-identifiable.)
Thanks,
Nick