Re: ISSUE-5: What is the definition of tracking?

From: Jonathan Mayer <jmayer@stanford.edu>
Date: Fri, 28 Oct 2011 02:11:07 -0700
Cc: Sean Harvey <sharvey@google.com>, "public-tracking@w3.org Group WG" <public-tracking@w3.org>
Message-Id: <973335CA-1768-4AB3-BEA4-115CDC0AC2AC@stanford.edu>
To: David Wainberg <dwainberg@appnexus.com>

Here's an illustrative hypothetical.  Suppose that, on each page where it's embedded, a third party logs a set of browser features (e.g. user agent, plugins, screen dimensions) plus the page URL.  And suppose the third party makes no attempt to pseudonymously identify users.  The third party then suffers a data breach, and malcontents apply trivial fingerprinting algorithms to the data to reconstruct pseudonymous user browsing histories.

Note that the third party did not hold pseudonymously identified browsing histories - it held pseudonymously identifiable browsing histories.  But that still gives rise to real privacy risks.
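
To make the hypothetical concrete, here is a minimal sketch in Python of how such a reconstruction might go.  The log rows, field names, and example URLs are all made up for illustration, and the exact-match grouping is a deliberate simplification; real fingerprinting would more likely use fuzzy or probabilistic matching across features.

```python
import hashlib
from collections import defaultdict

# Hypothetical breached log: browser features plus page URL.
# Note that no user identifier of any kind is stored.
LOG = [
    {"user_agent": "UA-A", "plugins": ["Flash", "Java"], "screen": (1280, 800),
     "url": "https://news.example/politics"},
    {"user_agent": "UA-B", "plugins": ["QuickTime"], "screen": (1920, 1080),
     "url": "https://shop.example/cart"},
    {"user_agent": "UA-A", "plugins": ["Java", "Flash"], "screen": (1280, 800),
     "url": "https://health.example/symptoms"},
]


def fingerprint(row):
    """Derive a pseudonym after the fact from the logged features (assumed fields)."""
    features = (row["user_agent"], tuple(sorted(row["plugins"])), tuple(row["screen"]))
    return hashlib.sha256(repr(features).encode("utf-8")).hexdigest()[:12]


def reconstruct_histories(log):
    """Group logged URLs by fingerprint, yielding pseudonymous browsing histories."""
    histories = defaultdict(list)
    for row in log:
        histories[fingerprint(row)].append(row["url"])
    return dict(histories)


if __name__ == "__main__":
    for pseudonym, urls in reconstruct_histories(LOG).items():
        print(pseudonym, urls)
```

The point of the sketch is that the pseudonym is computed only at analysis time, by whoever holds the data.  The stored rows were never keyed by an identifier, yet they group into per-browser histories all the same.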

On Oct 27, 2011, at 12:30 PM, David Wainberg wrote:

> I don't find it excessively nitpicky. It's relevant. Please elaborate. It seems that somewhere the data has to be associated in some way with a distinct user. 
> 
> On 10/27/11 1:14 PM, Jonathan Mayer wrote:
>> 
>> Fragmented or probabilistic tracking data might not be stored with a hash or other single identifier.  The privacy risk would, of course, be the same.  (I don't mean to be excessively nitpicky - a few months ago my team looked at a third party doing fingerprinting of just this sort.)
>> 
>> On Oct 27, 2011, at 9:02 AM, David Wainberg wrote:
>>> 
>>>> On Oct 25, 2011, at 2:13 PM, David Wainberg wrote:
>>>> 
>>>>> 
>>>>> On 10/24/11 8:18 PM, Jonathan Mayer wrote:
>>>>>> 
>>>>>> I would strongly oppose limiting our definition of tracking to only cover pseudonymously identified or personally identified data.  There are a number of ways to track a user across websites without a single pseudonymous or personal identifier.
>>>>> I'm not sure what you mean here. Can you provide examples?
>>>> Any means of tracking that relies on fragmented or probabilistic information.  For example, browser fingerprinting.  (See Peter Eckersley's paper "How Unique Is Your Web Browser.")
>>> Ah. I would have included that in pseudonymously identified, because if data is stored against it by the server, it will be stored against a hash or something based on the fingerprint.
>> 
Received on Friday, 28 October 2011 09:11:45 UTC