
Re: ISSUE-5: Consensus definition of "tracking" for the intro?

From: Roy T. Fielding <fielding@gbiv.com>
Date: Tue, 5 Nov 2013 14:39:34 -0800
Cc: "public-tracking@w3.org (public-tracking@w3.org)" <public-tracking@w3.org>
Message-Id: <498CCC32-5E0E-4B5D-846E-62F9EE408074@gbiv.com>
To: David Singer <singer@apple.com>

On Nov 5, 2013, at 12:57 AM, David Singer wrote:
> On Oct 18, 2013, at 9:33 , Roy T. Fielding <fielding@gbiv.com> wrote:
> 
>>> So, concretely, a hidden third-party tracker on a page can remember that you visited that page, or not?  If not, can it remember the nature of the site you visited (it was a guns and ammo kind of site)?  When you made the transaction?  Your IP address, geolocation, local time of day, user-agent, ?
>> 
>> All of that data is user activity in the first party context.
> 
> I was asking specifically about the 3rd party, recording what it gets in, and derived from, the HTTP requests.

The first two questions were about the context in which the user's
activity took place.  If the third party only collects such data at
one location (e.g., contracted single-site web analytics), then
it is not tracking because it can't observe the user in any other
context.  If the third party does observe a particular user's activity
in any other context, or if its definition of "you" comes from some
other context, then it is tracking.

The time an embedded request is made is not itself tracking.
The IP address received in a request from a single context is not
tracking because it does not (by itself) cause the user to be observed
at multiple contexts. Geolocation would naturally depend on its
granularity, but it isn't really an issue for third party requests.
Recording a user agent string is not (by itself) tracking.  Using
any of the above to construct a tracking algorithm for the sake of
following a user across multiple contexts via data correlation
is tracking.

Again, let me reiterate: my proposed definition is not limited to
a single network interaction, nor does it depend on DNT.  It merely
states conditions that a USER would consider tracking, and would
expect not to be applied for any data collected with DNT:1 set.
At the same time, I am not including the collection of personal
data for the sake of a single context, even if that personal data
is being collected by a third party, because it is only for the sake
of that single context (and not accessible outside of that context).
The fact that such data could, in fact, be used by someone else
to track the user simply doesn't matter -- as soon as they do so,
they are tracking, as defined.

DNT is not a security protocol.  It does not prevent tracking.
It only defines and expresses a desire, and that desire will either
be obeyed or not.  It is therefore impossible to construct a screw
case wherein a user is actually being tracked across multiple
contexts that would somehow escape the definition as proposed.

>> If the
>> third-party tracker observes it, then any of the following will cause
>> it to be tracking under this definition:
>> 
>> 1) the third party observes the user's browsing activity in any
>>    other context, including one where it is the first party;
>> 
>> 2) the data is provided to anyone other than the first party and
>>    they combine it with observations obtained from any other context.
>> 
>> This is analogous to walking down the street, seeing a person with
>> an unusual t-shirt, saying Hi, and continuing on with your walk.
>> If you don't see that person again (or at least don't recognize
>> them in a different shirt), then it cannot be tracking.  If you
>> do see them again, at the same location, then it still isn't tracking.
>> If, however, you see and recognize them again in a different location
>> and choose to remember that fact, then you have tracked them.
> 
> Tracking only happens the 2nd and subsequent times??

At least two locations (contexts) are necessary to be considered
tracking.  Otherwise, we aren't talking about a track.  A single
point is not a track.
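
As a rough illustration of the "at least two contexts" condition, the
sketch below flags a user key only when one party's retained
observations of that key span more than one context.  The function
name, the (user_key, context) record shape, and the example hostnames
are assumptions made for illustration; they are not part of the
proposed definition or of any DNT specification text.

```python
from collections import defaultdict


def tracked_users(observations):
    """Return the user keys that one party has observed in 2+ contexts.

    `observations` is an iterable of (user_key, context) pairs, a
    hypothetical record shape for whatever that party has retained.
    """
    contexts_by_user = defaultdict(set)
    for user_key, context in observations:
        contexts_by_user[user_key].add(context)
    # A single context, however often revisited, is a point, not a track.
    return {user for user, ctxs in contexts_by_user.items() if len(ctxs) > 1}


# Hypothetical log retained by one third party.
log = [
    ("ua-1", "news.example"),
    ("ua-1", "news.example"),   # repeat visit, same context
    ("ua-2", "news.example"),
    ("ua-2", "shop.example"),   # same user key, different context
]
print(tracked_users(log))
```

Under these assumptions only "ua-2" is flagged: repeat visits within
one context do not form a track, while a second context does.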

>>> This seems to permit the accumulation, by third parties, of a lot of data about the user, and I am unsure if that's your intent, or it's accidental, or a misread on my part.
>> 
>> Yes, a third party can learn the data provided by the user agent in
>> a specific context.  The immediate example of that is contextual
>> advertising, which we already agreed is not tracking.
> 
> That's not learning, that's using the data in the transaction itself.

There is no relevant distinction between those terms.

>> Note, however, that all of your examples assume that they also know
>> who "you" is.  Why do you think the third party would know that
>> information?
> 
> I tend to provide an IP address and other distinguishing data, such as a fingerprint, in transactions.  The third party could have cookied me the first time, too.

At a different context, right?

>> If they are relying on any other information, from any
>> other source, that has the effect of identifying you, then they are
>> already tracking according to that definition.
>> 
>>>> The reason it is there is because
>>>> the verb tracking and the privacy concern we are trying to address
>>>> are both about identifying the trail of an individual as they
>>>> proceed from place to place.  Specifically, remembering that a
>>>> person was at a single place is not tracking unless that memory
>>>> is shared with someone else or combined with memories of other
>>>> places.
>>> 
>>> But the next and subsequent times I visit a site that has the same third-party tracker on it, and they are allowed to remember some data that's associated with me, how is it NOT forming a trail?
>> 
>> Because it is the same context.  The fact that a given user agent
>> visited the same site more than once is not a privacy concern if
>> the third party doesn't know anything else about the user.
> 
> No, I visit two DIFFERENT sites with the SAME 3rd party tracker.  What is that 3rd party allowed to remember under your definition?  What is NOT tracking data?

Retention of anything about the context in which those embedded requests
were made that remains tied to a particular user would amount to tracking,
regardless of how that data is obtained.

> Saying the 2nd transaction is tracking is double-speak; I don't know it's the 2nd unless I can correlate it with the first.

Which is why your questions are not relevant to tracking.  If you want
to define a protocol for anonymous browsing, don't call it "Do Not Track".
The privacy concern we are addressing in *this* protocol is the personal
knowledge gained by observing a user across multiple contexts
(i.e., the ability to learn something new about, or build a profile on,
the user by observing their activity at multiple contexts).

....Roy
Received on Tuesday, 5 November 2013 22:39:56 UTC
