
Re: Discussion of Issue-5 (definition of tracking), and a new CP

From: Brad Kulick <kulick@yahoo-inc.com>
Date: Tue, 8 Oct 2013 22:22:17 +0000
To: David Singer <singer@apple.com>
CC: Shane M Wiley <wileys@yahoo-inc.com>, "(public-tracking@w3.org)" <public-tracking@w3.org>
Message-ID: <8A32B85F-4694-4864-983D-BF76E54C21CB@yahoo-inc.com>

What about:

"Tracking is the retention or use, after a network interaction is complete, of data across non-affiliated websites that are, or can be, associated with a specific user, user agent, or device."

On Oct 8, 2013, at 11:15 AM, David Singer wrote:

On Oct 8, 2013, at 10:47, Shane M Wiley <wileys@yahoo-inc.com> wrote:


While I appreciate the more simplified approach, it's exactly this type of definition that overly includes all possible manner of internet interactions within its scope.  I'd rather we better define the exact issue(s) we're attempting to solve for and then allow that (or its opposite) to be the definition of Tracking.

My reasoning for this approach is that the "broader" approach sets up a false expectation of what DNT is about.  If we nest the true operational reality of DNT too deeply within the standards document, I believe it's a fair belief that non-technical / non-policy professionals will not understand the nuance and will jump to the wrong conclusion when tracking is still occurring.

I favor a slightly more robust definition of tracking that tries to pull some of this nuance directly into the definition (cross-site, for example).

OK, I'm fine with debate.  But what do you mean by 'cross-site'?  I offered a definition before that was rejected ('tunnel vision' -- you are only allowed to record data that connects the user to your own site, not any other).  I kinda liked it because though it allowed some 'tracking' (recording of data about you) it seemed sufficiently narrow in scope that perhaps it was OK, it allows things like frequency capping, and it would enable us to discard the first-party/third-party distinction, as first parties typically only need to remember your interaction with them (and not any other party).  It's not the way we have gone, however.

So, have at it:  what is specifically too broad with the definition I gave, and what do you mean by cross-site?

- Shane

-----Original Message-----
From: David Singer [mailto:singer@apple.com]
Sent: Monday, October 07, 2013 5:53 PM
To: (public-tracking@w3.org)
Subject: Discussion of Issue-5 (definition of tracking), and a new CP

I think there are some good proposals on the table for this.  I'd like to add one, and a (short, I hope) discussion.

As introduction, I think it's helpful to lay out, informally, the 'creepy' that we have talked about.  It's something like "Wherever I go [on the web], organizations that I am unaware of, and for the most part cannot see, are watching me, keeping records, using those records later, and sharing them with others."

The 'multiple sites' comes in both that there are multiple organizations doing this, and that the same organization may be doing this on multiple of the sites I visit.

However, we need a definition of tracking that will:
* help to define what's possibly in scope, and definitely out of scope, for our specification;
* inform users as to what it is that this specification helps them express;
* inform site operators what it is that this specification allows users to ask them.

For the third, I think it important that the definition allow a *single* site to answer the question "am I tracking?" and hence "am I possibly affected by this specification?"  Thus the question of 'multiple sites' should not, I think, factor into the definition.  Yes, the user is freaked that lots of sites are doing it, but we need to give a definition that a single site can work with.

** I'd like therefore to have a 'null change proposal' on the table for the existing definition:

"Tracking is the retention or use, after a network interaction is complete, of data that are, or can be, associated with a specific user, user agent, or device."

I think we had this on the table for so long that what I meant by it may be lost in the mists of time.  I wanted to try for a definition that was short, simple, and set a clear edge for what is *out* of scope. By "network interaction" I mean literally an HTTP request and its response (or similar for other protocols).  It's clearly impossible to respond if (for example) you cannot remember the IP address of the requester, and indeed it seems that using the data *present in the transaction* for the purposes of responding to that transaction should be OK and thus out of scope.  The creepy thing is the remembering, not the immediate action (if any, such as serving an ad).  So this definition leaves as clearly out of scope (a) data used in the servicing of the transaction and (b) data after the transaction which doesn't tie back to the user (or his device or user agent).

However, the definition in 2.3 "Network Transaction" goes on to define "Network Interaction" as a set of requests and responses, so these two don't tie together properly.  In particular, I don't think a server has a way of knowing when the user agent has finished asking for resources;  does HTTP have a well-defined concept of a "single web page"?  So there are two problems here:  (a) the section name and the definition do not match and (b) the termination is unclear to the server.  So, I think we need the supporting definition:

Network Interaction: an HTTP request and its response, or the equivalent in another protocol.

I would like this change also added to the 'null change' CP, i.e. make the definition of 'network interaction' crisp and machine-testable.
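
As an illustration of what "machine-testable" could mean here, the following is a minimal sketch in Python, not part of any proposal or specification text. It assumes a network interaction is exactly one request and its response, and that a site can label each retention or use of data as linkable (or not) to a specific user, user agent, or device. The names NetworkInteraction, DataUse, and is_tracking are hypothetical and exist only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NetworkInteraction:
    """Hypothetical model: one HTTP request and its response."""
    request_id: str
    response_sent: bool = False

    @property
    def complete(self) -> bool:
        # Under the narrow definition, the interaction ends once the
        # response to this single request has been sent.
        return self.response_sent


@dataclass(frozen=True)
class DataUse:
    """Hypothetical model: one retention or use of a data item."""
    interaction: NetworkInteraction
    # True if the data are, or can be, associated with a specific
    # user, user agent, or device.
    linkable_to_user: bool


def is_tracking(use: DataUse) -> bool:
    """Apply the 'null change' definition to a single retention or use."""
    return use.interaction.complete and use.linkable_to_user


done = NetworkInteraction("req-1", response_sent=True)
in_flight = NetworkInteraction("req-2", response_sent=False)

# (a) Data used while servicing the transaction: out of scope.
serving = is_tracking(DataUse(in_flight, linkable_to_user=True))
# (b) Data kept afterwards that does not tie back to the user: out of scope.
aggregate = is_tracking(DataUse(done, linkable_to_user=False))
# Data kept afterwards that can be tied to the user: in scope.
retained = is_tracking(DataUse(done, linkable_to_user=True))
```

Under these assumptions a single site can evaluate the predicate on its own, without knowing what any other site does.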

I personally prefer 'data' as singular, but the existing CP (5) that claims to be an editorial revision of this doesn't seem to be that at all:

"Tracking is the collection of data across multiple parties' domains or services and retention of that data in a form that remains attributable to a specific user, user agent, or device."

Could CP(5) be labelled as an alternative, please?  I don't see it as editorial.

Reading the existing definition, and the CPs, I don't think (with the possible exception of 'leave it undefined') we are very far apart.  I rather suspect we can come to a consensus on something a little more explanatory than the existing text, while not being quite as long or complex as CP(1).  In particular, I think it's fine to set up a rough delineation of what is clearly out of scope (in-transaction, data not associated with the user etc.) and then delineate in the text the precise permitted data, permitted uses, and all the nuances for first and third parties.

David Singer
Multimedia and Software Standards, Apple Inc.

Received on Tuesday, 8 October 2013 22:23:17 UTC
