
Issue-5, trying to find a middle ground

From: David Singer <singer@apple.com>
Date: Fri, 11 Oct 2013 15:43:35 -0700
Message-id: <EB36A818-873F-411E-B8BA-A2C0648A417B@apple.com>
To: "public-tracking@w3.org (public-tracking@w3.org)" <public-tracking@w3.org>
Looking at the change proposals at http://www.w3.org/wiki/Privacy/TPWG/Change_Proposal_Tracking_Definition, I tried to find the key ideas and points in them.  Since there is at least one significant point of difference, I have worked this up as a formal definition, but of course we could instead say:

"In rough terms, tracking is …" "The precise definition is effectively the effect of the rules that this document defines for parties that conform to this recommendation."


Key Ideas

1)     Roy’s definition (1) correctly uses a word other than site; we don’t mind data flowing within the context of a single controller.
2)     I think Roy’s definition (1) is saying that it’s the connecting of the user with one or more (‘multiple’) contexts other than the context that received the transaction that’s a problem. This is roughly what I previously described as ‘tunnel vision’: connecting the user with any other context than the recipient. This might be OK, and it certainly solves a long-standing problem with my (3) – ‘normal’ logs get pulled into the dragnet in (3), which is unpleasant, and maybe unworkable.
3)     My old definition (3) intends a network interaction to be the HTTP request/response. The current draft has it as a ‘page’, but servers don’t know about pages; and anyway, what is supposed to happen if the DNT signal is inconsistent across the various requests for the parts of a page?  We need decent definitions of ‘retention/retain’ and of ‘network interaction’: a ‘network interaction’ (or ‘network transaction’) is an HTTP request and its response, and ‘retain/retention’ is holding data after the ‘network transaction’ is complete.
4)     Rob’s definition doesn’t have the carve-out for responding to the transaction; however, he seems to be arguing in the non-normative text that even when responding to the transaction, the site should not gather extra data about the user. I am not sure this is tenable; converting an IP address into a location and hence into a time of day would be gathering extra data, and we may want to allow that (we probably do).
5)     Roy’s definition of context in (1) seems to be what we should define as a ‘party’ (see separate issue and discussion).
6)     We don’t want data shared around even in-transaction, so we need a definition of sharing that simply says that the data crosses contexts.
 
[My previous tunnel-vision was described in http://lists.w3.org/Archives/Public/public-tracking/2012Jan/0227.html.]

Existing definitions:

The first of these has a serious problem, which makes the second one unmanageable:
 
“A network interaction is the set of HTTP requests and responses, or any other sequence of logically related network traffic caused by a user visit to a single web page or similar single action. Page re-loads, navigation, and refreshing of content cause a new network interaction to commence.”
 
A server has no idea what a page is; it gets requests for resources and responds.  In particular it has no idea when a page load is complete, so the point at which a network interaction ends is indeterminate, and with it the scope of:
“A party retains data if data remains within a party's control beyond the scope of the current network interaction.”


Suggestion:

If I read it right, we have a choice here, between (in colloquial terms)
a)     stop remembering data about me
b)     stop remembering where I have been on the web (apart from you)
Since the first has obvious practical issues, let’s see if we can settle on the second (which has obvious ‘slightly tracking’ issues).
 
New/improved definitions:
 
‘Network Transaction’: an HTTP request and its matching response, or the equivalent in another protocol;
‘Context’: a set of resources that share the same data controller and a common branding [but this probably should become the definition of ‘party’]
‘Retain/retention’: data is retained if it is held after a Network Transaction is complete
‘Share’: data is shared if it is passed by one Context to another Context
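
To make the four definitions concrete, here is a minimal Python sketch of one possible reading of them. All class, field, and function names are hypothetical illustrations (none come from the change proposals), and the `complete` flag is an assumed stand-in for "the response has been sent".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    # Assumed fields: the definition only says "same data controller
    # and a common branding".
    controller: str
    branding: str


@dataclass(frozen=True)
class NetworkTransaction:
    # One HTTP request and its matching response (not a whole page load).
    request_url: str
    received_by: Context
    complete: bool  # assumed flag: True once the response has been sent


def is_retained(transaction: NetworkTransaction, still_held: bool) -> bool:
    """Data is retained if it is held after the Network Transaction is complete."""
    return transaction.complete and still_held


def is_shared(sender: Context, receiver: Context) -> bool:
    """Data is shared if it is passed by one Context to another Context."""
    return sender != receiver
```

Because a transaction is a single request and response, the server can always tell when it is complete, which is the point of replacing the page-based ‘network interaction’.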
 
[I don’t think we need, for this definition:
‘Collect’: data is collected if it was not present in the transaction but is subsequently retrieved and associated with it
‘Use’:  if you have not retained the data, you can’t use it, and we need to allow the data to be used to service the transaction in which it occurs]

So, here we go:
 
Tracking is the Retention or Sharing of data, of a given request, that can associate (A) a Context other than the context that received the request, with (B) a particular user, user agent, or device (maybe ‘the user, user agent, or device that made the request’?).
 
 
Note that this does allow some retention of personal data. Under this rule, a site can keep ordinary log files that include such normal fields as IP address, user-agent, and so on. Frequency capping is also possible; as long as you remember only data that associates the user with the ads you served, you are fine. You cannot associate the user with other sites, however, notably first-party sites.
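
The rule and the cases in this note can be sketched as a small predicate. This is only an illustration of one reading of the definition, not normative text; the `Record` type, its fields, and the example context names (`adco`, `news.example`) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Record:
    receiving_context: str       # Context that received the request
    user_key: Optional[str]      # identifies a user, user agent, or device; None if nothing does
    linked_contexts: frozenset   # Contexts this record associates with that user
    retained_or_shared: bool     # held after the transaction, or passed to another Context


def is_tracking(record: Record) -> bool:
    """True if the record associates (A) a Context other than the
    receiving one with (B) a particular user, user agent, or device."""
    if not record.retained_or_shared:
        return False  # used only to service the transaction
    if record.user_key is None:
        return False  # nothing ties the data to a particular user
    other_contexts = record.linked_contexts - {record.receiving_context}
    return bool(other_contexts)


# Ordinary log line kept by the ad server: mentions only its own Context.
log_entry = Record("adco", "203.0.113.7", frozenset({"adco"}), True)

# Frequency cap: remembers only which of its own ads this user saw.
freq_cap = Record("adco", "cookie-123", frozenset({"adco"}), True)

# Remembering the first-party site on which the ad was shown.
cross_site = Record("adco", "cookie-123", frozenset({"adco", "news.example"}), True)
```

Under this reading, `is_tracking` is false for `log_entry` and `freq_cap` and true for `cross_site`, matching the three cases described in the note.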

As I say, we could bracket this with "in rough terms…"

David Singer
Multimedia and Software Standards, Apple Inc.
Received on Friday, 11 October 2013 22:44:05 UTC
