
[Issue-5] [Action-77] Defining Tunnel-Vision 'Do Not (Cross-Site) Track'

From: David Singer <singer@apple.com>
Date: Sun, 29 Jan 2012 16:15:17 +0000
Message-id: <0EEE950D-D696-4C60-B101-14ED3F11F5DD@apple.com>
To: "public-tracking@w3.org (public-tracking@w3.org)" <public-tracking@w3.org>
This is a revision of my previous email, and a response to Action-77, which is one of 6 (?) actions related to Issue-5.  Please ask questions as needed to clarify, and I will write a composite revised definition, so we can close Action-77, and (once that's been done for the other formulations) Issue-5.

This is an alternative to restricting tracking via a 1st/3rd party distinction. I want to emphasize that I am doing this to explore and learn, not to 'promote' any particular direction.  I hope people find it helpful.

(All these definitions etc. rely on being able to define "site" or "party", by the way.  I don't see how to escape that, as many have pointed out, since it's within a 'party' that information flows, and so on.)


Informally, we allow sites only to record what they do and learn *directly* about the interaction between themselves and the user. 

The formal rule is this:

When DNT is on (1):
Data records that both identify, or could identify, a single USER, and also identify, or could identify, a single SITE (that is part of a Party):
* MUST NOT identify, or be capable of identifying, any other Party, or any site that is part of another Party;
* MUST be derived only from transactions directly between the identified Party and the user, possibly combined with publicly available data;
* MUST be available/accessible only to/by the identified Party;
* MUST NOT contain user-specific non-public information derived or passed, directly or indirectly, from any other Party.

If the data is held by another party on behalf of the identified party, that holding party MUST have no rights to use the data.

Records derived when DNT is on (1) MUST be held separately from other data derived when DNT is not on (1).
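
To make the four MUST/MUST NOT clauses above concrete, here is a minimal sketch in Python of how a site might check one of its own DNT:1 records against them. The record model and every name in it (DataRecord, referenced_parties, usable_by, and so on) are my assumptions for illustration only; nothing here is proposed specification text, and it presumes that "Party" can already be resolved to a simple identifier.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class DataRecord:
    """Hypothetical model of one stored record; all field names are illustrative."""
    user_identifiable: bool               # identifies, or could identify, a single user
    identified_party: str                 # the one Party the record is about
    referenced_parties: FrozenSet[str]    # Parties the record identifies or could identify
    derived_from_parties: FrozenSet[str]  # Parties whose transactions with the user fed it
    non_public_data_from: FrozenSet[str]  # Parties that supplied user-specific non-public data
    usable_by: FrozenSet[str]             # Parties with rights to access or use the record


def tunnel_vision_violations(record: DataRecord) -> List[str]:
    """List the clauses a DNT:1 record breaks; an empty list means none were found."""
    if not record.user_identifiable:
        return []  # the rule only concerns user-identifiable records
    own = {record.identified_party}
    violations = []
    if record.referenced_parties - own:
        violations.append("identifies another Party")
    if record.derived_from_parties - own:
        violations.append("derived from another Party's transactions")
    if record.usable_by - own:
        violations.append("usable by another Party")
    if record.non_public_data_from - own:
        violations.append("contains non-public data from another Party")
    return violations
```

A holding party that stores the data on behalf of the identified Party would simply not appear in usable_by, which is how this sketch models "no rights to use the data".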


not needed:

Outsourcing exception: not needed, it's part of the rule in the first place.
1st-party exception: not needed: all sites/parties are allowed to remember the user's interactions with them.
Unidentifiable data exception: not needed, as the definition here only concerns user-identifiable data in the first place (which can probably be true for all rule sets)
Operational exceptions:
  frequency capping, story-boarding: not needed; the ad site is permitted to remember what IT served YOU, just not a lot of why (which 1st party you were on, etc.)
  financial logging: separate un-identified records can be kept on the number of impressions on a 1st-party site (why is this not true for all proposals?)
  3rd party auditing: again, is it necessary to keep a record that identifies a specific user?

potentially needed:

Operational exceptions:
  security/fraud: an exception may be needed here, especially if cross-site fraud is to be detected
  research/market-analytics: we don't have a current formulation, and the title is broad enough to allow almost anything, so I can't tell
  product improvement: this is an issue, again with a serious risk of slippery slope
  debugging: yes, an exception may be needed for debugging
Legal exception: tracking to the extent required by law

Comments on TUNNEL-VISION 

If a user runs sometimes with DNT:0 and sometimes DNT:1, they will end up with two records at sites, one with a lot of other-site data, and the second record with tunnel-vision.  Correlation by the site would enable merging these; this is the weakest aspect of this strawman, IMHO.  Under the alternative 'cross-site' formulation, I think each site would keep N+1 records (1 for when DNT is off, and N for the number of 1st party sites 'seen' by this 3rd party for this user).
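
The record counts in the paragraph above can be illustrated with a small sketch. It assumes, purely for illustration, that a 3rd party keys its per-user storage either by DNT state (tunnel-vision) or by the 1st-party site seen while DNT is on (cross-site); the function names and the example hostnames are hypothetical.

```python
from typing import Iterable, Optional, Set, Tuple

# (1st-party site the user was on, whether the request carried DNT:1)
Visit = Tuple[str, bool]


def tunnel_vision_keys(visits: Iterable[Visit]) -> Set[str]:
    """One record per DNT state: the 1st-party site is not part of the key."""
    return {"dnt-on" if dnt_on else "dnt-off" for _, dnt_on in visits}


def cross_site_keys(visits: Iterable[Visit]) -> Set[Optional[str]]:
    """One shared record (None) for DNT off, plus one per 1st-party site seen with DNT on."""
    return {site if dnt_on else None for site, dnt_on in visits}


visits = [
    ("news.example", False),
    ("shop.example", True),
    ("blog.example", True),
    ("news.example", True),
]
tunnel_count = len(tunnel_vision_keys(visits))  # the two records described above
cross_count = len(cross_site_keys(visits))      # N + 1, with N = 3 sites seen under DNT:1
```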

Frequency capping and storyboarding by advertisers are now permitted; you ARE allowed to remember what ad you showed this (anonymous) user, since that was *your* transaction.  You're limited in remembering only site-generic 'why' -- you cannot remember 'they visited Sears and so I showed a dishwasher advert'.  
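
As a rough sketch of what that permits, an ad server could cap frequency by counting only its own transactions with the user, keyed by user and ad, and discarding the 1st-party site at the point of recording. The class and parameter names below are hypothetical, chosen only to illustrate the idea.

```python
from collections import Counter
from typing import Tuple


class FrequencyCapper:
    """Counts impressions per (user, ad) without remembering where they were shown."""

    def __init__(self, cap: int) -> None:
        self.cap = cap
        self._served: Counter = Counter()  # keyed by (user_id, ad_id) only

    def may_serve(self, user_id: str, ad_id: str) -> bool:
        """True while this user has seen this ad fewer than `cap` times."""
        return self._served[(user_id, ad_id)] < self.cap

    def record_impression(self, user_id: str, ad_id: str, first_party_site: str) -> None:
        """Remember what was served to whom; the 1st-party site is deliberately dropped."""
        del first_party_site  # never stored, so the record cannot identify another site
        self._served[(user_id, ad_id)] += 1

    def stored_keys(self) -> Tuple[Tuple[str, str], ...]:
        """Expose the stored keys so the absence of site data can be inspected."""
        return tuple(self._served.keys())
```

The point of the design is that the 'why' (for example, which 1st-party site the user was on) never reaches storage, so there is nothing to strip out later.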

If the user starts interacting with *you*, you can remember that also; we don't need language to make this an exception, or 'promotion' from 3rd to 1st party.

Redirection services can remember basically only that the user was active on the web, since everything else they know (the original URL, the redirect) either identifies or could be used to identify another site.

The attraction of this rule is that many fewer exceptions are needed.  The downside of this formulation is that it relies on sites not to re-correlate the records, though there is still a lot of data that cannot be recorded.

David Singer
Multimedia and Software Standards, Apple Inc.
Received on Sunday, 29 January 2012 16:15:55 UTC
