Re: [Issue-5] [Action-77] Defining Tunnel-Vision 'Do Not (Cross-Site) Track' from David Singer on 2012-02-03 (public-tracking@w3.org from February 2012)

From: David Singer <singer@apple.com>
Date: Fri, 03 Feb 2012 10:02:48 +0000
To: "public-tracking@w3.org (public-tracking@w3.org)" <public-tracking@w3.org>
Message-id: <A51F4534-1098-4D96-A596-BB5D08D9932C@apple.com>
Roy

Thanks for the thoughtful comments.  A few replies inline.

On Feb 3, 2012, at 2:25 , Roy T. Fielding wrote:

> On Jan 29, 2012, at 8:15 AM, David Singer wrote:
> 
> 
> I disagree with the way that this solution is being described.
> I don't see why you've added Party all over it.

I think I explained that in the introduction:

>> (All these definitions etc. rely on being able to define "site" or "party", by the way.  I don't see how to escape that, as many have pointed out, since it's within a 'party' that information flows, and so on.)

It is a term to describe the envelope of the organization within which data flows.  I assume you are not proposing that images.example.com and store.example.com should be required to keep separate data, as they are separate 'sites'.  You seem to prefer 'service'; I don't mind what word is used:
> Likewise, there is no need to talk about Party.  There is a service
> to which the user provided data.  The user has given consent to that
> service to make use of that data.  Party is irrelevant.  Owner is
> irrelevant.  Operator of the service is only relevant to the extent
> that they are the ones responsible for adhering to the constraints.


> In particular, the notion that the data collected must not identify
> any other site, in general, won't work very well because referrals
> are essential and it is very difficult to control data that the
> user might enter in a text dialog.

As I say, these rules permit you to record anything that you directly told the user or that the user directly told you.  Referer records are troubling, I agree.

>  I think we should specifically
> constrain referral data alone (as provided in URI, Referer, or Origin)
> and have the constraint be about operational use limits rather than
> collection, and that retention be limited to operational needs.

OK.  I agree that 'referer' needs consideration.

> 
>> If the data is held by another party on behalf of the identified party, that holding party MUST have no rights to use the data.
> 
> Too much partying.
> 
> Data collection may be contracted to some entity other than
> the site operator (just as site operation may be contracted to some
> entity other than the domain owner).  Such outsourced operations
> are considered to be the *same site* if the data collected is siloed
> to that site, is controlled by the same entity that controls the site,
> and the data processor acting as that site's agent is contractually
> bound to do all the things (as previously discussed under outsourcing)
> that makes them a data processor and not a data controller.

I think you're trying to read this in the context of the existing text, which has a 1st/3rd party distinction, whereas it is a strawman alternative.  I carefully wrote this so that outsourcing is built in.  You may be right that using EU terms would be better.

> 
>> Records derived when DNT is on (1), MUST be held separately from other data derived when DNT is not on (1).
> 
> That's not possible in general.
> 
> I'm not sure what that is trying to accomplish.  If it is just to
> prevent re-identification, then simply "MUST NOT be combined with
> records collected when DNT is not enabled" should be sufficient.

That's a re-formulation of what I wrote, isn't it?

> Or perhaps, reading on, what you mean is that "a site that retains
> user-specific data MUST distinguish users with DNT enabled from
> users with DNT not enabled, such that they are considered different
> users and their associated records are never combined."  Note that
> this will have an effect on user experience, though I think it
> is a reasonable one.

Yes.
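
As a rough illustration of that reading (a sketch only: the SiloedStore class, its method names, and the user IDs are hypothetical, and treating only the header value "1" as DNT-on is an assumption), a store could key every record on both the user identifier and the DNT state, so that the same identifier seen with and without DNT is treated as two different users:

```python
from collections import defaultdict


class SiloedStore:
    """Hypothetical per-user record store partitioned by DNT state."""

    def __init__(self):
        # Key is (user_id, dnt_on): the same user_id under a different
        # DNT state is deliberately treated as a different user.
        self._records = defaultdict(list)

    @staticmethod
    def _dnt_on(dnt_header):
        # Assumption: only the header value "1" means DNT is on.
        return dnt_header == "1"

    def record(self, user_id, dnt_header, event):
        self._records[(user_id, self._dnt_on(dnt_header))].append(event)

    def history(self, user_id, dnt_header):
        # Only returns records collected under the same DNT state;
        # there is no method that merges the two partitions.
        key = (user_id, self._dnt_on(dnt_header))
        return list(self._records.get(key, []))


store = SiloedStore()
store.record("u1", "1", "viewed /a")    # request sent with DNT: 1
store.record("u1", None, "viewed /b")   # request sent without DNT
```

This only shows the "considered different users" part.  It does nothing about re-linking the two partitions through other data, such as IP addresses.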

> Note that this is true, in general, of all the suggested definitions
> for tracking.  We have no ability to prevent bad actors.  We can only
> state the constraints and hope that regulators and journalists deal
> with those that fail to adhere to the constraints.  If one of the
> constraints is that a site MUST NOT correlate DNT-on data with DNT-off
> data, then that is just as effective as a constraint that says a 
> third-party cannot collect the same data.  An evil party will, regardless.


Sure.  Since this is all about what happens in a database invisible to you, there is a certain amount of trust involved.  The evil can and will say "we comply" and keep building the database.  We can't protect ourselves from people who break the rules in secret.

The concern, however, is about organizations that do not intend to be evil, and about the level of risk.

If nothing is recorded, then there can be no privacy leak of records, because there are no records.
If per-user records are retained, but the records retained under DNT are 'anonymous' and have no obvious link to the non-DNT records, then the risk is present but low.
If per-user records are retained, and the DNT-on and DNT-off records are merely kept separately but are easily re-linked (e.g. by correlating IP addresses), the risk rises.
…and so on.

My point is that, under this formulation, I think the third case is where we land.

David Singer
Multimedia and Software Standards, Apple Inc.
Received on Friday, 3 February 2012 10:03:29 UTC