Re: [Issue-5] [Action-77] Defining Tunnel-Vision 'Do Not (Cross-Site) Track' from Bryan Sullivan on 2012-02-04 (public-tracking@w3.org from February 2012)

From: Bryan Sullivan <blsaws@gmail.com>
Date: Sat, 04 Feb 2012 11:12:58 -0800
To: David Singer <singer@apple.com>, "public-tracking@w3.org (public-tracking@w3.org)" <public-tracking@w3.org>
Message-ID: <CB52BBE7.10C8D%blsaws@gmail.com>

David,

"envelope of the organization within which data flows" does capture the
intent of "party", I believe. To the extent that users are aware of it
(allowing for that measure, although I still have reservations about
whether it is objective), it is what users intend when they agree to share
their data with a "service" offered by a "service provider". The concept of
a unique service provider IMO defines the envelope that the user perceives.

For example, AT&T is clearly the service provider that offers services
through yellowpages.com (due to clearly visible branding at least).
Similarly, while att.net is "Powered by Yahoo!" (and redirects to a yahoo
subdomain, http://att.my.yahoo.com/), AT&T is clearly the service provider
through branding, privacy policy, T&C/AUP, etc. Yahoo is an
infrastructure provider and outsourced provider of ads, with user
preferences managed under informed consent through AdChoices. These
relationships are all very easy to see and understand for the average user
IMO: there is a clear relationship between att.net and Yahoo as outsourced
Ad provider, and user choices are clearly manageable through AdChoices. As
long as consenting users can trust that their personal information flows
no further, I believe the average user will be entirely comfortable with
the flow between att.net and Yahoo.

Within that assumption, I see no reason to restrict use of the referrer
info, since it is essential for BAU.

On 2/3/12 2:02 AM, "David Singer" <singer@apple.com> wrote:

>Roy
>
>thanks for the thoughtful comments.  A few replies inline.
>
>On Feb 3, 2012, at 2:25 , Roy T. Fielding wrote:
>
>> On Jan 29, 2012, at 8:15 AM, David Singer wrote:
>> 
>> 
>> I disagree with the way that this solution is being described.
>> I don't see why you've added Party all over it.
>
>I think I explained in the introduction;
>
>>> (All these definitions etc. rely on being able to define "site" or
>>>"party", by the way.  I don't see how to escape that, as many have
>>>pointed out, since it's within a 'party' that information flows, and so
>>>on.)
>
>It is a term to describe the envelope of the organization within which
>data flows.  I assume you are not proposing that images.example.com and
>store.example.com should be required to keep separate data, as they are
>separate 'sites'.  You seem to prefer 'service'; I don't mind what word
>is used:
>> Likewise, there is no need to talk about Party.  There is a service
>> to which the user provided data.  The user has given consent to that
>> service to make use of that data.  Party is irrelevant.  Owner is
>> irrelevant.  Operator of the service is only relevant to the extent
>> that they are the ones responsible for adhering to the constraints.
>
>
>> In particular, the notion that the data collected must not identify
>> any other site, in general, won't work very well because referrals
>> are essential and it is very difficult to control data that the
>> user might enter in a text dialog.
>
>As I say, these rules permit that you record anything that you directly
>told the user or the user directly told you.  Referer records are
>troubling, I agree.
>
>>  I think we should specifically
>> constrain referral data alone (as provided in URI, Referer, or Origin)
>> and have the constraint be about operational use limits rather than
>> collection, and that retention be limited to operational needs.
>
>OK.  I agree that 'referer' needs consideration.
>
>> 
>>> If the data is held by another party on behalf of the identified
>>>party, that holding party MUST have no rights to use the data.
>> 
>> Too much partying.
>> 
>> Data collection may be contracted to some entity other than
>> the site operator (just as site operation may be contracted to some
>> entity other than the domain owner).  Such outsourced operations
>> are considered to be the *same site* if the data collected is siloed
>> to that site, is controlled by the same entity that controls the site,
>> and the data processor acting as that site's agent is contractually
>> bound to do all the things (as previously discussed under outsourcing)
>> that makes them a data processor and not a data controller.
>
>I think you're trying to read this in the context of the existing text,
>which has a 1st/3rd party distinction, whereas it is a strawman
>alternative.  I carefully wrote this so that out-sourcing is built-in.
>You may be right, that using EU terms would be better.
>
>> 
>>> Records derived when DNT is on (1), MUST be held separately from other
>>>data derived when DNT is not on (1).
>> 
>> That's not possible in general.
>> 
>> I'm not sure what that is trying to accomplish.  If it is just to
>> prevent re-identification, then simply "MUST NOT be combined with
>> records collected when DNT is not enabled" should be sufficient.
>
>That's a re-formulation of what I wrote, isn't it?
>
>> Or perhaps, reading on, what you mean is that "a site that retains
>> user-specific data MUST distinguish users with DNT enabled from
>> users with DNT not enabled, such that they are considered different
>> users and their associated records are never combined."  Note that
>> this will have an effect on user experience, though I think it
>> is a reasonable one.
>
>Yes.
>
>> Note that this is true, in general, of all the suggested definitions
>> for tracking.  We have no ability to prevent bad actors.  We can only
>> state the constraints and hope that regulators and journalists deal
>> with those that fail to adhere to the constraints.  If one of the
>> constraints is that a site MUST NOT correlate DNT-on data with DNT-off
>> data, then that is just as effective as a constraint that says a
>> third-party cannot collect the same data.  An evil party will,
>>regardless.
>
>
>Sure.  Since this is all about what happens in a database invisible to
>you, there is a certain amount of trust involved.  The evil can and will
>say "we comply" and keep building the database.  We can't protect
>ourselves from people who break the rules in secret.
>
>The concern is about organizations not intending to be evil, however, and
>the risk level.
>
>If nothing is recorded, then there can be no privacy leak of records,
>because there are no records.
>If per-user records are retained, but the records retained under DNT are
>'anonymous' and have no obvious link to the non-DNT records, then the
>risk is present but low.
>If per-user records are retained, but the DNT-on and DNT-off records are
>merely kept separately but easily re-linked (e.g. by correlating IP
>addresses), the risk rises.
>...and so on.
>
>I'm pointing out that under this formulation, I think the third case is
>where we land.
>
>David Singer
>Multimedia and Software Standards, Apple Inc.
>
>
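
As a rough illustration of the rule discussed in the quoted text above (a site "MUST distinguish users with DNT enabled from users with DNT not enabled, such that they are considered different users and their associated records are never combined"), here is a minimal sketch. The class name `SiloedStore`, its methods, and the use of a plain dict of request headers are all hypothetical and only for illustration; the sketch assumes the preference arrives as a `DNT: 1` request header. It does not address the re-linking risk (e.g. by correlating IP addresses) that David raises.

```python
from collections import defaultdict


class SiloedStore:
    """Hypothetical store that keeps DNT-on and DNT-off records in separate silos."""

    def __init__(self):
        # One silo per DNT state; the same user_id is a different user in each.
        self._silos = {True: defaultdict(list), False: defaultdict(list)}

    @staticmethod
    def _dnt_on(headers):
        # Assumption: the preference is expressed as the request header "DNT: 1".
        return headers.get("DNT") == "1"

    def record(self, user_id, headers, event):
        # Store the event only in the silo matching the request's DNT state.
        self._silos[self._dnt_on(headers)][user_id].append(event)

    def history(self, user_id, headers):
        # Read only from the matching silo; records are never combined.
        return list(self._silos[self._dnt_on(headers)].get(user_id, []))
```

With this shape, a user who turns DNT on is seen as a new user with an empty history, which is the user-experience effect Roy notes.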
Received on Saturday, 4 February 2012 19:13:27 UTC