Re: cross-site tracking and what it means

From: Jonathan Mayer <jmayer@stanford.edu>
Date: Fri, 20 Jan 2012 15:37:47 -0600
Cc: David Singer <singer@apple.com>, "public-tracking@w3.org (public-tracking@w3.org)" <public-tracking@w3.org>
Message-Id: <187A72A9-2249-400D-BD94-CF593E81CF50@stanford.edu>
To: Shane Wiley <wileys@yahoo-inc.com>

Responses below.

Jonathan

On Jan 19, 2012, at 12:21 AM, Shane Wiley wrote:

> Jonathan,
> 
> I agree there is more discussion needed within each area I highlighted.  I should have been more explicit that my goal was to lay out the "framework" that it feels we have arrived at rather than to document all of the details in this brief pass.  If we agree on the basic framework, we can then narrow in on the details within each element to drive to consensus (which I believe the group has successfully done for the most part at the framework and mid-detail levels).
> 
> More detailed responses below in []...
> 
> - Shane
> 
> -----Original Message-----
> From: Jonathan Mayer [mailto:jmayer@stanford.edu] 
> Sent: Wednesday, January 18, 2012 5:56 PM
> To: Shane Wiley
> Cc: David Singer; public-tracking@w3.org (public-tracking@w3.org)
> Subject: Re: cross-site tracking and what it means
> 
> Some clarifications below.
> 
> On Jan 18, 2012, at 5:12 PM, Shane Wiley wrote:
> 
>> David,
>> 
>> We may be speaking past one another as I believe several other email chains highlighted consensus on the following points (I'm going to use 1st party / 3rd party language but could try to retool this as site owner and non-site owner <aka - I believe we have to define this no matter the language we use>):
>> 
>> - 1st party: Able to ignore DNT for site specific activities
>> - 1st party: When DNT:1 is present, is unable to share information about that user with 3rd parties (service provider exception)
> 
> As I recall, the consensus in Santa Clara gave first parties a bit more freedom by not distinguishing between frontend and backend data sharing.  That is, what a first party may provide to a third party on the backend is coextensive with what a third party may retain/use from collection on the frontend.  The outsourcing exception is analytically independent of whether the third party collected data on the frontend or the backend.
> 
> [I didn't make a distinction between frontend/backend on the 3rd party sharing prohibition with DNT:1 - and would suggest it's the same outcome in either case...prohibited.  I would have thought you'd agree with that position…?]

I think you have a narrower conception of the term "sharing" than I do, so let me try shifting terminology.  My read of the group was that the standard would allow a first party to intentionally provide information to a third party only if the third party is allowed to collect that information directly from the browser itself.  If there's a view that these use cases are unrealistic given the restrictions we impose on third parties, or that condoning intentional (especially backend) information sharing is unwise, then it would make sense to establish a stricter limitation.

>> - 3rd party: When DNT:1 is present, is unable to profile user activities on this site or leverage previously profiled information (aka - "cross site data")  [may still collect data for supported Operational Purposes]
> 
> The consensus analytical framework since at least Santa Clara has been a prohibition on all third-party information practices, followed by a series of narrowly-defined exceptions.  We have a list of candidate exceptions, and there has been some significant progress on a few of those (especially the outsourcing exception).  Working group conversations have strongly suggested a consensus that there will not be an exception for user profiling and use of a previously-generated user profile.  There is certainly not consensus for a high-level "operational purposes" exception as of yet.
> 
> [I'm not suggesting an exception for profiling - not sure how you read that - so we're agreed there.  Also, while I say "operational purpose exception" and you've said "narrowly-defined exceptions", we're saying the same thing.]

I'm not suggesting there was ever a chance of a profiling exception.  Rather, I was clarifying the analytical framework.  Your comment could be (mis)read to suggest a consensus that Do Not Track = Do Not Profile.

> Also, your use of "cross site data" here reifies the term's ambiguity.  You take it to mean something about profiling activity, which is - as I've read the thread - yet another interpretation on top of several others.
> 
> [I disagree and believe "cross-site" provides less ambiguity here versus a more generic use of "tracking".  I believe we'll continue to agree to disagree on this point.  This text was purposely brief to highlight framework elements - not the details - creating an obvious window to nit on the details.  :-) ]
> 
>> - 3rd party as a Service Provider:  Able to ignore DNT if service is for a 1st party and 3rd party has no independent rights to use this information across sites (legal and technical protections)
> 
> There was consensus at Santa Clara that the outsourcing exception requires both legal and technical precautions.  There was not consensus about what those precautions are against, either 1) collecting data that could be correlated across first parties, or 2) commingling data across first parties.  David's draft text proposes the former rule (which de facto encapsulates the latter rule), and I am in support.
> 
> [I also provided draft text here out of the Cambridge f2f (as did you).  As long as data is managed in such a manner as to disallow cross-site commingling where DNT:1 exists, then the intended outcome is met (contractually/public commitments/technical safeguards).  Prescribing specific technologies to achieve the outcome is short-sighted, so I would caution against moving in that direction.]

The "intended outcome" for me (and many others) includes providing users with control over a website having access to a sizable proportion of their browsing history.  Tom and I were explicit on this point in our non-normative discussion of the first party vs. third party distinction.  The approach you propose would allow a third party to continue collecting and retaining such data.  And unnecessarily, too - the third party would be barred from using the data.

I would note that this is all independent of whether we require a specific technical precaution.  While I maintain that the same-origin policy provides a privacy primitive that neatly maps to the problem, I was willing to compromise in Santa Clara.

>> - 1st party in a 3rd party context (Widgets):  1st party is present on a 3rd party site and would act as a 3rd party unless the user "meaningfully interacts" with a branded component which at that point they fall under 1st party rules (need to vet the use cases here in more detail but I believe the group has made good progress here)
> 
> I believe there is consensus on how the most common widget use cases should turn out.  There appeared to be consensus on the list and in calls to apply a user expectations test to borderline cases, but that consensus may no longer exist.
> 
> [I believe there is agreement on the common Widget outline and agree more work will be necessary to enumerate the details here (use cases, examples, branding requirements, linking requirements, etc.) versus relying on a generic "user expectation" standard, which is highly subjective and an impossible target to hit, as many parties can argue a different end-point is "expected by users" (ask 3 sources and receive 3 different answers).]

I have no doubt that there will be lengthy discussion next week of whether a user expectations test is workable.

>> Are these general rules not agreed to by everyone?  I thought we had general consensus on this framework and were working through the edges of each of these and the associated definitions.
>> 
>> - Shane
>> 
>> -----Original Message-----
>> From: David Singer [mailto:singer@apple.com] 
>> Sent: Wednesday, January 18, 2012 5:01 PM
>> To: public-tracking@w3.org (public-tracking@w3.org)
>> Subject: cross-site tracking and what it means
>> 
>> David, Kevin, thanks
>> 
>> I read through this and some other background material.
>> 
>> I share the unease about the difficulty of defining 1st and 3rd parties, and would love to find a way to eliminate that distinction and apply uniform rules.  But, if I understand it correctly, what you and Kevin are saying is not, I think, satisfactory.  But I may misunderstand.  Let me work through it, in case I am off base.
>> 
>> As I understand it, you're saying that 
>> * the sites I visit can remember anything about the nature and content of the visits I make to them (currently described as 1st party)
>> * the sites that those sites 'pull in' (3rd parties, in our current terms) can remember 
>> + NOT ONLY the fact that I pulled content from them, and that it was me
>> + BUT ALSO that it was because of visits to various other ("1st party") sites ('he visited cnn.com and we showed him a book ad; bbc.com and we showed a soap ad')
>> 
>> As far as I can tell, you seem to propose that the 3rd parties can collect all the same data as today, with the sole exception that the records have an extra tag on them -- whether they were collected under DNT or not -- and that the records collected under DNT have to be segregated and not correlated with the others.  
>> 
>> My problems are
>> *  this is a usage restriction which is easily (accidentally or deliberately) dropped. The correlation and aggregation could happen at any time.
>> *  I believe that 3rd parties remembering which 1st parties I chose to visit is, prima facie, cross-site, and should be excluded, not permitted.
>> *  this is very close to a previous idea, that DNT didn't control tracking at all, just the presentation of behavioral advertising; the same database was being built, just the symptoms hidden from the users.
>> 
>> Now, I may have misunderstood.  But if I haven't, this doesn't address my concern as a consumer: I do not want organizations I did not choose to interact with, and whose very identity is usually hidden from me, building databases about me. That's tracking.  I don't think this meets "treat me as someone about whom you know nothing and remember nothing".
>> 
>> If we were to say that *every* site, under DNT, must not remember anything about my interaction with any other site than itself (and that rules out 3rd parties keeping records that identify the 1st party, as well), that *might* get closer.  Now the advertising site can do frequency capping (it remembers what ads it previously showed me) but not behavioral tracking (it does not remember I visited CNN, BBC and Amazon, and does not remember what I read or bought on those sites).  But this needs a lot of working through, and I am not hopeful it actually comes out simpler than the 1st/3rd distinction.
>> 
>> On Jan 17, 2012, at 8:22 , David Wainberg wrote:
>> 
>>> Kevin circulated some great materials and discussion on this back in December: http://lists.w3.org/Archives/Public/public-tracking/2011Dec/0051.html and http://lists.w3.org/Archives/Public/public-tracking/2011Dec/0127.html.
>>> 
>>> But I'm happy to take a stab at explaining how I see it.
>>> 
>>> In defining 1st vs 3rd, and saying DNT doesn't, for the most part, apply to 1st parties, are we saying that 1st parties have an exception to engage in [cross-site] tracking, or are we saying 1st party data collection, by definition, is not [cross-site] tracking? There seems to be, if not consensus, at least widespread agreement that the concern of this standard (the "Do Not" of DNT) is something along the lines of the collection and accumulation of data about internet users' web browsing history across (unrelated | unaffiliated | non-commonly branded | ??)  websites. I don't think we mean that 1st parties are free to engage in [cross-site] tracking, but rather that once it's cross-site, it's no longer 1st party. There may be parties who have consent to track across sites by virtue of their 1st party relationship with the user, but is there such a thing as 1st party cross-site tracking? Let's assume we can achieve a definition of cross-site tracking: do you imagine 1st and 3rd parties would be treated differently under the standard? I don't imagine so, though 1st parties will have different opportunities for acquiring users' consent.
>>> 
>>> One might then think that the 1st/3rd party distinction and "cross-site" are equivalent. But I would argue they're not, for at least the following reasons. First, defining cross-site tracking is closer to the problem we're trying to solve, and that's generally a good thing. By tailoring our definitions to the actual problems we are trying to solve, we reduce the risk of being overinclusive, creating ambiguity, or creating unintended consequences.
>>> 
>>> Additionally, although we will still need to define cross-site tracking, I think that's an easier problem to solve and will be easier for all parties to implement. Parties can be lots of things. It's impossible to account for all the different relationships between parties and users, and parties and parties, and so on. Cross-site tracking data is a much more constrained set, so will be that much easier to put a definition around.
>>> 
>>> By taking the cross-site approach, DNT becomes as simple as:
>>> 
>>> 1. Cross-site tracking = X
>>> 2. If DNT == 1, X may not be done, except:
>>>  a. with consent; or
>>>  b. for these purposes: [...]
>>> 
>>> Some of the benefits:
>>> - Relies simply on a clear definition of the data collection and use practices DNT is concerned with, rather than a multi-step process of determining party status and then covered collection and use.
>>> - Removes the step of determining 1st vs 3rd party status in any given circumstance, and then possibly having separate compliance paths for each.
>>> - Saves us from defining 1st vs 3rd parties, and thus eliminates having to deal with edge cases like widgets and URL shorteners.
>>> - Solves the 3rd party as agent problem: if it's not cross-site, it's not covered.
>>> 
>>> 
>>> 
>>> On 1/13/12 5:41 PM, David Singer wrote:
>>>> In reading a separate thread, I realized that there is a potential issue here over DNT:0.
>>>> 
>>>> A little while back we discussed whether the UA should send a DNT header to the first party.  A number of us argued that it should, even if the first party is exempt: because the first party may care that its third parties are being asked not to track - it might ask for payment in consequence, for example.
>>>> 
>>>> This argument relies on the assumption that DNT is a single 'big switch', either on or off, but the discussion around DNT:0 reveals that people think it may be OK for the UA to send DNT:1 to some sites, and DNT:0 to others.
>>>> 
>>>> So what, then, does the first party get?  DNT:1 if any third party is getting DNT:1, else DNT:0 if all are getting DNT:0?  An average of the DNT values :-) DNT:0.7 ??!
>>>> 
>>>> Am I, as a UA, allowed to mix non-DNT requests into the mix?
>>>> 
>>>> 
>>>> David Singer
>>>> Multimedia and Software Standards, Apple Inc.
>>>> 
>>>> 
>> 
>> David Singer
>> Multimedia and Software Standards, Apple Inc.
>> 
>> 
>> 
> 
Received on Friday, 20 January 2012 21:47:35 UTC
