RE: cross-site tracking and what it means

That's not exactly what I was suggesting.  I look forward to next week when we can explore these options in person with a whiteboard.  Hopefully we can make a lot of progress.

What I am proposing is that if a user has DNT turned on when visiting a given website, both 1st and 3rd parties are allowed to record the visitor's usage on that site, as long as that data is only connected, stored, and used (etc.) in the context of that website.  So, the 3rd party would know that you visited a 1st party, but would not know that you had ever visited another 1st party site.  It is not simply another tag on the data; they must actually store the data under separate visitor IDs so that they cannot tell you are the same visitor, i.e. they CANNOT stitch your profile together.

Example
* A person visits Site A and Site B with DNT turned ON.  
* Both Site A and Site B call out to Example3rdParty.com.
* When the person visits Site A, Example3rdParty.com assigns them a visitorID of 101.  All profile data that is collected on Site A for this visitor is attached to visitorID 101.
* When the person visits site B, Example3rdParty.com assigns them a visitorID of 102 and all data it collects on that site is only associated with visitorID 102.
* Example3rdParty.com does not know that visitorID 102 is the same person as visitorID 101 (at least not on the server side) and so cannot aggregate the data at a later time.
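
To make the example above a bit more concrete, here is a rough sketch in Python of what I mean by siloed storage.  It is only an illustration under some assumptions: the class name SiloedThirdParty, its methods, and the idea that the browser hands back a separate per-site identifier (e.g. a cookie scoped to the 1st party) are all hypothetical, not a description of any real system.

```python
import itertools
from collections import defaultdict


class SiloedThirdParty:
    """Hypothetical 3rd-party collector that keeps one silo per 1st-party site."""

    def __init__(self, first_id=101):
        # Visitor IDs are handed out sequentially, starting at 101 as in the example.
        self._next_id = itertools.count(first_id)
        # site -> {visitor_id -> list of events}. No index spans more than one site.
        self._silos = defaultdict(dict)

    def record(self, site, site_cookie, event):
        """Store an event for a visit to `site`.

        `site_cookie` is the visitor ID previously issued for this site,
        or None on a first visit (an assumption of this sketch).
        Returns the visitor ID to use for this site from now on.
        """
        silo = self._silos[site]
        if site_cookie not in silo:
            # Unknown on this site (even if known elsewhere): issue a fresh ID.
            site_cookie = next(self._next_id)
            silo[site_cookie] = []
        silo[site_cookie].append(event)
        return site_cookie

    def profile(self, site, visitor_id):
        """Return the events recorded for a visitor ID within one site's silo."""
        return list(self._silos[site].get(visitor_id, []))


tracker = SiloedThirdParty()
id_a = tracker.record("site-a.example", None, "viewed /home")
tracker.record("site-a.example", id_a, "viewed /sports")
id_b = tracker.record("site-b.example", None, "viewed /news")
```

Here the same person ends up with one ID on Site A and a different one on Site B, and looking up the Site A ID inside the Site B silo returns nothing.  Nothing on the server links the two IDs, which is the property I am relying on.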

This is essentially how 1st Party Outsourcing behaves under our current definitions.  To address your 3 specified concerns:

>My problems are
>*  this is a usage restriction which is easily (accidentally or deliberately) dropped. The correlation and aggregation could happen at any time.

This is a valid concern, but I do not think it's exacerbated by this approach.  If the data is correctly siloed, it should not be possible to aggregate it correctly by accident.  And I think all approaches are susceptible to deceptive behavior.

> *  I believe that 3rd parties remembering which 1st parties I chose to visit is, prima facie, cross-site, and should be excluded, not permitted.

This does allow a 3rd party to know that you visited ONE 1st party, but not that you visited multiple 1st parties.  And since they can only use that data ON that 1st party site, it does not seem like cross-site tracking to me.  Again, see 1st Party Outsourcing.

> *  this is very close to a previous idea, that DNT didn't control tracking at all, just the presentation of behavioral advertising; the same database was being built, just the symptoms hidden from the users.

I don't think this is accurate.  Collection, storage and usage would be regulated.  The database would not be the same.  It may have similar raw data, but it would be missing all of the aggregated, correlated data.

Hopefully that makes sense. 





-----Original Message-----
From: David Singer [mailto:singer@apple.com] 
Sent: Wednesday, January 18, 2012 6:01 PM
To: public-tracking@w3.org (public-tracking@w3.org)
Subject: cross-site tracking and what it means

David, Kevin, thanks

I read through this and some other background material.

I share the unease about the difficulty of defining 1st and 3rd parties, and would love to find a way to eliminate that distinction and apply uniform rules.  But, if I understand it correctly, what you and Kevin are saying is not, I think, satisfactory.  I may misunderstand, though, so let me work through it, in case I am off base.

As I understand it, you're saying that 
* the sites I visit can remember anything about the nature and content of the visits I make to them (currently described as 1st party)
* the sites that those sites 'pull in' (3rd parties, in our current terms) can remember 
  + NOT ONLY the fact that I pulled content from them, and that it was me
  + BUT ALSO that it was because of visits to various other ("1st party") sites ('he visited cnn.com and we showed him a book ad; bbc.com and we showed a soap ad')

As far as I can tell, you seem to propose that the 3rd parties can collect all the same data as today, with the sole exception that the records have an extra tag on them -- whether they were collected under DNT or not -- and that the records collected under DNT have to be segregated and not correlated with the others.  

My problems are
*  this is a usage restriction which is easily (accidentally or deliberately) dropped. The correlation and aggregation could happen at any time.
*  I believe that 3rd parties remembering which 1st parties I chose to visit is, prima facie, cross-site, and should be excluded, not permitted.
*  this is very close to a previous idea, that DNT didn't control tracking at all, just the presentation of behavioral advertising; the same database was being built, just the symptoms hidden from the users.

Now, I may have misunderstood.  But if I haven't, this doesn't address my concern as a consumer: I do not want organizations I did not choose to interact with, and whose very identity is usually hidden from me, building databases about me. That's tracking.  I don't think this meets "treat me as someone about whom you know nothing and remember nothing".

If we were to say that *every* site, under DNT, must not remember anything about my interaction with any site other than itself (and that rules out 3rd parties keeping records that identify the 1st party, as well), that *might* get closer.  Now the advertising site can do frequency capping (it remembers what ads it previously showed me) but not behavioral tracking (it does not remember I visited CNN, BBC and Amazon, and does not remember what I read or bought on those sites).  But this needs a lot of working through, and I am not hopeful it actually comes out simpler than the 1st/3rd distinction.

On Jan 17, 2012, at 8:22 , David Wainberg wrote:

> Kevin circulated some great materials and discussion on this back in December: http://lists.w3.org/Archives/Public/public-tracking/2011Dec/0051.html and http://lists.w3.org/Archives/Public/public-tracking/2011Dec/0127.html.
> 
> But I'm happy to take a stab at explaining how I see it.
> 
> In defining 1st vs 3rd, and saying DNT doesn't, for the most part, apply to 1st parties, are we saying that 1st parties have an exception to engage in [cross-site] tracking, or are we saying 1st party data collection, by definition, is not [cross-site] tracking? There seems to be, if not consensus, at least widespread agreement that the concern of this standard (the "Do Not" of DNT) is something along the lines of the collection and accumulation of data about internet users' web browsing history across (unrelated | unaffiliated | non-commonly branded | ??) websites. I don't think we mean that 1st parties are free to engage in [cross-site] tracking, but rather that once it's cross-site, it's no longer 1st party. There may be parties who have consent to track across sites by virtue of their 1st party relationship with the user, but is there such a thing as 1st party cross-site tracking? Let's assume we can achieve a definition of cross-site tracking: do you imagine 1st and 3rd parties would be treated differently under the standard? I don't imagine so, though 1st parties will have different opportunities for acquiring users' consent.
> 
> One might then think that the 1st/3rd party distinction and "cross-site" are equivalent. But I would argue they're not, for at least the following. First, defining cross-site tracking is closer to the problem we're trying to solve, and that's generally a good thing. By tailoring our definitions to the actual problems we are trying to solve, we reduce the risk of being overinclusive, creating ambiguity, or creating unintended consequences.
> 
> Additionally, although we will still need to define cross-site tracking, I think that's an easier problem to solve and will be easier for all parties to implement. Parties can be lots of things. It's impossible to account for all the different relationships between parties and users, and parties and parties, and so on. Cross-site tracking data is a much more constrained set, so will be that much easier to put a definition around.
> 
> By taking the cross-site approach, DNT becomes as simple as:
> 
> 1. Cross-site tracking = X
> 2. If DNT == 1, X may not be done, except:
>    a. with consent; or
>    b. for these purposes: [...]
> 
> Some of the benefits:
> - Relies simply on a clear definition of the data collection and use practices DNT is concerned with, rather than a multi-step process of determining party status and then covered collection and use.
> - Removes the step of determining 1st vs 3rd party status in any given circumstance, and then possibly having separate compliance paths for each.
> - Saves us from defining 1st vs 3rd parties, and thus eliminates having to deal with edge cases like widgets and URL shorteners.
> - Solves the 3rd party as agent problem: if it's not cross-site, it's not covered.
> 
> 
> 
> On 1/13/12 5:41 PM, David Singer wrote:
>> In reading a separate thread, I realized that there is a potential issue here over DNT:0.
>> 
>> A little while back we discussed whether the UA should send a DNT header to the first party.  A number of us argued that it should, even if the first party is exempt: because the first party may care that its third parties are being asked not to track - it might ask for payment in consequence, for example.
>> 
>> This argument relies on the assumption that DNT is a single 'big switch', either on or off, but the discussion around DNT:0 reveals that people think it may be OK for the UA to send DNT:1 to some sites, and DNT:0 to others.
>> 
>> So what, then, does the first party get?  DNT:1 if any third party is getting DNT:1, else DNT:0 if all are getting DNT:0?  An average of the DNT values :-) DNT:0.7 ??!
>> 
>> Am I, as a UA, allowed to mix non-DNT requests into the mix?
>> 
>> 
>> David Singer
>> Multimedia and Software Standards, Apple Inc.
>> 
>> 

David Singer
Multimedia and Software Standards, Apple Inc.

Received on Thursday, 19 January 2012 06:15:50 UTC