Re: cross-site tracking and what it means

From: David Wainberg <dwainberg@appnexus.com>
Date: Thu, 19 Jan 2012 10:26:18 -0500
Message-ID: <4F18361A.7090800@appnexus.com>
To: David Singer <singer@apple.com>
CC: "public-tracking@w3.org (public-tracking@w3.org)" <public-tracking@w3.org>
Thanks, David, for giving this thoughtful consideration. Kevin's already 
provided a helpful response, but I'll throw in my two cents, as well.

On 1/18/12 8:01 PM, David Singer wrote:
> As I understand it, you're saying that
> * the sites I visit can remember anything about the nature and content of the visits I make to them (currently described as 1st party)
Yes, and I think that accurately describes the direction the group has 
been headed. The data is not cross-site (assuming it is not combined 
with data from other sites) so is not restricted. In this case, the 
cross-site model aligns well with the simplest 1st party case.
> * the sites that those sites 'pull in' (3rd parties, in our current terms) can remember
>    + NOT ONLY the fact that I pulled content from them, and that it was me
>    + BUT ALSO that it was because of visits to various other, ("1st party") sites ('he visited cnn.com and we showed him a book ad; bbc.com and we showed a soap ad')
> As far as I can tell, you seem to propose that the 3rd parties can collect all the same data as today, with the sole exception that the records have an extra tag on them -- whether they were collected under DNT or not -- and that the records collected under DNT have to be segregated and not correlated with the others.
I think you're describing cross-site data collection. If data collected 
from cnn.com is used to show an ad on bbc.com, that's cross-site. 
However, if data collected on cnn.com is used solely for serving ads on 
cnn.com, it is not cross-site. This is similar to the discussion we've 
already had about outsourced 3rd parties having 1st party rights under 
certain circumstances. But to me it's clearer because it focuses on the 
data rather than on the relationships between the parties.
> My problems are
> *  this is a usage restriction which is easily (accidentally or deliberately) dropped. The correlation and aggregation could happen at any time.
Same problem, essentially, as with the outsourcing exception we've 
already discussed. We can lay out some guidelines for how this should be 
done, but at some point we have no choice but to trust people to obey the 
rules. The good actors will do their best to comply. The bad actors will 
find ways to break the rules regardless of what we put in this standard.
> *  I believe that 3rd parties remembering which 1st parties I chose to visit is, prima facie, cross-site, and should be excluded, not permitted.
We haven't defined "tracking" or "cross-site tracking," but to me the 
definition would include some notion of the retention and compilation or 
correlation of the data. As we've discussed related to an outsourcing 
exception, if proper measures are taken to ensure data collected on site 
A is not compiled/correlated/used with data from site B and is only used 
for site A, then it's not cross-site.
> *  this is very close to a previous idea, that DNT didn't control tracking at all, just the presentation of behavioral advertising; the same database was being built, just the symptoms hidden from the users.
This is not what I intended. Again, we haven't defined tracking, but 
it's probably fair to assume that it includes collection and not just 
use. But we've suffered from the same problem in this group; we've 
focused primarily on usage and not collection. What I mean by that is 
that the present paradigm is to restrict all collection and use, and 
then to create a series of usage exceptions, without, as far as I 
understand, considering the actual nature of the collection and 
retention of data. I believe that focusing on the cross-site notion will 
direct us more toward the actual data collection and retention, 
rather than just the use.
> Now, I may have misunderstood.  But if I haven't, this doesn't address my concern as a consumer: I do not want organizations I did not choose to interact with, and whose very identity is usually hidden from me, building databases about me. That's tracking.  I don't think this meets "treat me as someone about whom you know nothing and remember nothing".
I don't think DNT means "treat me as someone about whom you know nothing 
and remember nothing." I'm fairly certain there is widespread agreement 
that all parties will be able to remember something for some purposes 
under DNT.
> If we were to say that *every* site, under DNT must not remember anything about my interaction with any other site than itself (and that rules out 3rd parties keeping records that identify the 1st party, as well), that *might* get closer.  Now the advertising site can do frequency capping (it remembers what ads it previously showed me) but not behavioral tracking (it does not remember I visited CNN, BBC and Amazon, and does not remember what I read or bought on those sites).  But this needs a lot of working through, and I am not hopeful it actually comes out simpler than the 1st/3rd distinction.
This raises very interesting questions about the definition of tracking. 
Note that it focuses on the collection and retention of certain types of 
data, rather than on the uses.
> On Jan 17, 2012, at 8:22 , David Wainberg wrote:
Received on Thursday, 19 January 2012 15:26:43 UTC
