Re: cross-site tracking and what it means from Jonathan Robert Mayer on 2012-01-21 (public-tracking@w3.org from January 2012)

From: Jonathan Robert Mayer <jmayer@stanford.edu>
Date: Fri, 20 Jan 2012 18:18:37 -0800 (PST)
To: David Wainberg <dwainberg@appnexus.com>
Cc: Kevin Smith <kevsmith@adobe.com>, David Singer <singer@apple.com>, "public-tracking@w3.org (public-tracking@w3.org)" <public-tracking@w3.org>
Message-Id: <09548BCF-33FE-4406-B86E-34BE219C03BE@stanford.edu>

On Jan 20, 2012, at 5:23 PM, David Wainberg <dwainberg@appnexus.com> wrote:

> 
> 
> On 1/20/12 3:34 PM, Jonathan Mayer wrote:
>> This is the clearest articulation I've seen of what "cross-site tracking" might mean.  Thanks, Kevin.
>> 
>> I would offer three criticisms of the approach.
>> 
>> First, it does nothing to simplify definitions: it requires defining what qualifies for a silo (= party), and it requires defining which silo is applicable in a given context (= first party vs. third party).  In fact, it is trivial to recast the proposal into our current analytical approach: an exception for all data that is siloed per-first party.
> I disagree. I think it requires only defining "cross-site" and "cross-site tracking". Once data collected across sites is combined, it becomes "cross-site". This makes it very simple. I understand there will be some interest in guidelines around the adequate segregation of the data.

You've left out the definition of "site," which in your meaning subsumes the "party" and "first party" components of our current approach.

I think the most productive way forward is to stop having this conversation in the abstract. We've seen how the current analytical framework operates; I'd ask "cross-site tracking" proponents to prepare a detailed analysis of a few use cases for Brussels.

>> Second, as Rigo and David note, the approach relies far too extensively on siloing.  There are myriad effective ways of linking user records that do not share an identifier.  (See all the research my lab and others have done on re-identification and how third parties can identify a user.)  While I'm not overly comfortable with the extent to which the outsourcing exception relies on siloing, at least outsourced services have, in general, greater market incentives to 1) silo anyway, 2) not game silos, and 3) get security right.  Moreover, if an outsourced service does goof on its privacy or security, it may not only lose clients, but it may also face litigation from former clients.
> We cannot solve this whole problem with DNT. Bad actors will do bad things, regardless of DNT. But one thing we can do with DNT is to create incentives for minimizing data collected and retained. Again, this is a reason to focus more on data than on usage.

I don't follow. My very criticism was that siloing - especially usage-based siloing - isn't enough.

>> Third, it does not go far enough in addressing consumer privacy risks.  In our proposed non-normative discussion of first vs. third parties, Tom and I identified three motivations for the distinction: user awareness and control of information sharing, market incentives for privacy and security, and collection of data across unrelated websites.  The "cross-site tracking" approach only somewhat mitigates the third concern and does nothing to address the first two.
> Can you explain this further? Why does "cross-site tracking" not solve these problems?

Even assuming data could be perfectly or near-perfectly siloed per-first party - and it can't - per-first party data can be quite revealing (and identifiable). For example, consider your reading history on a newspaper site or your queries on a health community site. (My lab has seen identifying information leakage from websites in both of these categories, by the way.)


Received on Saturday, 21 January 2012 02:19:06 UTC