
RE: A First-Party List API for Site-Specific Exceptions (ISSUE-59, ISSUE-109, ISSUE-111, ISSUE-113, ISSUE-114)

From: Kevin Smith <kevsmith@adobe.com>
Date: Thu, 15 Mar 2012 09:18:48 -0700
To: Jonathan Mayer <jmayer@stanford.edu>, Sid Stamm <sid@mozilla.com>
CC: Tracking Protection Working Group WG <public-tracking@w3.org>
Message-ID: <6E120BECD1FFF142BC26B61F4D994CF3064CCAC489@nambx07.corp.adobe.com>
I was really looking for more concrete scenarios rather than a restatement of the business or privacy incentives.  For example, are there examples of static, short ad-serving chains, or single 3rd party services that handle ad serving from beginning to end without redirects?  If so, what percentage of publishers use this type of service?  (Enough that we want to complicate our docs to serve this niche?)  Real sites, or at least types of actual sites or business models, would really help here.  

Even if the ideas below make sense, are we implying that these publishers do not advertise?  If they do, then they would use a '*' to make the advertising chain function, and the points below are moot.

Regardless, there were some good thoughts below, so I have replied inline as well, prefacing my comments with **- 

-----Original Message-----
From: Jonathan Mayer [mailto:jmayer@stanford.edu] 
Sent: Wednesday, March 14, 2012 4:15 PM
To: Sid Stamm
Cc: Kevin Smith; Tracking Protection Working Group WG
Subject: Re: A First-Party List API for Site-Specific Exceptions (ISSUE-59, ISSUE-109, ISSUE-111, ISSUE-113, ISSUE-114)

Here are a few concrete use cases:

1) A publisher wants to distinguish itself as having good privacy practices.  Instead of requesting a blanket exception, it wants to request exceptions for a small number of well-known third parties.
**I strongly disagree with the implication that allowing partial exceptions is a superior privacy practice.  It may offer some benefits, which is why we are willing to explore it, but I do not see it as a way to differentiate on privacy practices.  
2) A publisher wants to reflow its layout to exclude widgets and SSO providers that the user has not excepted.  (This was Ian Fette's example.)
** This is valid, but I believe it would be rare because of the expense of serving such dynamic sites.  Besides, just because a 3rd party cannot track does not mean it cannot provide value.  
3) If certain web-wide exceptions are present, a publisher does not need to ask for new exceptions.
** Only valid if all of the 3rd parties on the site have web-wide exceptions, because when using a '*' you never have to ask more than once anyway.  A more concrete example would be "a site on which the only 3rd party is a social network button."  In most cases, I would guess that the types of 3rd parties likely to gain web-wide exceptions, such as social networks, are less likely to be the ones publishers would request exceptions for anyway, because they will be tools which serve the user rather than the site (that's just speculation though).
4) A user wants to blacklist a third party in a manner that trumps a site-specific exception.
** This is a user scenario, not a site scenario.  This seems like the TPL discussion again.  If we decide we want to do TPLs, then we clearly have to decide how they interact with exceptions.  However, this does not require partial exceptions.  We would have to decide whether the blacklist trumped the exception or vice versa, but regardless, for a 1st party wide exception, they would just get a DNT:1 instead of a DNT:0.  The decision tree is identical.

In all of these scenarios, a first party needs to know the exception status for specific third parties.  There are, broadly, four proposals for how we might facilitate that:

a) polling with existing web technology (e.g. cached JavaScript with a Vary: DNT response header),

b) polling with requestSiteSpecificTrackingException (asks for the exception if not already granted or denied),

c) polling with a dedicated API (e.g. testSiteSpecificTrackingException), and

d) getting a list from a dedicated API (e.g. siteSpecificTrackingExceptions).

Polling approaches, especially (a), will be slower.  A list approach brings greater first-party fingerprinting risk.  (With polling, a first party would have to enumerate domains; that's more difficult, slower, more easily detected, and can be mitigated through frequency capping.)  First-party fingerprinting risk in (c) and (d) could be greatly mitigated by requiring user consent for the API (e.g. "This website would like to learn about your tracking preferences for other websites.  Allow/Deny").
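As a rough sketch of how a first party might combine options (d) and (b) above (hedged heavily: siteSpecificTrackingExceptions and requestSiteSpecificTrackingException are proposal names from this thread, not shipped browser APIs, and the argument signature is assumed for illustration), the feature-detection logic could look something like:

```javascript
// Illustrative only: these API names come from the proposals in this
// thread; no browser actually ships them.
function learnExceptions(nav) {
  // Option (d): dedicated list API, possibly gated on user consent.
  if (typeof nav.siteSpecificTrackingExceptions === "function") {
    try {
      return nav.siteSpecificTrackingExceptions();
    } catch (e) {
      return []; // the UA (or the user) denied access to the list
    }
  }
  // Option (b): ask for the exception; the UA prompts only if the
  // user has not already granted or denied it.
  if (typeof nav.requestSiteSpecificTrackingException === "function") {
    nav.requestSiteSpecificTrackingException(["doubleclick.net"]);
    return null; // the result arrives asynchronously, not as a list
  }
  return null; // fall back to polling with existing web tech (a)
}
```

The consent-denial branch is where the "Allow/Deny" prompt mitigation mentioned above would surface to the page: the script simply sees an empty (or absent) list.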

** I still maintain that polling is a completely unacceptable approach because
1) The first party has to wait until all of the ads, redirects and 3rd parties have returned (or returned their status, which won't work either) before it knows what to do.  This is too late.  The first party needs to know on the initial request.

2) It just won't work.  The ad chains are dynamic.  Each step uses business logic to determine the next step.  You cannot ping the next step in the chain and ask "what is your DNT status and who is the next service on the list?"  It does not work that way.  These services need all of the information they would get in a normal request in order to process it and determine the next service in line.  So a) you cannot get the list to ask for exceptions, b) that list may change per ad, so you would always be popping up new exception requests, and c) you would have to wait for the chain to return before you knew who was in the chain and how they would react to a DNT status.

On Mar 14, 2012, at 11:49 AM, Sid Stamm wrote:

> I agree with Kevin that we should identify precisely the set of use 
> cases we're looking to address by resurrecting this idea.
> 
> But I still have fingerprinting concerns with the proposal (please 
> help me understand if I'm wrong).  Even if it's limited to 
> first-party-context scripts, there's still the ability for first-party 
> sites to fingerprint using this list-based thing.
> 
> The adversaries in my fingerprinting concern are not the sites trying 
> to do the right thing with the suite of tools we're developing here, 
> but rather *other* sites that have no intention of being friendly.  I 
> know our work here is not intended to address "bad actors", but we 
> should not develop new technologies that make it easier for bad actors to act badly.
> 
> I'm worried about shadyfirstpartysite.com re-identifying me based on 
> things like my browsing history between visits to their site.  I'm 
> concerned that a first-party list-based API will provide 
> shadyfirstpartysite.com with more bits of entropy that make it easier 
> to re-identify me across multiple visits to their site.
> 
> I'm sure you will argue that it's already trivial to fingerprint users 
> if you're a "bad actor", and I concede that point, but if we're going 
> to make it easier (or give it statistically stronger confidence), we 
> had better have some really good use cases that we cannot get from other, 
> similar features that don't carry the same fingerprinting "enhancement".
> 
> -Sid
> 
> On 3/14/12 11:38 AM, Kevin Smith wrote:
>> Before we resurrect this topic, can we please address the fact that 
>> it would only work in a very minimal set of use cases, and make sure 
>> it is worth our time?
>> 
>> This API only has value if we allow partial exceptions (to some but 
>> not all 3rd parties on a site).  As I have stated in other threads, 
>> there are many, many problems with allowing partial exceptions.  There 
>> are also real privacy and business incentives to do so, so it may 
>> be worth trying to work through some of the problems.  However, there 
>> is a single core problem that would need to be solved prior to 
>> considering any of the additional work, which is: how do you request 
>> exceptions for a potentially ever-changing, dynamic, unknown list of 
>> 3rd parties which make up common advertising chains?
>> 
>> The CNN example below simply does not work.  It is worthless to grant 
>> doubleclick (more accurately DFP) an exception because doubleclick 
>> usually does not serve the ads.  The publisher's ad server's 
>> responsibility is to find the ads which will yield the highest CPM 
>> for the publisher.  The process of doing so may involve several 
>> different services and redirects, and these services may change per 
>> user, per page or even per time of day.  Granting doubleclick the 
>> ability to track does not grant them the ability to perform their 
>> task unless the other steps in the chain also have an exception.
>> 
>> I am sure there are simple scenarios in which the 1st party will know 
>> all 3rd parties and may be willing to offer variable content based on 
>> which 3rd parties have exceptions.  I think before we talk about 
>> changing the functionality we should list some of these scenarios and 
>> determine whether it is worth our time or the browser's time to 
>> implement the API.
>> 
>> On a side note - if we can satisfactorily answer the above questions 
>> and feel that we do want to move forward with it, I agree with 
>> Jonathan that we can do so in a way which does not carry strong 
>> fingerprinting risks.  I think Jonathan's proposal would work.
>> 
>> -kevin
>> 
>> -----Original Message----- From: Jonathan Mayer 
>> [mailto:jmayer@stanford.edu] Sent: Wednesday, March 14, 2012 11:42 AM 
>> To: Tracking Protection Working Group WG Subject: A First-Party List 
>> API for Site-Specific Exceptions (ISSUE-59, ISSUE-109, ISSUE-111, 
>> ISSUE-113, ISSUE-114)
>> 
>> There have been renewed expressions of interest in a first-party API 
>> that lists which third parties have an exception.  Rationales include 
>> making it easy and fast for a first party to learn about exceptions 
>> (relative to alternatives that use existing web technology) and 
>> facilitating non-blanket exceptions and possibly third-party 
>> blacklisting by users.
>> 
>> The proposed siteSpecificTrackingExceptions API did just this; it was 
>> removed out of concern for third-party fingerprinting.  I'd like to 
>> bring back the proposal, but limit access to script running in the 
>> context of a top-level (first-party) origin.  Browsers already have 
>> the necessary building blocks; they know the top-level origin and the 
>> context origin associated with script execution.
>> 
>> Here's a concrete example: A CNN visitor has granted a web-wide 
>> exception to Google ("*", "doubleclick.net"), but nobody else.  When 
>> a cnn.com script calls navigator.siteSpecificTrackingExceptions, it 
>> gets back ["doubleclick.net"].  If a yahoo.com script calls 
>> navigator.siteSpecificTrackingExceptions, it gets no return value, 
>> and an error is thrown.
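
The access rule in this example can be modeled as a toy function (a sketch only: the argument names and the grants structure are invented here for illustration, and a real UA would derive both origins from the script's execution context rather than take them as parameters):

```javascript
// Toy model of the proposed first-party-only list API.  Not a real
// browser implementation: a UA would know the top-level origin and
// the calling script's origin itself instead of receiving arguments.
function siteSpecificTrackingExceptions(topLevelOrigin, scriptOrigin, grants) {
  // Only script running in the top-level (first-party) context may
  // read the list; any other origin gets an error.
  if (scriptOrigin !== topLevelOrigin) {
    throw new Error("SecurityError: caller is not the top-level origin");
  }
  // Return the third parties whose exception applies on this site,
  // including web-wide ("*", thirdParty) grants.
  return grants
    .filter(g => g.site === "*" || g.site === topLevelOrigin)
    .map(g => g.thirdParty);
}

// The scenario above: one web-wide exception for doubleclick.net.
const grants = [{ site: "*", thirdParty: "doubleclick.net" }];
// A cnn.com script in a cnn.com top-level context gets the list;
// a yahoo.com script in that same context gets an error instead.
```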
>> 
>> I don't see much marginal privacy risk to including this API.  Third 
>> parties already have countless stateful and stateless ways of 
>> tracking a browser, including with scripts running in the context of 
>> both their own and first-party origins.  This is just another API for 
>> compliance monitors to keep tabs on.
>> 
>> As for a first party handing a fingerprintable exception list to a 
>> third party, the same reasoning applies.  A first party could just as 
>> easily pass a unique identifier (e.g. a first-party ID cookie).  We 
>> should be clear in the document that the usual limits on first 
>> parties handing data to third parties apply.
>> 
>> 
>> 
Received on Thursday, 15 March 2012 16:20:01 UTC
