Re: ISSUE-111 - Exceptions are broken from Jonathan Mayer on 2012-03-09 (public-tracking@w3.org from March 2012)

From: Jonathan Mayer <jmayer@stanford.edu>
Date: Thu, 8 Mar 2012 18:22:36 -0800
To: Kevin Smith <kevsmith@adobe.com>
Cc: Nicholas Doty <npdoty@w3.org>, "VINCENT (VINCENT) TOUBIANA" <Vincent.Toubiana@alcatel-lucent.com>, Sid Stamm <sid@mozilla.com>, Tracking Protection Working Group WG <public-tracking@w3.org>
Message-Id: <1E1B7659-F61F-4BE8-B173-2BDB10AB35A3@stanford.edu>
I think there are now at least four different issues that have been raised on this thread.  Let me try to pull them apart.

1) When there are multiple layers of embedding and redirection, how can the browser accurately tell a third party its site-specific exception status?

Browsers already know the top window's domain (top.location.hostname) associated with an HTTP request or resource load.  They have to - otherwise third-party cookie blocking wouldn't work.  I see no problem here.

2) When there are multiple layers of embedding and redirection, how will a first party or third party know the status of a specific third party's site-specific exception?

A third party could easily build a means of sharing its exception status with any of the myriad mechanisms for cross-domain messaging.  We could also provide an explicit API.  (The last proposal of this sort was removed from the TPE draft owing to fingerprinting concerns.)  Again, I see no problem here.

3) When there are multiple layers of embedding and redirection, how will a third party know the first party's domain?

The same way a nested third party currently learns a top-level URL - by cross-domain messaging (most frequently in a request parameter).  A first party could also provide a postMessage-based mechanism for querying its location.  We could develop a new access control API for explicitly allowing reads of top.location.  (I don't think we should - that's unnecessary complexity.)  Yet again, I see no problem here.

4) How can third parties silo data based on exception status?

The same-origin policy makes it easy to silo data that's stored in the browser.  If there's a site-specific exception, you could use a site-specific domain (e.g. cnn.com.doubleclick.net).  If there's a web-wide exception, you could use a web-wide domain (e.g. tracking-allowed.doubleclick.net).  If there's no exception, use a no-tracking domain (e.g. tracking-disallowed.doubleclick.net).  If there's data that isn't pseudonymously linkable (e.g. a language cookie), that could get sent to all domains (e.g. .doubleclick.net).  I flatly reject Sean's view that letting the browser handle siloing would result in "truly crazy behavior that is non-implementable for servers."  Again, I see no problem here.

I also want to make a quick response to Kevin's point on monetization.

> * With DNT:1, the ad cannot be a targeted ad so the publisher's ad server chooses to go to a completely different ad network and shows a completely random ad for which the publisher is paid $y.
> * $y is much smaller than $x (obviously the publisher makes more money when it shows a targeted ad than when it shows a random ad)

This is a common misconception I've heard about the online advertising market and how it would be affected by DNT.

If a user has DNT enabled, an ad need not be (and most often won't be) random.  I believe there is a consensus in the group that contextual targeting, demographic targeting, and geographic targeting (possibly with some degree of coarseness) would all be allowed.

As for economic impact, it is far from clear that behavioral targeting brings in substantial marginal revenue for publishers.  See http://cyberlaw.stanford.edu/node/6592 and http://donottrack.us/bib/#sec_economics.

On Mar 8, 2012, at 5:18 PM, Kevin Smith wrote:

> Excellent point Nick.  I think you're right.  The browser will have to know because no matter how many redirects it goes through, eventually it has to put the content in the right place in a DOM somewhere (I was thinking of what would be available to each stop in the chain rather than what the browser would know when it made the request to each stop in the chain).  So with an '*' for the 1st party site, the browser should be able to send the correct header to each stop in the chain.  Partial crisis averted.
> 
> However, I don't believe advertising chains could ever function in a scenario where each 3rd party could be approved individually by the user.  Since the chain is so dynamic, and the 1st party (or even most elements in the chain) do not know what services will be used by the time you get to the end of the chain, exceptions for these items could never by requested.
> 
> -----Original Message-----
> From: Nicholas Doty [mailto:npdoty@w3.org] 
> Sent: Thursday, March 08, 2012 4:54 PM
> To: Kevin Smith
> Cc: VINCENT (VINCENT) TOUBIANA; Sid Stamm; Tracking Protection Working Group WG
> Subject: Re: ISSUE-111 - Exceptions are broken
> 
> On Mar 8, 2012, at 2:17 PM, Kevin Smith wrote:
> 
>>> As I understand it, an exception for "*" on a first-party site would imply that the user agent would send DNT:0 to every domain from which a resource was requested as part of loading the first-party page (including subsequent re-directs, iframes and XHR requests).
>> 
>> I am not sure how to do this using current methodologies.  Take a simple example.  Site A has an exception for all 3rd parties and includes 3rd Party B which then includes 3rd Party C.  3rd Party C is requested from 3rd Party B, not Site A.  How does the browser know that 3rd Party C's request originated from Site A?  Certainly 3rd Part C probably knows from customized request parameters, but how does the browser map the request to its list of exceptions to even see the '*' associated with site A?  I think this would be new functionality.
> 
> We opened ISSUE-110 I believe specifically for this question (will user agents always be able to determine corresponding top-level-origin for all outgoing requests?) as Vincent was concerned that browsers or browser extensions might not be able to do this. I believe Sid informed us that browsers always could (whether it's a redirect, embedded iframe, XHR request, etc.) which is why it was closed -- Sid, can you confirm?
> 
> It would seem to me that browsers could always determine what site (or browser tab, say) has initiated a request: when I close a browser window, the browser knows which requests to stop making. When the browser receives a response to an HTTP request, it knows which DOM gets the corresponding JavaScript events or which frame to load the parsed page into.
> 
> Thanks,
> Nick
>
Received on Friday, 9 March 2012 02:23:04 UTC