RE: ISSUE-111 - Exceptions are broken from Kevin Smith on 2012-03-09 (public-tracking@w3.org from March 2012)

From: Kevin Smith <kevsmith@adobe.com>
Date: Fri, 9 Mar 2012 14:08:22 -0800
To: Jonathan Mayer <jmayer@stanford.edu>
CC: Nicholas Doty <npdoty@w3.org>, "VINCENT (VINCENT) TOUBIANA" <Vincent.Toubiana@alcatel-lucent.com>, Sid Stamm <sid@mozilla.com>, Tracking Protection Working Group WG <public-tracking@w3.org>
Message-ID: <6E120BECD1FFF142BC26B61F4D994CF3064CCABB85@nambx07.corp.adobe.com>
Jonathan.  Thanks for splitting these out.  Responses inline in red.

From: Jonathan Mayer [mailto:jmayer@stanford.edu]
Sent: Thursday, March 08, 2012 7:23 PM
To: Kevin Smith
Cc: Nicholas Doty; VINCENT (VINCENT) TOUBIANA; Sid Stamm; Tracking Protection Working Group WG
Subject: Re: ISSUE-111 - Exceptions are broken

I think there are now at least four different issues that have been raised on this thread.  Let me try to pull them apart.

1) When there are multiple layers of embedding and redirection, how can the browser accurately tell a third party its site-specific exception status?

Browsers already know the top window's domain (top.location.hostname) associated with an HTTP request or resource load.  They have to - otherwise third-party cookie blocking wouldn't work.  I see no problem here.

Agreed

2) When there are multiple layers of embedding and redirection, how will a first party or third party know the status of a specific third party's site-specific exception?

A third party could easily build a means of sharing its exception status with any of the myriad mechanisms for cross-domain messaging.  We could also provide an explicit API.  (The last proposal of this sort was removed from the TPE draft owing to fingerprinting concerns.)  Again, I see no problem here.

* I could not disagree more.  First, the words "A third party could easily build ..." throw up a ton of red flags.  In a case like this, that is virtually synonymous with "this will not get used much".  That is especially true here, where every member of a very dynamic chain of 3rd parties would have to implement it, and the 1st party would also have to make significant changes to take advantage of it ... all for a fairly minimal gain.

* Plus, assuming that all necessary parties actually did implement this, it still would be inadequate because the 1st party needs to know ahead of time whether they can monetize the visitor.  You cannot expect the 1st party to wait until their ads have actually returned or not to decide how to monetize a visitor.  We could possibly build an API that could query a single 3rd party to determine its exception status, but we could not build one that would traverse a very dynamic 3rd party chain to do so.  Since each element of the chain uses its own business logic to determine what the next step in the chain may be, you could not fake it with a minimal header exchange.

3) When there are multiple layers of embedding and redirection, how will a third party know the first party's domain?

The same way a nested third party currently learns a top-level URL - by cross-domain messaging (most frequently in a request parameter).  A first party could also provide a postMessage-based mechanism for querying its location.  We could develop a new access control API for explicitly allowing reads of top.location.  (I don't think we should - that's unnecessary complexity.)  Yet again, I see no problem here.

* From a DNT perspective, I actually do not think the 3rd party needs to know the top level domain.  They probably do anyway, as you said, from query parameters, but that would take things out of the browser's control.  But for our purposes, I don't think it would be required.  However, if it were required, a client side API would be fairly worthless since most stops in the chain never output anything to the client - they provide their value and then redirect.  The top level domain would have to come via headers or request parameters for them to make their decisions server-side.  Fortunately, I don't think this is an issue.
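As a rough illustration of the request-parameter approach mentioned above, here is a minimal Python sketch of one stop in the chain forwarding the top-level domain to the next stop on a server-side redirect.  The parameter name "top", the function name, and the example hosts are all hypothetical; nothing in this thread or in the TPE draft defines them.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit


def next_hop_url(incoming_url: str, next_hop: str, param: str = "top") -> str:
    """Build the redirect URL for the next stop in the chain.

    The top-level domain is read from the incoming request's query string
    and copied onto the next hop's URL, so the decision can be made
    server-side without any client-side API.  `param` is a hypothetical name.
    """
    incoming_query = parse_qs(urlsplit(incoming_url).query)
    top_level = incoming_query.get(param, [""])[0]

    parts = urlsplit(next_hop)
    forwarded = urlencode({param: top_level})
    # Preserve any query string the next hop's URL already carries.
    query = f"{parts.query}&{forwarded}" if parts.query else forwarded
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, parts.fragment))
```

For example, a request to https://b.example/ad?top=site-a.example that redirects to https://c.example/serve would produce https://c.example/serve?top=site-a.example.  As noted above, this happens outside the browser's control.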

4) How can third parties silo data based on exception status?

The same-origin policy makes it easy to silo data that's stored in the browser.  If there's a site-specific exception, you could use a site-specific domain (e.g. cnn.com.doubleclick.net).  If there's a web-wide exception, you could use a web-wide domain (e.g. tracking-allowed.doubleclick.net).  If there's no exception, use a no-tracking domain (e.g. tracking-disallowed.doubleclick.net).  If there's data that isn't pseudonymously linkable (e.g. a language cookie), that could get sent to all domains (e.g. .doubleclick.net).  I flatly reject Sean's view that letting the browser handle siloing would result in "truly crazy behavior that is non-implementable for servers."  Again, I see no problem here.
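The domain-per-silo scheme above could be sketched roughly as follows in Python.  This is only an illustration under stated assumptions: the enum, the function name, and the "tracking-allowed" / "tracking-disallowed" prefixes simply mirror the example hostnames in the paragraph above and are not part of any specification.

```python
from enum import Enum
from typing import Optional


class ExceptionStatus(Enum):
    """Hypothetical labels for the three cases described above."""
    SITE_SPECIFIC = "site-specific"
    WEB_WIDE = "web-wide"
    NONE = "none"


def silo_host(third_party: str, status: ExceptionStatus,
              first_party: Optional[str] = None) -> str:
    """Pick the hostname whose same-origin storage forms the silo."""
    if status is ExceptionStatus.SITE_SPECIFIC:
        if not first_party:
            raise ValueError("a site-specific exception needs the first-party domain")
        # One silo per first party, e.g. cnn.com.doubleclick.net
        return f"{first_party}.{third_party}"
    if status is ExceptionStatus.WEB_WIDE:
        return f"tracking-allowed.{third_party}"
    return f"tracking-disallowed.{third_party}"
```

With these assumptions, silo_host("doubleclick.net", ExceptionStatus.SITE_SPECIFIC, "cnn.com") gives "cnn.com.doubleclick.net".  Data that is not pseudonymously linkable would still be scoped to the parent domain and is outside this sketch.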

* Again, I do not think siloing is required for exceptions.  When you grant a 1st party/3rd party combo an exception, it does not mean that the 3rd party can only use data from that particular 1st party to provide its service.  That exception would have minimal to no value for many 3rd parties and consequently would have minimal value to the 1st parties.  And if this were a requirement, then item #3 above does become a problem.

I also want to make a quick response to Kevin's point on monetization.

* With DNT:1, the ad cannot be a targeted ad so the publisher's ad server chooses to go to a completely different ad network and shows a completely random ad for which the publisher is paid $y.
* $y is much smaller than $x (obviously the publisher makes more money when it shows a targeted ad than when it shows a random ad)

This is a common misconception I've heard about the online advertising market and how it would be affected by DNT.

If a user has DNT enabled, an ad need not be (and most often won't be) random.  I believe there is a consensus in the group that contextual targeting, demographic targeting, and geographic targeting (possibly with some degree of coarseness) would all be allowed.

As for economic impact, it is far from clear that behavioral targeting brings in substantial marginal revenue for publishers.  See http://cyberlaw.stanford.edu/node/6592 and http://donottrack.us/bib/#sec_economics.

* Agreed.  If DNT allows for contextual targeting, it need not necessarily be random or a minimal CPM.  However, publishers that currently focus on maximizing behaviorally targeted advertising are certainly doing so under the belief that $y is smaller than $x, so for those cases, the argument above still holds true.

On Mar 8, 2012, at 5:18 PM, Kevin Smith wrote:


Excellent point Nick.  I think you're right.  The browser will have to know because no matter how many redirects it goes through, eventually it has to put the content in the right place in a DOM somewhere (I was thinking of what would be available to each stop in the chain rather than what the browser would know when it made the request to each stop in the chain).  So with an '*' for the 1st party site, the browser should be able to send the correct header to each stop in the chain.  Partial crisis averted.

However, I don't believe advertising chains could ever function in a scenario where each 3rd party could be approved individually by the user.  Since the chain is so dynamic, and the 1st party (or even most elements in the chain) does not know what services will be used by the time you get to the end of the chain, exceptions for these items could never be requested.

-----Original Message-----
From: Nicholas Doty [mailto:npdoty@w3.org]
Sent: Thursday, March 08, 2012 4:54 PM
To: Kevin Smith
Cc: VINCENT (VINCENT) TOUBIANA; Sid Stamm; Tracking Protection Working Group WG
Subject: Re: ISSUE-111 - Exceptions are broken

On Mar 8, 2012, at 2:17 PM, Kevin Smith wrote:


As I understand it, an exception for "*" on a first-party site would imply that the user agent would send DNT:0 to every domain from which a resource was requested as part of loading the first-party page (including subsequent re-directs, iframes and XHR requests).

I am not sure how to do this using current methodologies.  Take a simple example.  Site A has an exception for all 3rd parties and includes 3rd Party B which then includes 3rd Party C.  3rd Party C is requested from 3rd Party B, not Site A.  How does the browser know that 3rd Party C's request originated from Site A?  Certainly 3rd Party C probably knows from customized request parameters, but how does the browser map the request to its list of exceptions to even see the '*' associated with Site A?  I think this would be new functionality.

We opened ISSUE-110 I believe specifically for this question (will user agents always be able to determine corresponding top-level-origin for all outgoing requests?) as Vincent was concerned that browsers or browser extensions might not be able to do this. I believe Sid informed us that browsers always could (whether it's a redirect, embedded iframe, XHR request, etc.) which is why it was closed -- Sid, can you confirm?

It would seem to me that browsers could always determine what site (or browser tab, say) has initiated a request: when I close a browser window, the browser knows which requests to stop making. When the browser receives a response to an HTTP request, it knows which DOM gets the corresponding JavaScript events or which frame to load the parsed page into.
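To make the Site A / 3rd Party B / 3rd Party C example concrete, here is a minimal Python sketch of how a user agent might choose the DNT header for each request once it knows the top-level origin.  It assumes the user has DNT turned on in general, and that exceptions are stored as a map from first-party site to the set of exempted third parties (with "*" meaning all of them).  The data layout, function names, and hostnames are hypothetical and are not taken from any browser or from the TPE draft.

```python
from typing import Dict, List, Set, Tuple


def dnt_header_value(exceptions: Dict[str, Set[str]],
                     top_level_origin: str,
                     target_host: str) -> str:
    """Return "0" if the user granted an exception covering this request, else "1".

    The lookup is keyed on the top-level origin, not on whichever party
    triggered the request, so redirects and nested includes do not matter.
    """
    granted = exceptions.get(top_level_origin, set())
    if "*" in granted or target_host in granted:
        return "0"
    return "1"


def headers_for_chain(exceptions: Dict[str, Set[str]],
                      top_level_origin: str,
                      chain: List[str]) -> List[Tuple[str, str]]:
    """Compute the DNT value sent to each stop in a redirect/include chain."""
    return [(host, dnt_header_value(exceptions, top_level_origin, host))
            for host in chain]
```

Under these assumptions, with exceptions = {"site-a.example": {"*"}}, every stop in the chain loaded under site-a.example is sent DNT:0, while the same hosts loaded under any other site are sent DNT:1.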

Thanks,
Nick
Received on Friday, 9 March 2012 22:08:57 UTC