[ISSUE-60] Will a recipient know if it itself is a 1st or 3rd party? from Kevin Smith on 2011-10-14 (public-tracking@w3.org from October 2011)

From: Kevin Smith <kevsmith@adobe.com>
Date: Fri, 14 Oct 2011 14:31:03 -0700
To: "public-tracking@w3.org" <public-tracking@w3.org>
Message-ID: <6E120BECD1FFF142BC26B61F4D994CF30635AEBAD0@nambx07.corp.adobe.com>

I will take a whack at kicking this conversation off. However, I am not really an HTTP protocol expert, so I will be looking for others (i.e., browser folks, authors of the HTTP standard, etc.) to straighten me out and fill in the blanks. I have tried to take a very middle-of-the-road approach, so I apologize if this is either too technical or too basic.

In Boston, we talked about how in some cases, such as iframes, a site or service may not know whether it is a 1st or 3rd party. As I thought more about this, I think the problem might actually be much more widespread than iframes. I do not think there is any generic way to determine whether even a normal request is a 3rd or 1st party request, because the server does not know what domain or site the user is actually on. The browser knows the site domain as well as the domain of each sub-request, but that information is not passed on to the server. All the server knows is what content was requested and which page the request came from (the Referer header, http_referer). If the referring domain matches the sub-request domain (e.g., an image which is being served from the 1st party server), it is probably safe to assume it is a 1st party request. However, the converse certainly does not mean it's a 3rd party request. Many 1st party requests will come from a different domain: for example, 1st party outsourcing, redirects, CDNs, or even the site's landing page (i.e., when you come from a search engine). In other words, it's much easier to say "this is a 1st party" than "this is not a 1st party", although even that may be inaccurate sometimes.
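
To make the referrer comparison concrete, here is a rough sketch in Python. The function name and the return labels are made up for illustration, and it assumes the server has the raw Host and Referer header values at hand. It only ever answers "likely-1st" or "unknown", since a mismatch (or a missing Referer) tells us nothing definite.

```python
from urllib.parse import urlsplit


def guess_party(host_header, referer):
    """Guess party status from the Host and Referer headers alone.

    Returns "likely-1st" when the referring host equals the requested
    host, and "unknown" otherwise. It never returns "3rd", because a
    mismatch can still be a 1st party request (CDN, redirect, landing
    page reached from a search engine, and so on).
    """
    # Parse the Host header as a network location so any port is dropped.
    request_host = urlsplit("//" + (host_header or "")).hostname or ""
    # The Referer header is optional and may be absent entirely.
    referer_host = urlsplit(referer or "").hostname or ""
    if request_host and request_host == referer_host:
        return "likely-1st"
    return "unknown"
```

For instance, an image requested from www.example.com by a page on www.example.com comes back as "likely-1st", while the same site's landing page reached from a search engine comes back as "unknown" even though it is plainly a 1st party request.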

There are many approaches a site could take to make an educated guess, such as using a different domain for 3rd party services than for 1st party services, or maintaining a list of referring domains on which it considers itself a 1st party, or just by knowing that the service being used is a 1st or 3rd party service etc. However, using any methods I can think of, the guess may sometimes be wrong (such as if a site which is doing its own 1st party tracking is embedded in an iframe).
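
A rough Python sketch of combining two of these approaches (a separate domain for 3rd party services, plus a list of referring domains) might look like the following. The host names, the two sets, and the classify function are all hypothetical and only meant to illustrate the idea, not to suggest how any real service does it.

```python
from urllib.parse import urlsplit

# Hypothetical configuration: hosts used only for services embedded in
# other people's sites, and referring hosts on which we consider
# ourselves the 1st party.
THIRD_PARTY_SERVICE_HOSTS = {"widgets.example.net"}
FIRST_PARTY_REFERRERS = {"www.example.com", "shop.example.com"}


def classify(host_header, referer):
    """Return an educated guess: "1st", "3rd", or "unknown"."""
    request_host = urlsplit("//" + (host_header or "")).hostname or ""
    referer_host = urlsplit(referer or "").hostname or ""
    # Approach 1: a dedicated domain for 3rd party services.
    if request_host in THIRD_PARTY_SERVICE_HOSTS:
        return "3rd"
    # Approach 2: a maintained list of 1st party referring domains.
    if referer_host in FIRST_PARTY_REFERRERS:
        return "1st"
    return "unknown"
```

The iframe case shows where the guess goes wrong: if a page from www.example.com is framed by some other site, its sub-requests still carry a www.example.com Referer, so classify answers "1st" even though the user is actually on the other site.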

We could possibly leave it up to the browser to determine whether the request was 1st or 3rd party by comparing the request domain with the website's domain (the way they prevent cross-site scripting), and then either not sending a DNT header to 1st parties (which means you would lose 1st party response headers), or adding whether the request is 1st or 3rd party to the DNT header. However, as was already mentioned, even when the domains differ, it may still be a 1st party request, and any service receiving a 3rd party header would have all the problems mentioned above in determining whether it was indeed a 3rd party. And providing 1st/3rd party info in the header may violate the browser security model.

Consequently, I do not think it's technically feasible to come up with a method or combination of methods that would always accurately determine party. And even if it were possible, it is probably outside the scope of this document.

With this in mind, I think the best approach is that we simply don't define how to determine whether a request is 1st or 3rd party. We just define the difference between the two and how a 1st or 3rd party must behave when it receives a DNT request header. Then we leave it to the service to use the approach or combination of approaches that makes the most sense for them.

-kevin

Received on Friday, 14 October 2011 21:31:33 UTC