Re: DNT-aware JavaScript (ISSUE-84, ACTION-85) from Jonathan Mayer on 2012-01-25 (public-tracking@w3.org from January 2012)

From: Jonathan Mayer <jmayer@stanford.edu>
Date: Wed, 25 Jan 2012 23:34:06 +0100
To: Kevin Smith <kevsmith@adobe.com>
Cc: Thomas Roessler <tlr@w3.org>, David Singer <singer@apple.com>, "tom@mozilla.com" <tom@mozilla.com>, "Roy T. T. Fielding" <fielding@gbiv.com>, "public-tracking@w3.org (public-tracking@w3.org)" <public-tracking@w3.org>
Message-Id: <DE63E144-648D-4783-970E-1E9A7CC48D51@stanford.edu>
On Jan 25, 2012, at 11:07 PM, Kevin Smith wrote:

> I understand your logic.  However, I think this is an example of small bang for a lot of buck.  It's an awful lot of work and complication to change "just throw away the data" into "do not send the data" which for a compliant server should be extremely close functionality.  Especially considering that 'collection' as you have described it in other posts would have to happen on the asynchronous request to determine if you should prevent collection.

Scenario 1: The browser sends a truckload of personal information to the server, which is then supposed to delete it.  Scenario 2: The browser sends an IP address, user-agent, referrer, and other protocol information that we impose strict retention/use limits on (since we can't prevent protocol information from getting sent - save using proxies, but that's unrealistic).  Certainly not "extremely close functionality" from a privacy perspective.
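
To make the difference concrete, here is a minimal sketch of what Scenario 2 could look like inside the script.  It assumes the script already holds the DNT status in a string `dnt` ("1", "0", or "" when unset); `buildBeaconPayload` and its arguments are hypothetical names, and treating an unset header as "may send" is an assumption for illustration, not something the group has decided.

```javascript
// Hypothetical helper: decides what a DNT-aware script phones home.
// `dnt` is assumed to be "1", "0", or "" (header absent).
function buildBeaconPayload(dnt, collected) {
  if (dnt === "1") {
    // Scenario 2: nothing is sent beyond what the protocol itself reveals.
    return null;
  }
  // Assumption: any other status permits sending the collected data.
  return JSON.stringify(collected);
}
```

A caller would skip the request entirely when the helper returns `null`, so the server never receives data it would have to delete.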

> I still support leaving it out.
> 
> -----Original Message-----
> From: Jonathan Mayer [mailto:jmayer@stanford.edu] 
> Sent: Wednesday, January 25, 2012 10:53 PM
> To: Kevin Smith
> Cc: Thomas Roessler; David Singer; tom@mozilla.com; Roy T. T. Fielding; public-tracking@w3.org (public-tracking@w3.org)
> Subject: Re: DNT-aware JavaScript (ISSUE-84, ACTION-85)
> 
> 
> On Jan 25, 2012, at 10:18 PM, Kevin Smith wrote:
> 
>> While functional, I think both of these suggestions are extremely inelegant hacks.  Companies may choose to use a method such as these, but I would not think the W3C would want to formally recommend such an approach.  You don't usually see hacks standardized.  I agree with the two browser developers that we should leave this section out and let companies work around it if they ever need to.
> 
> DNT-aware JavaScript is a frequently proposed use case and a frequently requested feature.  I think it'd be unwise to leave out something implementers want, especially when the approach appears to be counterintuitive for some.
> 
>> I actually don't understand the use case.  It is true that 3rd party code can be embedded directly in a 1st party DOM.  However, tracking does not occur unless a request goes out to the 3rd party server.  When would you need to know the DNT status in a situation where you would not make a request to the 3rd party?  If you are going to make a request anyway, just handle it when you do so (i.e., throw away the data).  Why would you want to make a request just to see if you can make a request?
> 
> If we impose limitations on collection (which I think we should), and if a script phoning information home is collection (which I think it unambiguously is), then the script needs to be DNT-aware.
> 
> As for how that script becomes DNT-aware, in many cases the server could just respond with a different script depending on DNT status.  But there are use cases where you'd want to dynamically load only the DNT status (as below).  For example: Suppose a third-party social widget includes several slow, hefty scripts.  The website may want the browser to cache the scripts to speed rendering and reduce server load.  By asynchronously requesting DNT status the scripts can load from cache, then later become DNT-aware.
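
A rough sketch of that pattern, under stated assumptions: the cached script starts with an unknown DNT status and holds anything it would send until a small status lookup completes.  `createDntGate`, `requestDntStatus`, and `send` are hypothetical names; the lookup is assumed to call back with "1", "0", or "", and treating anything other than "1" as permission to send is an assumption.

```javascript
// Sketch: a cached script holds outbound data until DNT status is known.
// `requestDntStatus(callback)` is a hypothetical asynchronous lookup;
// `send(item)` is whatever function actually phones home.
function createDntGate(requestDntStatus, send) {
  var queue = [];
  var status = null; // null means "not yet known"

  requestDntStatus(function (value) {
    status = value;
    if (status !== "1") {
      // Assumption: anything other than "1" permits sending.
      queue.forEach(function (item) { send(item); });
    }
    queue = []; // under DNT: 1 the held items are dropped unsent
  });

  return {
    report: function (item) {
      if (status === null) {
        queue.push(item); // status unknown: hold, do not send
      } else if (status !== "1") {
        send(item);
      }
      // status "1": drop silently
    }
  };
}
```

The heavy scripts stay cacheable; only the status lookup has to vary with the DNT header.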
> 
>> -----Original Message-----
>> From: Thomas Roessler [mailto:tlr@w3.org]
>> Sent: Wednesday, January 25, 2012 6:07 PM
>> To: Jonathan Mayer; David Singer; tom@mozilla.com
>> Cc: Thomas Roessler; Roy T. T. Fielding; public-tracking@w3.org 
>> (public-tracking@w3.org)
>> Subject: Re: DNT-aware JavaScript (ISSUE-84, ACTION-85)
>> 
>> To follow up to this: Tom Lowenthal, David Singer and I had a parallel discussion a little earlier.
>> 
>> Our conclusion is that, while there are mechanisms to determine the DNT status of another origin without anything else that we discuss here, that gets messy extremely quickly, and requires the participation of the site that is tracking.
>> 
>> Proposal:  DNT status is a property of the origin pair, i.e., the pair of top-level origin and the origin the script is running under.  This, in particular, is the scope for site-specific extensions, for which we would propose an asynchronous API that permits discovery of the DNT status of the current origin pair.
>> 
>> Example: example-times.com has an ad from example-ad.com in an iframe.  The top-level origin is example-times.com.  The origin of a script executing within the iframe is example-ad.com.
>> 
>> A call like this:
>> 	requestSpecificException(function (dntStatus) { ... });
>> 
>> ... permits the ad to request an exception *within the current context*.  If the exception has been granted already, the callback can be executed by the browser without user interaction, and immediately.  Otherwise, the callback is executed once the user has made a choice.
>> 
>> 
>> In the case of several scripts (some retrieved cross-origin) running within one DOM, the scripts executed from additional origins can instantiate an iframe from that additional origin, that they can then communicate with using postMessage, and that can use the exception request API made available by the browser.  That way, the upper level script can discover the additional origin's DNT status without having to access any non-cacheable resources.
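
A hedged sketch of the receiving side of that iframe arrangement.  The upper-level script should only trust a status message that comes from the additional origin's helper iframe.  The message shape (`type`, `status`) and the name `parseDntMessage` are assumptions for illustration; the exception request API itself is only proposed in this thread, so it is not called here.

```javascript
// Runs in the upper-level script.  `frameOrigin` is the additional origin
// whose helper iframe reports its DNT status via postMessage.
// The message format { type: "dnt-status", status: "0" | "1" } is hypothetical.
function parseDntMessage(event, frameOrigin) {
  if (event.origin !== frameOrigin) {
    return null; // ignore messages from any other origin
  }
  var data = event.data;
  if (!data || data.type !== "dnt-status") {
    return null; // not a status message
  }
  // Anything outside the known values is reported as unset.
  return (data.status === "0" || data.status === "1") ? data.status : "";
}

// Browser wiring, shown as a comment because it needs a window object:
// window.addEventListener("message", function (event) {
//   var status = parseDntMessage(event, "https://example-ad.com");
//   if (status !== null) { /* act on the additional origin's status */ }
// });
```

Checking `event.origin` before reading the payload keeps another frame on the page from spoofing the status.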
>> 
>> 
>> An additional use case we heard in the meetings deals with the question of how first parties can discover the exception status of ads they are loading.
>> 
>> Since we are working on the assumption of adding browser features, we can assume HTML5 and Web Messaging to be available. One idea to address this use case might be to actually specify the protocol that is executed between first parties and ads they run on top of Web Messaging, within the specifications that this group develops.  I'd be curious to hear whether the advertising networks and publishers involved here would be interested in that work.
>> 
>> 
>> I'll leave the summary on this level for the moment, but would welcome discussion, and hope to refine this further as we go.
>> 
>> Regards,
>> --
>> Thomas Roessler, W3C  <tlr@w3.org>  (@roessler)
>> 
>> 
>> On 2012-01-25, at 17:17 +0100, Jonathan Mayer wrote:
>> 
>>> 
>>> On Jan 25, 2012, at 4:49 PM, Roy T. Fielding wrote:
>>> 
>>>> On Jan 25, 2012, at 3:05 PM, Jonathan Mayer wrote:
>>>> 
>>>>> Proposed non-normative text:
>>>>> 
>>>>> It is straightforward to make JavaScript aware of DNT status.  In the simplest case, a server can mirror the DNT HTTP header into JavaScript.  For example, in PHP:
>>>>> 
>>>>> <?php
>>>>> header("Content-type: text/javascript");
>>>>> // Mirror the DNT request header into a JavaScript variable (empty if absent).
>>>>> if(array_key_exists("HTTP_DNT", $_SERVER))
>>>>> 	echo "var dnt = '" . $_SERVER["HTTP_DNT"] . "';";
>>>>> else
>>>>> 	echo "var dnt = '';";
>>>>> ?>
>>>> 
>>>> Er, please don't do that on a real site -- javascript injection attacks are fun.
>>> 
>>> <?php
>>> header("Content-type: text/javascript");
>>> header("Vary: DNT");
>>> // Only echo values from a fixed whitelist, so no request data reaches the script unchecked.
>>> $validDntHeaders = array("0", "1");
>>> $dntHeader = "";
>>> if(array_key_exists("HTTP_DNT", $_SERVER))
>>> 	if(in_array($_SERVER["HTTP_DNT"], $validDntHeaders, true))
>>> 		$dntHeader = $_SERVER["HTTP_DNT"];
>>> echo "var dnt = '" . $dntHeader . "';";
>>> ?>
>>> 
>>>> In any case, requiring the server to deliver non-cacheable pages is not an option.
>>>> One of the advantages of doing personalization via javascript is 
>>>> that both the page and the javascript can be static and extensively 
>>>> cached.  Hence, this is not a solution to the issue.
>>> 
>>> Why not use Vary: DNT?
>>> 
>>>>> This standard does not include a JavaScript API for DNT status.  A webpage may include scripts from multiple origins; a naive approach (e.g. window.dnt) would give an embedded script the DNT status for the webpage's origin, which may differ from the DNT status for the script's origin.  Providing a DNT status API that accounts for different-origin embedded scripts would introduce implementation challenges for browser developers and script authors and could be a source of fingerprinting information.  Moreover, the first load of a script requires an HTTP request; the server may need to examine the request's DNT header anyway to determine how it may log and use that request.
>>>> 
>>>> I think there is a misunderstanding here.  The only DNT status that 
>>>> this API needs to relate is that of the webpage origin.  The DNT 
>>>> status of the script origin does not matter
>>> 
>>> Websites often embed a third-party script that phones home.  We have to support that use case.
>>> 
>>>> (if it did, then the script origin would be able to handle that when 
>>>> the script was requested and deliver a different script).
>>> 
>>> Yes.  That would be an example of an implementation that is not "the simplest case," and in many cases the better implementation.  But it runs into the same caching issues.
>>> 
>>>> ....Roy
>>>> 
>>> 
>>> 
>>> 
>> 
>> 
>
Received on Wednesday, 25 January 2012 22:35:03 UTC