- From: Roy T. Fielding <fielding@gbiv.com>
- Date: Tue, 20 Dec 2011 17:55:25 -0800
- To: David Singer <singer@apple.com>
- Cc: "public-tracking@w3.org (public-tracking@w3.org)" <public-tracking@w3.org>
On Dec 20, 2011, at 4:20 PM, David Singer wrote:

> On Dec 20, 2011, at 15:43, Roy T. Fielding wrote:
>> Have I missed any?
>
> - detecting when an intermediary node has taken advantage of the 'may' alter outbound requests (but 'must not' alter responses)

Hmm, hard to imagine why the intermediary wouldn't rewrite the response as well, so this doesn't accomplish detection. Accessing any site that echoes the request headers inside the response content would provide more reliable detection. That was a common test resource before XSS became a concern.

>> Each of those are *desires* because the protocol will work without
>> them, particularly when backed by effective regulations.
>
> I don't know what you mean by 'work'. As a user, I send off a request to unknown 3rd parties and I get an unknown effect. Is that 'work'?

Yes, because the user's preference has been expressed. The server will either comply with that preference or not -- either way, one cannot trust the server without behavioral validation, because most of the requirements in the compliance spec are about adhering to certain behavior long after this request.

> The user doesn't choose the 3rd parties involved, doesn't know if they have even got around to implementing DNT, doesn't know whether they can or do claim an exception. Did I miss any other points?

The user also doesn't know if the server's response is truthful. The user agent does know what sites it is going to make subrequests for, can choose to check a well-known URI (or even a third-party verification service) for compliance before doing so, and can pass in the URI sufficient information to determine exactly what exception would be claimed, if any. Or the user can just rely on others to check that for them.

>> That doesn't mean we shouldn't try to satisfy as many desires as possible.
>> It does mean that the cost of satisfying them must be justified by
>> the benefits obtained specifically from those responses (and not
>> from implementation of DNT in general).
>>
>> To emphasize: Requiring a response header be added for every
>> single DNT-enabled request on the Internet is an EXTREMELY high
>> cost
>
> In terms of bandwidth, it's no higher than the cost of the request, in bandwidth, and proportionately to the size of the transaction, responses tend to be larger than requests anyway. In terms of computation, if a site does no tracking, or always observes DNT with no exceptions, the responses are trivial (can be configured into the web server's config file, typically), and otherwise, the complexity of responding is reasonably proportional to the complexity of the tracking, I think.

I disagree. There are no trivial additions to HTTP responses because of the sheer numbers involved and the impact of crossing segment boundaries. It is also extremely hard to convince people to modify working services, aside from critical security fixes. More importantly -- who are we to make this choice for *them*? It would have to be implemented by all services, not just those which implement tracking.
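For concreteness, here is a rough sketch of the kind of per-response change being discussed (Python/WSGI; the "DNT-Response" header name and its value are placeholders, since the WG has not defined a response header or its syntax), including the Vary entry that shared caches would need once responses depend on DNT:

    # Sketch only -- "DNT-Response" and "not-tracking" are placeholder
    # names; the WG has not defined a response header or its values.
    from wsgiref.simple_server import make_server

    def add_dnt_response(app):
        """Wrap a WSGI app so that every response answers a DNT request."""
        def wrapper(environ, start_response):
            def start(status, headers, exc_info=None):
                if environ.get("HTTP_DNT") == "1":
                    # Per-request response header (placeholder name/value).
                    headers.append(("DNT-Response", "not-tracking"))
                # Once the response depends on the DNT request header,
                # shared caches must partition on it.
                headers.append(("Vary", "DNT"))
                return start_response(status, headers, exc_info)
            return app(environ, start)
        return wrapper

    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"hello\n"]

    if __name__ == "__main__":
        make_server("", 8000, add_dnt_response(app)).serve_forever()

Writing this once is easy; deploying something like it for every resource on every service, and paying for the cache partitioning it implies, is the cost in question -- which is exactly the caching concern in the quoted text below.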
>> even if the servers are very careful to avoid breaking caching
>> (e.g., by including the same response to all requests with or without
>> DNT enabled). That cost will be entirely borne by the service
>> providers, so they are the ones that will have to buy-into supporting
>> it within this standard. If that cost is not sufficiently justified
>> by this WG, then I will not implement it and I will request that Adobe
>> formally object to it being part of the standard if the WG insists on
>> specifying it.
>
> Ah, so the cost you are concerned about is the per-transaction nature of the response, and hence the impact on caching?

There are many costs I am concerned about. #1 is disabling caching on otherwise cacheable responses, since that would cause wide-scale network failures if it were to be deployed over a short period of time, and a ridiculous cost in the long term. #2 is the cost of making code or configuration changes to every resource on every server that wants to be compliant with DNT. #3 is sending more bits on the wire that will just be ignored by almost everyone, all of the time, because there are fewer than a thousand people in the entire world who would actively monitor their browser interactions for DNT compliance (as opposed to passively assuming that it just works, or avoiding use of the Internet altogether).

If no response is needed, the only resources that need to be changed are those owned or controlled by organizations performing cross-site tracking, most of whom are represented here. We can take this on as a new cost of doing business, assuming everyone competing in the same market has to play by the same rules. Our standard would then have no impact on the millions of Mom&Pop shops that operate sites on the Internet and who, aside from using our campaign-based advertising, don't do their own tracking.

If a response is needed but can be limited to a well-known URI, then at least the cost has a reasonable bound, with the same cost/benefit ratio as maintaining /robots.txt. We can write tutorials and sample code to explain how to implement it, both statically and dynamically. The trade-off is that those user agents wishing to perform active monitoring or pre-flight checking of DNT compliance must make one additional request per site. I can live with that trade-off, especially since it also allows for more funky additions like signed policies, links to access and edit previously stored tracking data, links to site-managed opt-back-in mechanisms, and the more general benefits obtained from using REST.
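For illustration, that pre-flight check could be as small as the following (Python; the well-known path and the fields in the policy document are placeholders -- none of this has been specified by the WG):

    # Sketch only -- "/.well-known/tracking-status" and the JSON fields
    # are placeholder names; the WG has not specified a resource or format.
    import json
    from urllib.request import urlopen

    def site_tracking_policy(origin, timeout=5):
        """One extra request per site: fetch its declared tracking policy."""
        url = origin.rstrip("/") + "/.well-known/tracking-status"
        try:
            with urlopen(url, timeout=timeout) as resp:
                return json.load(resp)
        except (OSError, ValueError):
            return None  # no declared policy, or an unreadable one

    policy = site_tracking_policy("https://third-party.example")
    if policy is None or policy.get("tracking") != "none":
        pass  # e.g. skip the subrequest, warn the user, or log for auditing

That is the "one additional request per site" trade-off; the funkier additions (signed policies, links to stored data, opt-back-in) would just be more members of the same resource.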
....Roy

Received on Wednesday, 21 December 2011 01:55:51 UTC