Re: Signals for internal / external usage of site elements (the signals formerly called "1" and "3") from Roy T. Fielding on 2014-01-08 (public-tracking@w3.org from January 2014)

From: Roy T. Fielding <fielding@gbiv.com>
Date: Wed, 8 Jan 2014 00:14:26 -0800
To: Tracking Protection Working Group <public-tracking@w3.org>
Cc: David Singer <singer@apple.com>, Mike O'Neill <michael.oneill@baycloud.com>, "Matthias Schunter (Intel Corporation)" <mts-std@schunter.org>, Brooks Dobbs <Brooks.Dobbs@kbmg.com>
Message-Id: <FA62033A-8452-459B-8CFC-C9871B6DAE8E@gbiv.com>

On Jan 7, 2014, at 9:55 PM, Mike O'Neill wrote:

> Anyway, its removal in the first place was a chairs/editors decision, never requested by the rest of us.

First, the editors' draft doesn't have it because I don't know of any way to express a first/third party design distinction without some notion of the constraints on first vs. third parties (i.e., something that belongs 100% to compliance).

Second, there have been dozens of requests to avoid importing definitions into TPE that are not necessary to express the protocol.  Clearly, these definitions are right on the borderline -- the distinctions that this group has made regarding first and third parties are not discernible by either side of the connection, so it is extremely unlikely that any technical implementations of those ownership concepts will be correct. They are regulatory concerns, not technical concerns.

Hence, if at all possible, my preference is to avoid using these terms within the protocol exchange.  This does not prevent them from being used to guide or explain compliance.

> My problem was solely with overloading the definition of tracking with the ambiguous multiple contexts limitation, which I still feel is inappropriate in the TPE.

That is both incorrect and irrelevant to this discussion.  One of the nice
things about closed issues is that we aren't allowed to discuss them again
without new information.

> Mike
> 
>> -----Original Message-----
>> From: David Singer [mailto:singer@apple.com]
>> Sent: 08 January 2014 00:10
>> To: Matthias Schunter (Intel Corporation)
>> Cc: Mike O'Neill; Tracking Protection Working Group; Dobbs, Brooks
>> Subject: Re: Signals for internal / external usage of site elements (the signals formerly called "1" and "3")
>> 
>> I think that the browser being able to tell in a uniform way whether a site was
>> designed as first party or third party, without making any statements about what
>> first party or third party rules it conforms to, is useful.

I will try once again to explain the problem space.

First, we aren't talking about sites.  A browser might care to know whether a given subrequest is to the same origin, same parent domain, or to a resource that is controlled by the same owner as the page that they intended to visit.  Unfortunately, the browser doesn't know what page the user intended to visit, let alone who owns that page.  The initial request target is not necessarily a first party; the first party depends on how the potential action (i.e., link) was presented to the user, not the URI referenced in that action.  This is something that only a user (or regulator acting as a user) can determine.

The fact that the browser has been told to fetch a number of pages and eventually render content within a frame on one of those pages and, as a result, make more subrequests as instructed by an assortment of page rendering choices, driven from some combination of local configuration, standard HTML, CSS, and non-standard scripting, does not in any way inform the browser which of those requests correspond to the intent of the user. In fact, it is quite possible that none of them do (e.g., phishing).

A browser, therefore, has no idea whether a given page ought to be first party, let alone the elements within that page.

Second, if a browser happens to think it is on a first party page and makes a subrequest to a resource that claims to be first party, why would the user think there might be something amiss?  Because of the domain names?  Browsers don't know what parties own/control what domains.  Browsers can't see common branding.  Browsers can't make any of the decisions that would somehow distinguish one resource owner from another even when they are operating on different domains, and there's no guarantee that two different parties can't occupy the same domain (in fact, they often do).
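To make that limit concrete, here is a minimal sketch of what a user agent can and cannot derive from two URIs alone. The function names and example URIs are hypothetical, and the parent-domain check is deliberately naive (a real one would need something like the Public Suffix List). The point is only that origin and domain comparisons are computable, while ownership is not.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}

def origin(uri):
    """Return the (scheme, host, port) triple for a URI."""
    parts = urlsplit(uri)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)

def same_origin(a, b):
    """True when both URIs share scheme, host, and port."""
    return origin(a) == origin(b)

def same_parent_domain_naive(a, b):
    """Compare the last two DNS labels only.

    Naive on purpose: this is wrong for suffixes such as co.uk.
    """
    def tail(uri):
        return (urlsplit(uri).hostname or "").split(".")[-2:]
    return tail(a) == tail(b)

def same_owner(a, b):
    """Nothing in the URIs reveals who controls a resource, so no answer."""
    return None

# Hypothetical page and subrequest targets.
page = "https://www.example.com/story"
script = "https://static.example.com/app.js"
widget = "https://cdn.example.net/like.js"

for target in (script, widget):
    print(target,
          same_origin(page, target),
          same_parent_domain_naive(page, target),
          same_owner(page, target))
```

Even where the domain check says "same", that is a statement about names, not about which party controls the resource or which party the user meant to visit.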

Finally, if by some miracle the browser does manage to distinguish one resource from another as being owned by different parties, and somehow manages to know which of those parties the user actually intended to interact with, and also that this is not a case of joint first parties, then what is it going to do?

Is it going to

  a) make the subrequest with DNT:1 and hope it all works out?
  b) fetch the TSR for this designated resource and inspect its
     info before doing (a)?
  c) check the resource's reputation with some listing service?

Those are the three choices.  Here are the implications:

  (a) it has already decided that informing the user before they are
      tracked is not desired; the 1/3 flag will not be received.

  (b) the TSR will either not exist (no compliance) or contain sufficient
      detail for the browser to make a decision based solely on whether
      it trusts the identified controller -- whether or not the resource
      is designed for first or third party use is irrelevant if the controller
      is trusted (to somehow comply) with DNT:1, and even less relevant if
      the controller isn't trusted.

  (c) the listing service will make a decision for the browser, regardless
      of the first or third party status.

>> Even if a conformance regime is finally conceived that doesn’t make or need the
>> distinction, it’s not harmful for the TPE to enable signalling it for sites using that
>> regime; it’s just not relevant.

It is always harmful to send bytes that aren't used.  In particular, the 1/3 distinction is the main source of variance in the TSR response, which means it affects both the likelihood of simple static implementation and the cacheability of the TSR verification result.  If there is no 1/3 flag, then the vast majority of servers (both first and third party) can use a single TSR for their entire service.
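A rough sketch of the variance argument, under assumptions: the TSR is modeled as a small JSON object, `controller` is used as I recall it from the TPE drafts, and the `party` key is a hypothetical stand-in for the 1/3 value rather than anything from the spec. It simply counts how many distinct TSR bodies a service would have to publish with and without a per-resource flag.

```python
import json

def tsr_documents(resources, controller, with_party_flag):
    """Return the set of distinct TSR bodies needed for a set of resources."""
    bodies = set()
    for path, party in resources.items():
        doc = {"controller": [controller]}
        if with_party_flag:
            # Hypothetical per-resource 1/3 value; not a field from the spec.
            doc["party"] = party
        bodies.add(json.dumps(doc, sort_keys=True))
    return bodies

# Hypothetical service: each path tagged with how it was designed to be used.
resources = {
    "/": "1",
    "/about": "1",
    "/widget.js": "3",
    "/pixel.gif": "3",
}

with_flag = tsr_documents(resources, "https://example.com/", True)
without_flag = tsr_documents(resources, "https://example.com/", False)
print(len(with_flag), len(without_flag))
```

With the flag, the body depends on which resource was asked about, so the operator has to classify every path and a client cannot reuse one cached answer for the whole service. Without it, one static document covers everything.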

>> As you say, finding a resource that presumed it was in a 1st party context, being
>> used in a 3rd party context, should be a warning flag to the browser that maybe
>> the site is not following the rules for the context it has (probably unwittingly)
>> been placed in.  In the current compliance document, we place much lighter
>> constraints on 1st party than on 3rd, so there may well be a concern for the user
>> here.
>> 
>> These flags don’t link to a specific conformance regime.  I don’t see the problem
>> that Mike sees, honestly, and they really help the user not to be 'flying blind'.

There has been no interest shown by browsers in presenting information to their users, regardless of what the response might be. A user is going to be 'flying blind', no matter how we specify the response, unless they use special tools or extensions for discovery.  What the response does provide is a statement of business practice that can be inspected by such concerned users, advocates, and regulators independent of any specific request.

In terms of verification features, it would be far more useful to active users for a tool to obtain the TSR for all page components and color code (or graph) them by controller/owner identification.  The user can then find the identity that they intended to interact with, see all the other parties that are not the same, and make their own decisions about which ones are intended and which are not.
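As a minimal sketch of such a tool's core step, assume the TSR for each page component has already been fetched and parsed as JSON, and that it names its controller(s) in a `controller` array (the field name as I recall it from the TPE drafts; treat it as an assumption). The helper name, the placeholder labels, and the sample data are all hypothetical.

```python
from collections import defaultdict

def group_by_controller(tsr_by_resource):
    """Map each stated controller to the component URLs that name it."""
    groups = defaultdict(list)
    for url, tsr in tsr_by_resource.items():
        if tsr is None:
            # No TSR could be obtained for this component.
            groups["(no TSR)"].append(url)
            continue
        controllers = tsr.get("controller") or ["(controller not stated)"]
        for controller in controllers:
            groups[controller].append(url)
    return dict(groups)

# Hypothetical page components and their already-parsed TSRs.
components = {
    "https://www.example.com/": {"controller": ["https://example.com/"]},
    "https://static.example.com/app.js": {"controller": ["https://example.com/"]},
    "https://ads.example.net/tag.js": {"controller": ["https://example.net/"]},
    "https://cdn.example.org/lib.js": None,
}

for controller, urls in group_by_controller(components).items():
    print(controller)
    for url in urls:
        print("   ", url)
```

The grouping makes no first/third party judgment at all. It only lays out who claims to control what, and leaves it to the user to pick out the identity they intended to interact with.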

In contrast, the cost and risk of reporting how each specific resource has been designed to operate are very real concerns for site operators.  It is fairly easy for a data controller to say that they will turn off all tracking (i.e., discard any data about user activity in contexts other than their own first party contexts) for requests with DNT:1.  It is much harder for them to consistently identify and categorize each and every resource on an origin server.

Hence, I don't think the merits of a tracking status value for 1/3 come anywhere near to justifying its cost, both in terms of getting consensus on TPE and in getting implementations of the protocol in practice.  If there is ever a need for that information as a means of explaining compliance, then it can be included in a qualifier along with all of the other explanations of compliance.


Cheers,

Roy T. Fielding                     <http://roy.gbiv.com/>
Senior Principal Scientist, Adobe   <https://www.adobe.com/>
Received on Wednesday, 8 January 2014 08:14:55 UTC