Re: Proposed Resolution / Consensus for Monday's call. from Roy T. Fielding on 2017-08-25 (public-tracking@w3.org from August 2017)

From: Roy T. Fielding <fielding@gbiv.com>
Date: Fri, 25 Aug 2017 16:04:25 -0700
To: Matthias Schunter <mts-std@schunter.org>
Cc: "public-tracking@w3.org (public-tracking@w3.org)" <public-tracking@w3.org>
Message-Id: <0EBCC137-1221-450F-8B61-5D10D3297E34@gbiv.com>
> On Aug 25, 2017, at 7:30 AM, Matthias Schunter (Intel Corporation) <mts-std@schunter.org> wrote:
> 
> Dear TPWG,
> 
> 
> I had a quick chat with Mike. Our proposal is to:
> (a) rollback the editors draft to our original consensus

The only consensus we had was the last CR document.

Personally, I would be a lot more comfortable about this discussion
if Shane's use cases were actually present in the specification instead
of being assumed based on past discussions.  After all, we had a great
number of discussions, and my experience has been that "consensus"
is in the eyes of the beholder.

Shane, do you have those use cases documented?

> (b) suggest to add an implementation recommendation that helps
> mitigating the fingerprinting risk: By limiting the number of
> site-specific UGE that a domain can store, we also limit the capability
> to fingerprint.

I don't think that will work.  Each stored exception acts as one bit of
identifying information, so just eight would be enough (when combined
with other factors).
We might limit the number of confirmation calls, since a legitimate
use case should only make one or two such calls per script, but
a fingerprinting script could get around the API limitation by
making N embedded requests that simply return the received DNT value.
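
To illustrate the bit-counting argument, here is a minimal sketch in
Python. It is a model only, not browser code: the function names, the
tracker.example subdomains, and the 8-bit identifier are all hypothetical.
It assumes an embedded request sees DNT:0 when an exception is stored for
its target and DNT:1 otherwise.

```python
from __future__ import annotations

# Hypothetical model: each site-specific exception that a domain stores
# behaves like one bit that the domain can later read back.

def encode_id(user_id: int, subdomains: list[str]) -> set[str]:
    """Return the subdomains for which an exception would be stored."""
    return {d for i, d in enumerate(subdomains) if (user_id >> i) & 1}

def observed_dnt(stored: set[str], target: str) -> str:
    """DNT value the target would receive (assumed: '0' if excepted, else '1')."""
    return "0" if target in stored else "1"

def decode_id(stored: set[str], subdomains: list[str]) -> int:
    """Recover the identifier from N embedded requests that echo DNT."""
    return sum(
        1 << i
        for i, d in enumerate(subdomains)
        if observed_dnt(stored, d) == "0"
    )

# Eight hypothetical subdomains give eight bits.
SUBDOMAINS = [f"b{i}.tracker.example" for i in range(8)]
stored = encode_id(0b10110010, SUBDOMAINS)
recovered = decode_id(stored, SUBDOMAINS)
```

Note that the read-back path never touches the confirmation API, which is
why limiting confirmation calls alone does not close the hole.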

Note that the WG actually had this discussion before (with Nick, IIRC).
The only protection against fingerprinting (specifically for this
attack) that we could think of is already in the fingerprinting
section (a suggestion to restrict the number and frequency of API
calls).
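
A restriction on the number of calls could look roughly like the sketch
below. The class name, the per-origin budget, and the idea of counting per
page load are assumptions for illustration; the spec text only suggests
that user agents restrict number and frequency, and this sketch covers
number only.

```python
class CallBudget:
    """Hypothetical per-origin cap on exception API calls for one page load."""

    def __init__(self, max_calls: int):
        self.max_calls = max_calls
        self._used = {}  # script origin -> calls made so far

    def allow(self, script_origin: str) -> bool:
        """Return True and count the call if the origin is under budget."""
        used = self._used.get(script_origin, 0)
        if used >= self.max_calls:
            return False
        self._used[script_origin] = used + 1
        return True

budget = CallBudget(max_calls=2)
results = [budget.allow("https://a.example") for _ in range(3)]
other = budget.allow("https://b.example")
```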

> Below are more detailed notes.
> 
> Any comments and feedback are welcome!
> 
> Note that we are aware that anyone (including sub-resources) can store
> web-wide exceptions. I suggest to see how the adoption evolves and then
> browsers can determine whether additional checks and balances may be needed.

So, we should remove the limitation that was added two weeks ago?

> Regards,
> matthias
> 
> 
> ------------------8<---
> 
> Original (still valid) consensus:
> - 1st party and third parties
> 	- can ask for web-wide and site-specific UGE
> 	- both for the script origin only

Umm, I don't understand.  The script origin (where the script was
downloaded from) has nothing to do with it.  The "effective script origin"
is the origin presumed by the browser security model, which includes
the scheme, host, and port of the immediate document within which the
script is loaded and running. This corresponds to the "document-origin"
used within the CR spec (if we ignore scheme and port).

David is right: the CR API limits storeSiteSpecificTrackingException
to the script's document domain, not the top-level document's domain:

  "If the document-origin would not be able to set a cookie on the
  domain following the cookie domain rules [RFC6265] (e.g. domain is not
  a right-hand match or is a TLD) then the duplet MUST NOT be entered
  into the database and a SYNTAX_ERR exception SHOULD be thrown."
  https://www.w3.org/TR/tracking-dnt/#exceptions-javascript-api-rqst

whereas I incorrectly translated that to

  "For a site-specific exception, a user agent MUST NOT store the duplets
  and MUST reject the promise with a DOMException named "SecurityError" if
  the script's site domain would not be able to set a cookie on the site
  following the cookie domain rules [RFC6265],"
  https://w3c.github.io/dnt/drafts/tracking-dnt.html#exception-javascript-api-store

which is confusing: it was supposed to be "the script domain", which I
had as a defined term for the document.domain of the script's responsible
document (the current HTML5-ish translation of what we were calling
document-domain in the CR).  Alternatively, we can just say "if the
script would not be able to set a cookie on the site", since the
same-origin rules are what constrain a script from doing so.
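
As a rough sketch of the cookie-domain test being described (Python, for
illustration only): the function names are hypothetical, the suffix set is
a tiny stand-in for a real public suffix list, and the IP-address case in
the RFC 6265 domain-match rules is deliberately omitted.

```python
# Tiny stand-in for a real public suffix list (assumption for illustration).
PUBLIC_SUFFIXES = {"com", "org", "co.uk"}

def domain_match(host: str, domain: str) -> bool:
    """Simplified RFC 6265 domain-match: identical, or a right-hand match
    on a label boundary. IP-address hosts are not handled here."""
    host = host.lower()
    domain = domain.lower().lstrip(".")
    return host == domain or host.endswith("." + domain)

def may_store_site_specific(script_host: str, site_domain: str) -> bool:
    """Would a script running on script_host be able to set a cookie
    on site_domain? If not, the duplet would be rejected."""
    domain = site_domain.lower().lstrip(".")
    if domain in PUBLIC_SUFFIXES:
        return False  # a TLD or other public suffix is never allowed
    return domain_match(script_host, domain)

same_site = may_store_site_specific("news.example.com", "example.com")
third_party = may_store_site_specific("ads.tracker.example", "example.com")
tld = may_store_site_specific("example.com", "com")
lookalike = may_store_site_specific("badexample.com", "example.com")
```

The second case is the one that matters for the iframe use case: a script
in a third-party frame fails this test for the top-level site.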

When a site-specific exception is desired, the site portion of the API
defaults to the top-level browsing context, which is not the same as
the effective script origin if an iframe running the script is being
loaded from a different origin (same-party or third-party).
It was my understanding from the list discussions that this is a specific
use case that the API is designed to support.  I think that was Shane's
opinion, as well.  I even included a paragraph describing it in section 6.3.
Was that use case only supposed to work for web-wide exceptions?

In other words, the use case was that a given site would ask for
a site-specific exception for the following parties, with
each party given an iframe in which to explain their specific
privacy policies (or adherence to some standard) and some form of
script-activated checkmark in each iframe to collect the user's
informed consent for that party.

That won't work for site-specific consent with the API in the CR,
nor with the API as intended for the current draft. But what is
supposed to work?

For example, the use case of a site asking for and collecting
consent within its own browsing context, while only loading information
within third-party frames, will work with the above restrictions.
But only for that specific site (not for same-party sites).

Note that there are no such restrictions in the CR on removing
or confirming a site-specific exception, nor on storing a
web-wide exception.  Any script on any site can store a web-wide
exception that applies to any domain.
https://www.w3.org/TR/tracking-dnt/#exceptions-javascript-api-ww-rqst

Likewise, the use case of a group of same-party sites asking
for and obtaining an exception for multiple third parties upon all
of the same-party sites is very interesting, but not at all
satisfied by the drafts to date.

The way I could see that working is by proposing a new API
that retrieves the current TSR for the effective script origin
(IFF it is the same as the top-level document origin), reads the
same-party array in that TSR, retrieves the TSR from each of those
same-party origins (to verify that they do have the same controller),
and then stores [origin, target] duplets for each of those
origin x target combinations.
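
The steps above can be sketched as follows (Python, as a model rather than
a proposed API). The function name is hypothetical, the TSRs are plain
dicts held in memory instead of being fetched, and "same controller" is
assumed to mean an equal "controller" array. Only verified same-party
origins get duplets here; whether the top-level origin itself should be
included is left open.

```python
from itertools import product

def same_party_duplets(script_origin, top_origin, targets, tsrs):
    """Return the (origin, target) duplets that would be stored."""
    # Only proceed IFF the script runs in the top-level document origin.
    if script_origin != top_origin:
        raise PermissionError("script must run in the top-level origin")
    own = tsrs[top_origin]
    # Keep only same-party origins whose own TSR names the same controller.
    verified = [
        origin
        for origin in own.get("same-party", [])
        if origin in tsrs
        and tsrs[origin].get("controller") == own.get("controller")
    ]
    return set(product(verified, targets))

# Hypothetical TSR contents keyed by origin.
TSRS = {
    "a.example": {
        "controller": ["https://corp.example/"],
        "same-party": ["b.example", "c.example"],
    },
    "b.example": {"controller": ["https://corp.example/"]},
    "c.example": {"controller": ["https://other.example/"]},
}
duplets = same_party_duplets(
    "a.example", "a.example", ["ads.example", "cdn.example"], TSRS
)
```

In this example c.example is dropped because its TSR names a different
controller, so a site cannot claim an unrelated origin as same-party.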

> Current editors draft:
> - 1st party
> 	- can ask for web-wide and targeted UGE
> 	- both for the script origin only
> - third parties
> 	- can ask (only) for site-specific UGE
> 	- web-wide is not allowed
> 
> Shortcomings of the current draft:
> - site-specific UGE poses fingerprinting risk (Mike)
> - web-wide for sub-element are needed for
>  consent portal (Shane)
> 
> Proposed modifications of the editors draft:
> - Back to original consensus (to address Shane's usage)
> 	- 1st party and third parties
> 		- can ask for web-wide and site-specific UGE
> 		- both for the script origin only

I'd prefer that we clarify the use case, since the above two are
contradictory and wouldn't support Shane's case.

> - Mitigate fingerprinting risk by NOTE that suggests
>     that browsers may limit the number of stored site-specific
>     exceptions per top-level domain.

We already have that section.  We could certainly add more to it.
https://w3c.github.io/dnt/drafts/tracking-dnt.html#privacy.fingerprinting

> Assessment of proposed consensus:
> + A compliance portal (e.g. google) can now register web-wide UGE for
> same party domains (e.g. youtube).

How?  That implies we either remove the restriction on web-wide or
come up with a new API for same-party.

> + The limited number of site-specific user-granted exceptions can
> minimize fingerprinting risk

See above.

> - If web-wide user-granted exceptions are mis-used, additional checks
> and balances may be needed in the future.

Personally, I think it is more valuable to support a portal of exception
granting than it is to protect against misuse of the API (aside from
the fingerprinting attack).  The reason is that using the API just
to send DNT:0 to a target, without first obtaining legitimate and
informed consent from the user (a process we don't even control),
does nothing other than prove an intent to deceive.  It can be easily
traced by storing the effective script origin and/or document URL
along with each duplet, which is already suggested by the spec, and
doesn't provide any more benefit to the attacker than simply ignoring
DNT entirely.

Hence, my preference is to reiterate that several times in the draft,
instead of placing origin restrictions on storing exceptions, and
try to find ways to limit fingerprinting or information leaks by
limiting the remove and confirm APIs to duplets that were stored by
the same effective script origin.
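
A minimal sketch of that preference, in Python and with hypothetical
names throughout: storing is unrestricted but records the effective script
origin with each duplet, while confirm and remove only act on duplets
stored by the calling origin.

```python
class ExceptionStore:
    """Hypothetical model of a user agent's exception database."""

    def __init__(self):
        # (site, target) -> effective script origin that stored the duplet
        self._duplets = {}

    def store(self, script_origin, site, target):
        """No origin restriction; the storing origin is kept for tracing."""
        self._duplets[(site, target)] = script_origin

    def confirm(self, script_origin, site, target):
        """True only for a duplet that this same origin stored."""
        return self._duplets.get((site, target)) == script_origin

    def remove(self, script_origin, site, target):
        """Remove only a duplet that this same origin stored."""
        if self._duplets.get((site, target)) != script_origin:
            return False
        del self._duplets[(site, target)]
        return True

db = ExceptionStore()
db.store("https://portal.example", "news.example", "ads.example")
probe = db.confirm("https://tracker.example", "news.example", "ads.example")
owner = db.confirm("https://portal.example", "news.example", "ads.example")
blocked = db.remove("https://tracker.example", "news.example", "ads.example")
removed = db.remove("https://portal.example", "news.example", "ads.example")
```

Under this model a script from another origin learns nothing from confirm,
since a missing duplet and a duplet stored by someone else look the same.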

If sites ever do abuse the API, browsers can trigger an additional
confirmation dialog upon use of the API.  Painful, but possible.

Cheers,

Roy T. Fielding                     <http://roy.gbiv.com/>
Senior Principal Scientist, Adobe   <https://www.adobe.com/>
Received on Friday, 25 August 2017 23:04:50 UTC