Re: confirm and fingerprinting issues from Matthias Schunter (Intel Corporation) on 2017-08-25 (public-tracking@w3.org from August 2017)

From: Matthias Schunter (Intel Corporation) <mts-std@schunter.org>
Date: Fri, 25 Aug 2017 11:49:49 +0200
To: public-tracking@w3.org
Message-ID: <e45db02f-0613-67d6-8f96-7cabe149724a@schunter.org>
Hi Folks,

my goal would be to satisfy the requirements of the industry in this
group ;-)

UNDERSTANDING THE PROBLEM

I would like to enhance our understanding before converging on a design.

I understand the fingerprinting risks of UGE for sub-domains
(b1.site.com, b2.site.com, ...). Mitigating the risks would be desirable.
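
To make the bit-wise risk concrete, here is a minimal sketch in Python. It
is a toy model, not browser code: the user agent's exception store is
assumed to be a plain set of host names, and the helper names encode_id and
decode_id as well as the fixed 32-bit width are my own illustration.

```python
# Toy model: the exception store is a plain set of host names.
# encode_id, decode_id and BITS are hypothetical names for illustration.
BITS = 32

def encode_id(user_id, base="site.com"):
    """Sub-domains that need a stored exception to encode user_id (bit i -> b<i>)."""
    return {f"b{i}.{base}" for i in range(BITS) if (user_id >> i) & 1}

def decode_id(stored, base="site.com"):
    """Rebuild the identifier with one existence check per bit position."""
    return sum(1 << i for i in range(BITS) if f"b{i}.{base}" in stored)

store = encode_id(9437489)     # done once, when the user is first seen
recovered = decode_id(store)   # done on a later visit: 32 checks, not 2**32
```

The point of the sketch is only that reading the identifier back costs one
check per bit, so it stays cheap even for a 32-bit identifier.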

I do not understand why this same risk is not there for the main site.

I also do not understand why site-wide exceptions should not be allowed
(I do not see their fingerprinting risk) and I agree that allowing
Google to, e.g., ask for a "youtube.com" UGE in an iframe is a desirable
use case.

ENUMERATING POTENTIAL SOLUTIONS

[1. Keep the old design] - allow UGE also for sub-contexts. This is the
default unless we reach consensus to change something.

This would allow web-wide and site-specific.

One way to mitigate the fingerprinting risk of specific sub-domains in
this scenario would be to limit the number of patterns that can be
registered (e.g. 5 patterns should constitute at most 5 bits). This can
be done as an implementation recommendation.
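
A rough sketch of such a limit, in Python. Everything here is an
assumption for illustration: the class ExceptionStore, the constant
MAX_PATTERNS and the per-site bookkeeping are hypothetical, and the
"2 to the power of n" bound only holds if each registered pattern carries
exactly one present/absent bit.

```python
# Hypothetical user-agent policy: at most MAX_PATTERNS patterns per site.
MAX_PATTERNS = 5

class ExceptionStore:
    def __init__(self, max_patterns=MAX_PATTERNS):
        self.max_patterns = max_patterns
        self.by_site = {}  # site -> set of registered patterns

    def store(self, site, pattern):
        """Register a pattern; return False once the per-site limit is reached."""
        patterns = self.by_site.setdefault(site, set())
        if pattern not in patterns and len(patterns) >= self.max_patterns:
            return False
        patterns.add(pattern)
        return True

def max_distinct_ids(max_patterns):
    """Upper bound if every pattern carries exactly one present/absent bit."""
    return 2 ** max_patterns

ua = ExceptionStore()
results = [ua.store("site.com", f"b{i}.site.com") for i in range(6)]
```

With a limit of 5 the sixth registration is refused, and under the
one-bit-per-pattern assumption a site could tell apart at most 32 users.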

[2. Disallow UGE for sub-contexts] This would IMHO break the Google
use case above. It would mitigate the fingerprinting risk for sub-contexts,
but would still allow sites to use this bit-wise mechanism for fingerprinting.

[3. Only allow web-wide UGE for sub-contexts] This would mitigate
fingerprinting for site-specific UGE. Since I did not understand why we
wanted to disallow web-wide exceptions, I do not see a downside.
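
As a compact way to compare the three options, they can be written down as
a small decision function. This is only my reading of the list above, not
spec text: the names Context, Scope and uge_allowed are hypothetical, and I
assume that none of the options restricts the top-level context.

```python
from enum import Enum

class Context(Enum):
    TOP_LEVEL = "top-level"
    SUB_CONTEXT = "sub-context"   # e.g. an iframe

class Scope(Enum):
    SITE_SPECIFIC = "site-specific"
    WEB_WIDE = "web-wide"

def uge_allowed(option, context, scope):
    """Would a UGE request be accepted under option 1, 2 or 3?"""
    if context is Context.TOP_LEVEL:
        return True                      # assumption: top level is never restricted
    if option == 1:
        return True                      # old design: sub-contexts may ask for both
    if option == 2:
        return False                     # sub-contexts may not ask at all
    if option == 3:
        return scope is Scope.WEB_WIDE   # sub-contexts: web-wide only
    raise ValueError(f"unknown option: {option}")
```

For example, the "youtube.com" iframe case maps to a sub-context asking for
an exception: it is accepted under options 1 and 3 (web-wide only for 3)
and rejected under option 2.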


DISCUSSION
- I would appreciate it if we continued populating the "understanding" and
"potential solution" perspectives before picking a given solution.

By default (unless we reach a new consensus), we will stick to the old
consensus and will not change the spec fundamentally. If we see a risk,
I would add a note and see what feedback we receive.


Any suggestions/input/feedback/solutions?


Regards,
matthias




On 25.08.2017 10:03, Mike O'Neill wrote:
> Hi Shane,
> 
>  
> 
> Only some industry.
> 
>  
> 
> The trend is for subresources not to be able to set cookies, e.g. in
> Safari, especially with the new ITB, which effectively creates separate
> silos for subresource cookies, i.e. they are not web-wide. If we allow
> web-wide consent from iframes with no check I doubt most browsers will
> implement it.
> 
>  
> 
> It also fails the “specific” test for lawful consent under EU DP/privacy
> law.
> 
>  
> 
>  
> 
> Mike
> 
> *From:*Shane M Wiley [mailto:wileys@oath.com]
> *Sent:* 25 August 2017 04:45
> *To:* Mike O'Neill <michael.oneill@baycloud.com>
> *Cc:* Matthias Schunter (Intel Corporation) <mts-std@schunter.org>;
> public-tracking@w3.org; Roy T. Fielding <fielding@gbiv.com>
> *Subject:* Re: confirm and fingerprinting issues
> 
>  
> 
> Working Group,
> 
>  
> 
> This will likely be a non-starter for industry then.  We specifically
> reviewed this concept when the UGE API was first written and everyone
> was supportive at that time ("NAI Opt-Out Model in reverse").  I'm
> struggling to understand why a domain can set a cookie in an iFrame but
> we're going to restrict setting an exception...?
> 
>  
> 
> - Shane
> 
>  
> 
> On Thu, Aug 24, 2017 at 2:18 PM, Mike O'Neill
> <michael.oneill@baycloud.com> wrote:
> 
>     Shane.
> 
>      
> 
>     Nope, web-wide was specifically ruled out, see:
>      https://w3c.github.io/dnt/drafts/tracking-dnt.html#exception-javascript-api-store
>     6th para before end of section
> 
>      
> 
>     “This effectively limits the API for web-wide exceptions to the
>     single target domain of the caller”
> 
>      
> 
>     What we said Monday was that site-specific in an iframe was
>     possible, but probably pointless.
> 
>      
> 
>     Mike
> 
>      
> 
>      
> 
>     *From:*Shane M Wiley [mailto:wileys@oath.com]
>     *Sent:* 24 August 2017 20:49
>     *To:* Mike O'Neill <michael.oneill@baycloud.com>
>     *Cc:* Matthias Schunter (Intel Corporation) <mts-std@schunter.org>;
>     public-tracking@w3.org; Roy T. Fielding <fielding@gbiv.com>
>     *Subject:* Re: confirm and fingerprinting issues
> 
>      
> 
>     Mike,
> 
> 
>     Agreed on web-wide for those scenarios but I thought we confirmed on
>     Monday that those would work in an iFramed approach?
> 
>      
> 
>     - Shane
> 
>      
> 
>     On Thu, Aug 24, 2017 at 12:39 PM, Mike O'Neill
>     <michael.oneill@baycloud.com> wrote:
> 
>         Shane,
> 
>          
> 
>         I think you need the web-wide exceptions for that and they are
>         already disallowed for iframes (unless they are for cookie rule
>         subdomains of the top-level domain). The site-specific exception
>         is pointless for other-origin iframes, other than to specify
>         same-party exceptions. It just lets you set a site-specific
>         exception for the iframe domain when it is later visited as a
>         first party.
> 
>          
> 
>         This way means you do not even have to use the iframes, it is
>         much faster.
> 
>          
> 
>         Mike
> 
>          
> 
>         *From:*Shane M Wiley [mailto:wileys@oath.com]
>         *Sent:* 24 August 2017 19:35
>         *To:* Mike O'Neill <michael.oneill@baycloud.com>
>         *Cc:* Matthias Schunter (Intel Corporation) <mts-std@schunter.org>;
>         public-tracking@w3.org; Roy T. Fielding <fielding@gbiv.com>
>         *Subject:* Re: confirm and fingerprinting issues
> 
>          
> 
>         Mike,
> 
>          
> 
>         But wouldn't this break the industry "opt-in" page concept
>         though (similar to the current "opt-out" iFrame model)?
> 
>          
> 
>         - Shane
> 
>          
> 
>         On Thu, Aug 24, 2017 at 11:05 AM, Mike O'Neill
>         <michael.oneill@baycloud.com> wrote:
> 
>             While restricting the API to top-level context stops it being
>             used by bad actors (to invisibly fingerprint), it also stops the
>             use-case Shane has identified of being able to assign consent to
>             multiple domains. No longer will it be possible to call the API
>             from an iframe, so top level script will not be able to
>             dynamically create browsing contexts that do that.
> 
>             I think the only way to fix the security weakness is to stop
>             sub-resources using the API, but it is very desirable to still
>             allow the registering of exceptions for other-origin (though
>             same-party) domains. This will be useful not just to larger sites.
> 
>             I think both can be done as long as a check is made that the
>             script-origin controls the other domains. The security and
>             privacy benefit of disallowing subresources using the API far
>             outweighs any threat from first-parties getting it wrong.
> 
>             I spent today amending the API to show how this could be
>             specified using the same-party array:
> 
>             https://w3c.github.io/dnt/drafts/samepartyawareapi.html#exceptions
> 
>             See Section 6. It is in a new file to be web readable. It would
>             be easy to create a PR for it against the master branch.
> 
>             Another possible way to check that script origins control other
>             origins is to use CORS (or fetch), but this adds round-trips and
>             therefore would be slow. The same-party way will be a lot more
>             efficient.  We could add CORS/fetch as belt and braces if people
>             thought it necessary.
> 
>             Please take the time to consider this before Monday's call.
> 
> 
>             Mike
> 
> 
>             -----Original Message-----
>             From: Matthias Schunter (Intel Corporation) [mailto:mts-std@schunter.org]
>             Sent: 22 August 2017 11:59
>             To: public-tracking@w3.org
>             Subject: Re: confirm and fingerprinting issues
> 
>             Hi Mike,
> 
> 
>             thanks for the clarification.
> 
>             I believe your resolution should substantially reduce the
>             fingerprinting risk.
> 
>             Any other concerns/objections?
> 
> 
>             Regards,
>             matthias
> 
> 
> 
>             On 22.08.2017 11:31, Mike O'Neill wrote:
>             > Matthias, subresources are already denied making web-wide
>             > exceptions (by Roy's last change). My suggestion is to
>             > generalise his sentence to cover site-specific also.
>             >
>             > Mike
>             >
>             > -----Original Message-----
>             > From: Matthias Schunter (Intel Corporation) [mailto:mts-std@schunter.org]
>             > Sent: 22 August 2017 09:39
>             > To: public-tracking@w3.org
>             > Subject: Re: confirm and fingerprinting issues
>             >
>             > Hi Mike,
>             >
>             > thanks for the clarification.
>             >
>             > I now (hopefully) understand: Instead of pushing an identifier
>             > as a whole (9437489), you push individual bits (bit1-0, bit2-1,
>             > bit3-1, ...). Then querying them gets efficient; only say 32
>             > queries (one per bit) needed ;-(
>             >
>             > Thus the "you can only query what you store" approach does not
>             > mitigate this fingerprinting risk (it is efficient to query 32
>             > bits).
>             >
>             > Your suggested mitigation is to disallow subresources from
>             > requesting user-granted _site-specific_ exceptions (only the
>             > main site is allowed to do so). They would still be allowed to
>             > request web-wide exceptions (where this risk does not seem to
>             > exist).
>             >
>             > This seems to be a workable and efficient solution.
>             >
>             > Any thoughts?
>             >
>             >
>             > Regards,
>             > matthias
>             >
>             > PS: Am I right that the main site could still use site-specific
>             > UGE approach for fingerprinting? Anything we can mitigate for
>             > them?
>             >
>             >
>             >
>             > On 22.08.2017 10:22, Mike O'Neill wrote:
>             >> Hi Matthias,
>             >>
>             >> That is not quite what I meant. The fingerprinting I identified
>             >> would allow the subresource to assign a random number (up to 32
>             >> bits long in my example), because there are 32 sub-subresources
>             >> (let's call them grandchildren of the first-party site):
>             >>
>             >> b0.images.schunter.org
>             >> b1.images.schunter.org
>             >> b2.images.schunter.org
>             >>                   .
>             >>                   .
>             >>                   .
>             >> b31.images.schunter.org
>             >>
>             >> Each grandchild represents one bit in the 32 bit string.
>             >>
>             >> If an exception exists for a particular grandchild, that
>             >> represents a 0 at that particular bit position.
>             >> Otherwise the value of the bit is 1.
>             >>
>             >> The value of each grandchild "bit" is communicated back to
>             >> images.schunter.org by each grandchild detecting its DNT header
>             >> (say by reading navigator.doNotTrack), then sending the 1 bit
>             >> value in a message using the postMessage API.
>             >>
>             >> Then images.schunter.org receives all these messages and
>             >> assembles the original 32 bit string from them.
>             >>
>             >> Note, this does not need the confirm call, though it could.
>             >> Restricting the confirm call does not fix the risk because the
>             >> same information can be obtained via postMessage.
>             >>
>             >> This is complicated, but it is just JavaScript. Once it is done
>             >> it will be easy to reproduce. It gives subresources the ability
>             >> to generate UIDs even when they are blocked from using cookies,
>             >> e.g. on Safari. There are already other more complicated methods
>             >> for doing this in the wild, one of the reasons for Apple's ITB
>             >> in OS11.
>             >>
>             >>
>             >>
>             >> Mike
>             >>
>             >>
>             >>
>             >>
>             >>
>             >>
>             >> -----Original Message-----
>             >> From: Matthias Schunter (Intel Corporation) [mailto:mts-std@schunter.org]
>             >> Sent: 22 August 2017 07:44
>             >> To: Michael O'Neill <michael.oneill@btinternet.com>; public-tracking@w3.org
>             >> Cc: 'Roy T. Fielding' <fielding@gbiv.com>
>             >> Subject: Re: confirm and fingerprinting issues
>             >>
>             >> Hi Mike,
>             >>
>             >>
>             >> thanks a lot for the analysis of fingerprinting.
>             >>
>             >> If I understand correctly, a sub-resource (say
>             >> images.schunter.org) can obtain an exception for its
>             >> "tracker7289437923.images.schunter.org" where tracker7289437923
>             >> is unique to a user for this subdomain. Since tracker7289437923
>             >> is unique, your concern is that by learning that there is a UGE
>             >> for tracker7289437923, the site knows what user is visiting.
>             >>
>             >> I believe that this is not a severe fingerprinting risk for the
>             >> following reason:
>             >>
>             >> Assume that the web-site has registered a table of UGEs
>             >>   TRACKERID          NAME
>             >>   tracker7289437923  Joe
>             >>   tracker728laksdjh  Jim
>             >>   trackerk823982089  Helen
>             >>   ....
>             >>
>             >> In theory, obtaining a line from this table allows
>             >> fingerprinting. However, our "confirm" API only allows one to
>             >> verify whether a single line exists. I.e. I could indeed confirm
>             >> whether I am talking to a given user:
>             >> - if confirm("tracker7289437923.images.schunter.org") is true,
>             >> then I am talking to Joe.
>             >>
>             >> However, using the scheme to fingerprint larger numbers of users
>             >> seems not really feasible: One needs to call the confirm() API
>             >> once for each subdomain that corresponds to each potential user:
>             >>   tracker7289437923
>             >>   tracker728laksdjh
>             >>   trackerk823982089
>             >>   ....
>             >>
>             >> Ensuring this was the rationale (AFAIR) why David Singer
>             >> insisted that confirm must be called with the exact parameters
>             >> of the store() call.
>             >>
>             >> What do you think? If we agree that there is still a larger
>             >> risk, we should investigate your potential resolution (which I
>             >> have not checked in detail yet, since I am not 100% sure I see
>             >> the risk).
>             >>
>             >> Any feedback is welcome!
>             >>
>             >> matthias
>             >>
>             >>
>             >>
>             >>
>             >> On 21.08.2017 21:19, Michael O'Neill wrote:
>             >>> I think the web-wide issue is fine with Roy's sentence:
>             >>>
>             >>> For each of the targets in a web-wide exception, a user agent
>             >>> must not store the duplets and must reject the promise with a
>             >>> DOMException named "SecurityError" unless the target domain
>             >>> matches both the document.domain of the script's responsible
>             >>> document and the document.domain of the top-level browsing
>             >>> context's active document [HTML5]. This effectively limits the
>             >>> API for web-wide exceptions to the single target domain of the
>             >>> caller.
>             >>>
>             >>> This limits web-wide consent to the top-level browsing context
>             >>> which was how it always was supposed to be.
>             >>>
>             >>> But as the text is now, a subresource browsing context (aka an
>             >>> iframe) can still specify a site-specific exception for itself
>             >>> and its own set of targets. This could be a danger because it
>             >>> allows a third-party subresource to invisibly create arbitrary
>             >>> exceptions for itself, which it can then use to fingerprint the
>             >>> user agent. It would do this by creating a set of subresource
>             >>> iframes and establishing UGEs for a random set of them.
>             >>>
>             >>> For example, subresource.com loads 32 child iframes
>             >>> b0.subresource.com, b1.subresource.com, ...,
>             >>> b31.subresource.com.
>             >>>
>             >>> When it exists as a subresource on top-level site example.com
>             >>> for user Alice it creates a UGE for targets bX.subresource.com,
>             >>> bY.subresource.com, ..., bZ.subresource.com, i.e. a random 32
>             >>> bit pattern unique to Alice.
>             >>>
>             >>> When Alice later revisits example.com, DNT:0 will be sent in
>             >>> requests for the subset of targets specified in the UGE. These
>             >>> subresources can then communicate back to the parent subresource
>             >>> the value of DNT they have received, using the postMessage API.
>             >>> Thus subresource.com can recognise Alice without having to place
>             >>> a third-party cookie. It cannot do this for sites other than
>             >>> example.com, but it is still a privacy risk.
>             >>>
>             >>> We do not have a use case for a subresource initiated
>             >>> site-specific UGE, so why do we need it? The easiest way to fix
>             >>> this is simply to adopt Roy's wording for all UGEs, not just
>             >>> web-wide ones.
>             >>>
>             >>> For the other issue, making the confirm call (now called
>             >>> Navigator.trackingExceptionExists) capable of confirming
>             >>> exceptions for cookie rule subdomains as
>             >>> Navigator.storeTrackingException does, I suggest the following
>             >>> derived from Roy's definition of "site" for
>             >>> storeTrackingException, with a lone "*" illegal:
>             >>>
>             >>> site
>             >>> The referring domain scope where an exception should be
>             >>> confirmed:
>             >>> If site is undefined, null, or the empty string, the referring
>             >>> domain scope defaults to the [site domain].
>             >>> Otherwise, the referring domain scope is defined by a domain
>             >>> found in site that is treated in the same way as the domain
>             >>> parameter to cookies [RFC6265], allowing subdomains to be
>             >>> included with the prefix "*.". The value can be set to a
>             >>> fully-qualified right-hand segment of the document host name, up
>             >>> to one level below TLD. If such a domain scope cannot be parsed
>             >>> then the user agent must reject the promise with the
>             >>> DOMException named "SecurityError".
>             >>>
>             >>> Comments?
>             >>>
>             >>> Mike
>             >>>
>             >>>
>             >>>
>             >>>
>             >>
>             >>
>             >>
>             >
>             >
>             >
> 
> 
> 
>          
> 
>         -- 
> 
>         - Shane
> 
>          
> 
>         Shane Wiley
> 
>         VP, Privacy
> 
>         Oath: A Verizon Company
> 
> 
> 
>
Received on Friday, 25 August 2017 09:50:17 UTC