RE: confirm and fingerprinting issues from Mike O'Neill on 2017-08-24 (public-tracking@w3.org from August 2017)

From: Mike O'Neill <michael.oneill@baycloud.com>
Date: Thu, 24 Aug 2017 20:39:27 +0100
To: "'Shane M Wiley'" <wileys@oath.com>
Cc: "'Matthias Schunter (Intel Corporation)'" <mts-std@schunter.org>, <public-tracking@w3.org>, "'Roy T. Fielding'" <fielding@gbiv.com>
Message-ID: <126a01d31d10$b2b7ab60$18270220$@baycloud.com>
Shane, 

 

I think you need the web-wide exceptions for that and they are already disallowed for iframes (unless they are for cookie rule subdomains of the top-level domain). The site-specific exception is pointless for other-origin iframes, other than to specify same-party exceptions. It just lets you set a site-specific exception for the iframe domain when it is later visited as a first party.

 

This way you do not even have to use the iframes, so it is much faster.

 

Mike

 

From: Shane M Wiley [mailto:wileys@oath.com] 
Sent: 24 August 2017 19:35
To: Mike O'Neill <michael.oneill@baycloud.com>
Cc: Matthias Schunter (Intel Corporation) <mts-std@schunter.org>; public-tracking@w3.org; Roy T. Fielding <fielding@gbiv.com>
Subject: Re: confirm and fingerprinting issues

 

Mike,

 

But wouldn't this break the industry "opt-in" page concept though (similar to the current "opt-out" iFrame model)?

 

- Shane

 

On Thu, Aug 24, 2017 at 11:05 AM, Mike O'Neill <michael.oneill@baycloud.com> wrote:

While restricting the API to top-level context stops it being used by bad
actors (to invisibly fingerprint), it also stops the use-case Shane has
identified of being able to assign consent to multiple domains. No longer
will it be possible to call the API from an iframe, so top level script will
not be able to dynamically create browsing contexts that do that.

I think the only way to fix the security weakness is to stop sub-resources
using the API, but it is very desirable to still allow the registering of
exceptions for other-origin (though same-party) domains. This will be useful
not just to larger sites.

I think both can be done as long as a check is made that the script-origin
controls the other domains. The security and privacy benefit of disallowing
subresources using the API far outweighs any threat from first-parties
getting it wrong.
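
A minimal sketch of that check, in Python purely for illustration. It assumes the user agent already holds the top-level site's same-party list as plain host names, and that an exception is only stored when the caller is the top-level context and the target is either the caller's own domain or one of those same-party hosts. The function name and parameters are hypothetical and are not taken from the draft.

```python
def may_store_exception(script_origin, top_level_origin, target, same_party):
    """Decide whether an exception for `target` may be stored.

    Hypothetical rule, assumed for illustration:
      1. only the top-level browsing context may call the API, and
      2. the target must be the caller itself or listed as same-party.
    """
    # Sub-resources (iframes from another origin) are refused outright.
    if script_origin != top_level_origin:
        return False
    # The caller may always name itself; other domains must be same-party.
    return target == top_level_origin or target in same_party


same_party = ["cdn.example.net", "shop.example.org"]
print(may_store_exception("example.com", "example.com", "cdn.example.net", same_party))
print(may_store_exception("tracker.test", "example.com", "tracker.test", same_party))
```

The first call is a first party naming one of its declared same-party domains; the second is an embedded third party trying to register an exception for itself.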

I spent today amending the API to show how this could be specified using the
same-party array:

https://w3c.github.io/dnt/drafts/samepartyawareapi.html#exceptions

See Section 6. It is in a new file to be web readable. It would be easy to
create a PR for it against the master branch.

Another possible way to check that script origins control other origins is
to use CORS (or fetch), but this adds round-trips and therefore would be
slow. The same-party way will be a lot more efficient. We could add
CORS/fetch as belt and braces if people thought it necessary.

Please take the time to consider this before Monday's call.


Mike


-----Original Message-----
From: Matthias Schunter (Intel Corporation) [mailto:mts-std@schunter.org]
Sent: 22 August 2017 11:59
To: public-tracking@w3.org
Subject: Re: confirm and fingerprinting issues

Hi Mike,


thanks for the clarification.

I believe your resolution should substantially reduce the fingerprinting
risk.

Any other concerns/objections?


Regards,
matthias



On 22.08.2017 11:31, Mike O'Neill wrote:
> Matthias, subresources are already denied making web-wide exceptions (by
> Roy's last change). My suggestion is to generalise his sentence to cover
> site-specific ones also.
>
> Mike
>
> -----Original Message-----
> From: Matthias Schunter (Intel Corporation) [mailto:mts-std@schunter.org]
> Sent: 22 August 2017 09:39
> To: public-tracking@w3.org
> Subject: Re: confirm and fingerprinting issues
>
> Hi Mike,
>
> thanks for the clarification.
>
> I now (hopefully) understand: Instead of pushing an identifier as a
> whole (9437489), you push individual bits (bit1-0, bit2-1, bit3-1, ...).
> Then querying them gets efficient; only say 32 queries (one per bit)
> needed ;-(
>
> Thus the "you can only query what you store" approach does not mitigate
> this fingerprinting risk (it is efficient to query 32 bits).
>
> Your suggested mitigation is to disallow subresources from requesting
> user-granted _site-specific_ exceptions (only the main site is allowed
> to do so). They would still be allowed to request web-wide exceptions
> (where this risk does not seem to exist).
>
> This seems to be a workable and efficient solution.
>
> Any thoughts?
>
>
> Regards,
> matthias
>
> PS: Am I right that the main site could still use site-specific UGE
> approach for fingerprinting? Anything we can mitigate for them?
>
>
>
> On 22.08.2017 10:22, Mike O'Neill wrote:
>> Hi Matthias,
>>
>> That is not quite what I meant. The fingerprinting I identified would allow
>> the subresource to assign a random number (up to 32 bits long in my
>> example), because there are 32 sub-subresources (let's call them
>> grandchildren of the first-party site):
>>
>> b0.images.schunter.org
>> b1.images.schunter.org
>> b2.images.schunter.org
>>                   .
>>                   .
>>                   .
>> b31.images.schunter.org
>>
>> Each grandchild represents one bit in the 32 bit string.
>>
>> If an exception exists for a particular grandchild, that represents a 0 at
>> that particular bit position.
>> Otherwise the value of the bit is 1.
>>
>> The value of each grandchild "bit" is communicated back to
>> images.schunter.org by each grandchild detecting its DNT header (say by
>> reading navigator.doNotTrack), then sending the 1 bit value in a message
>> using the postMessage API.
>>
>> Then images.schunter.org receives all these messages and assembles the
>> original 32 bit string from them.
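
The scheme described above can be simulated outside a browser. The following is only a sketch in Python: it models the stored exceptions as a set of host names rather than calling any real DNT API, and it assumes that b0 carries the least significant bit (the message does not say which end is which). The function names are hypothetical.

```python
# The 32 grandchild hosts named in the message, one per bit.
HOSTS = [f"b{i}.images.schunter.org" for i in range(32)]


def encode(uid):
    """Return the hosts that receive an exception for this 32-bit uid.

    Convention from the message: exception present = bit 0, absent = bit 1.
    """
    if not 0 <= uid < 2**32:
        raise ValueError("uid must fit in 32 bits")
    return {HOSTS[i] for i in range(32) if not (uid >> i) & 1}


def reported_bit(host, exceptions):
    """Bit a grandchild frame would report back (e.g. via postMessage)."""
    # A frame with an exception sees DNT:0 and therefore reports a 0 bit.
    return 0 if host in exceptions else 1


def decode(exceptions):
    """Reassemble the uid from the 32 one-bit reports."""
    uid = 0
    for i, host in enumerate(HOSTS):
        uid |= reported_bit(host, exceptions) << i
    return uid


stored = encode(9437489)
print(len(HOSTS), decode(stored) == 9437489)
```

Reading the identifier back takes exactly 32 one-bit reports, regardless of how many users have been tagged.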
>>
>> Note, this does not need the confirm call, though it could. Restricting the
>> confirm call does not fix the risk because the same information can be
>> obtained via postMessage.
>>
>> This is complicated, but it is just JavaScript. Once it is done it will be
>> easy to reproduce. It gives subresources the ability to generate UIDs even
>> when they are blocked from using cookies, e.g. on Safari. There are already
>> other more complicated methods for doing this in the wild, one of the
>> reasons for Apple's ITP in iOS 11.
>>
>>
>>
>> Mike
>>
>>
>>
>>
>>
>>
>> -----Original Message-----
>> From: Matthias Schunter (Intel Corporation) [mailto:mts-std@schunter.org]
>> Sent: 22 August 2017 07:44
>> To: Michael O'Neill <michael.oneill@btinternet.com>; public-tracking@w3.org
>> Cc: 'Roy T. Fielding' <fielding@gbiv.com>
>> Subject: Re: confirm and fingerprinting issues
>>
>> Hi Mike,
>>
>>
>> thanks a lot for the analysis of fingerprinting.
>>
>> If I understand correctly, a sub-resource (say images.schunter.org) can
>> obtain an exception for its "tracker7289437923.images.schunter.org"
>> where tracker7289437923 is unique to a user for this subdomain. Since
>> tracker7289437923 is unique, your concern is that by learning that there
>> is a UGE for tracker7289437923, the site knows what user is visiting.
>>
>> I believe that this is not a severe fingerprinting risk for the
>> following reason:
>>
>> Assume that the web-site has registered a table of UGEs
>>   TRACKERID          NAME
>>   tracker7289437923  Joe
>>   tracker728laksdjh  Jim
>>   trackerk823982089  Helen
>>   ....
>>
>> In theory, obtaining a line from this table allows fingerprinting.
>> However, our "confirm" API only allows verifying whether a single line
>> exists. I.e. I could indeed confirm whether I am talking to a given user:
>> - if confirm("tracker7289437923.images.schunter.org") is true, then I am
>> talking to Joe.
>>
>> However, using the scheme to fingerprint larger numbers of users seems
>> not really feasible: One needs to call the confirm() API once for each
>> subdomain that corresponds to each potential user:
>>   tracker7289437923
>>   tracker728laksdjh
>>   trackerk823982089
>>   ....
>>
>> Ensuring this was the reason (AFAIR) why David Singer insisted that
>> confirm must be called with the exact parameters of the store() call.
>>
>> What do you think? If we agree that there is still a larger risk, we
>> should investigate your potential resolution (which I have not checked
>> in detail yet; since I am not 100% sure I see the risk).
>>
>> Any feedback is welcome!
>>
>> matthias
>>
>>
>>
>>
>> On 21.08.2017 21:19, Michael O'Neill wrote:
>>> I think the web-wide issue is fine with Roy's sentence:
>>>
>>> For each of the targets in a web-wide exception, a user agent must not store
>>> the duplets and must reject the promise with a DOMException named
>>> "SecurityError" unless the target domain matches both the document.domain of
>>> the script's responsible document and the document.domain of the top-level
>>> browsing context's active document [HTML5]. This effectively limits the API
>>> for web-wide exceptions to the single target domain of the caller.
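
Read as an algorithm, the rule quoted above might look like the sketch below (Python, illustration only). It assumes "matches" means case-insensitive equality of host names, and it uses a plain Python exception as a stand-in for the DOMException named "SecurityError". The function name and parameters are hypothetical.

```python
class SecurityError(Exception):
    """Stand-in for the DOMException named "SecurityError"."""


def check_web_wide_targets(targets, script_domain, top_level_domain):
    """Raise SecurityError unless every target equals both domains.

    `script_domain` models document.domain of the script's responsible
    document; `top_level_domain` models document.domain of the top-level
    browsing context's active document.
    """
    script = script_domain.lower()
    top = top_level_domain.lower()
    for target in targets:
        t = target.lower()
        if t != script or t != top:
            # Nothing is stored for a rejected request.
            raise SecurityError(f"target {target!r} not allowed")
    return True


print(check_web_wide_targets(["example.com"], "example.com", "example.com"))
```

Because a target has to equal both values, a call made from an other-origin iframe can never succeed, which is the effect described in the quoted sentence.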
>>>
>>> This limits web-wide consent to the top-level browsing context, which was how
>>> it was always supposed to be.
>>>
>>> But as the text is now, a subresource browsing context (aka an iframe) can
>>> still specify a site-specific exception for itself and its own set of
>>> targets. This could be a danger because it allows a third-party subresource
>>> to invisibly create arbitrary exceptions for itself, which it can then use
>>> to fingerprint the user agent. It would do this by creating a set of
>>> subresource iframes and establishing UGEs for a random set of them.
>>>
>>> For example, subresource.com loads 32 child iframes b0.subresource.com,
>>> b1.subresource.com, ..., b31.subresource.com.
>>>
>>> When it exists as a subresource on top-level site example.com for user Alice
>>> it creates a UGE for targets bX.subresource.com, bY.subresource.com, ...,
>>> bZ.subresource.com, i.e. a random 32 bit pattern unique to Alice.
>>>
>>> When Alice later revisits example.com, DNT:0 will be sent in requests for the
>>> subset of targets specified in the UGE. These subresources can then
>>> communicate back to the parent subresource the value of DNT they have
>>> received, using the postMessage API. Thus subresource.com can recognise
>>> Alice without having to place a third-party cookie. It cannot do this for
>>> sites other than example.com, but it is still a privacy risk.
>>>
>>> We do not have a use case for a subresource-initiated site-specific UGE, so
>>> why do we need it? The easiest way to fix this is simply to adopt Roy's
>>> wording for all UGEs, not just web-wide ones.
>>>
>>> For the other issue, making the confirm call (now called
>>> Navigator.trackingExceptionExists) capable of confirming exceptions for
>>> cookie rule subdomains as Navigator.storeTrackingException does, I suggest
>>> the following derived from Roy's definition of "site" for
>>> storeTrackingException, with a lone "*" illegal:
>>>
>>> site
>>> The referring domain scope where an exception should be confirmed:
>>> If site is undefined, null, or the empty string, the referring domain scope
>>> defaults to the [site domain].
>>> Otherwise, the referring domain scope is defined by a domain found in site
>>> that is treated in the same way as the domain parameter to cookies
>>> [RFC6265], allowing subdomains to be included with the prefix "*.". The
>>> value can be set to a fully-qualified right-hand segment of the document
>>> host name, up to one level below TLD. If such a domain scope cannot be
>>> parsed then the user agent must reject the promise with the DOMException
>>> named "SecurityError".
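
One way to read the proposed "site" definition as code is sketched below (Python, illustration only). It makes two simplifying assumptions that are not in the text: the TLD is taken to be the last dot-separated label (a real user agent would more likely consult a public suffix list), and the default scope does not include subdomains. The function name, its parameters and the returned tuple are hypothetical.

```python
class SecurityError(Exception):
    """Stand-in for the DOMException named "SecurityError"."""


def resolve_scope(site, document_host, site_domain):
    """Return (domain, include_subdomains) for a confirm call.

    `site_domain` models the "[site domain]" default from the text.
    """
    if site is None or site == "":
        return site_domain, False  # assumption: default excludes subdomains

    include_subdomains = site.startswith("*.")
    domain = (site[2:] if include_subdomains else site).lower()
    labels = domain.split(".")

    # Need at least two labels (one level below the assumed TLD), no empty
    # labels, and no wildcard anywhere else. This also rejects a lone "*".
    if len(labels) < 2 or any(not label or "*" in label for label in labels):
        raise SecurityError(site)

    # The scope must be a right-hand segment of the document host name.
    host = document_host.lower()
    if host != domain and not host.endswith("." + domain):
        raise SecurityError(site)

    return domain, include_subdomains


print(resolve_scope("*.example.com", "www.example.com", "example.com"))
```

With these assumptions a call from www.example.com may name example.com or *.example.com, while "*", "*.com" and unrelated domains are rejected.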
>>>
>>> Comments?
>>>
>>> Mike
>>>
>>>
>>>
>>>
>>
>>
>>
>
>
>







 

-- 

- Shane

 

Shane Wiley

VP, Privacy

Oath: A Verizon Company
Received on Thursday, 24 August 2017 19:40:42 UTC