Re: confirm and fingerprinting issues

From: Matthias Schunter (Intel Corporation) <mts-std@schunter.org>
Date: Tue, 22 Aug 2017 12:59:29 +0200
To: public-tracking@w3.org
Message-ID: <a50adadb-6343-f9a4-43e9-541357313344@schunter.org>
Hi Mike,

thanks for the clarification.

I believe your resolution should substantially reduce the fingerprinting risk.

Any other concerns/objections?


On 22.08.2017 11:31, Mike O'Neill wrote:
> Matthias, subresources are already denied making web-wide exceptions (by
> Roy's last change). My suggestion is to generalise his sentence to cover
> site-specific ones also.
> Mike
> -----Original Message-----
> From: Matthias Schunter (Intel Corporation) [mailto:mts-std@schunter.org] 
> Sent: 22 August 2017 09:39
> To: public-tracking@w3.org
> Subject: Re: confirm and fingerprinting issues
> Hi Mike,
> thanks for the clarification.
> I now (hopefully) understand: Instead of pushing an identifier as a
> whole (9437489), you push individual bits (bit1-0, bit2-1, bit3-1, ...).
> Then querying them gets efficient; only say 32 queries (one per bit)
> needed ;-(
> Thus the "you can only query what you store" approach does not mitigate
> this fingerprinting risk (it is efficient to query 32 bits).
> Your suggested mitigation is to disallow subresources from requesting
> user-granted _site-specific_ exceptions (only the main site is allowed
> to do so). They would still be allowed to request web-wide exceptions
> (where this risk does not seem to exist).
> This seems to be a workable and efficient solution.
> Any thoughts?
> Regards,
> matthias
> PS: Am I right that the main site could still use the site-specific UGE
> approach for fingerprinting? Anything we can mitigate for them?
> On 22.08.2017 10:22, Mike O'Neill wrote:
>> Hi Matthias,
>> That is not quite what I meant. The fingerprinting I identified would allow
>> the subresource to assign a random number (up to 32 bits long in my
>> example), because there are 32 sub-subresources (let's call them
>> grandchildren of the first-party site):
>> b0.images.schunter.org
>> b1.images.schunter.org
>> b2.images.schunter.org
>>                   .
>>                   .
>>                   .
>> b31.images.schunter.org
>> Each grandchild represents one bit in the 32-bit string.
>> If an exception exists for a particular grandchild, that represents a 0 at
>> that particular bit position.
>> Otherwise the value of the bit is 1.
>> The value of each grandchild "bit" is communicated back to
>> images.schunter.org by each grandchild detecting its DNT header (say by
>> reading navigator.doNotTrack), then sending the 1 bit value in a message
>> using the postMessage API.
>> Then images.schunter.org receives all these messages and assembles the
>> original 32 bit string from them.
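The bit encoding described above can be sketched as a small simulation. This is only a model under stated assumptions, not the real attack code: the user agent is reduced to a set of hosts with stored exceptions, the user is assumed to have DNT enabled (so DNT:1 is sent wherever no exception exists), the postMessage step is replaced by a plain function call, and the function names and the choice of b0 as the least significant bit are hypothetical.

```python
import secrets

BITS = 32
PARENT = "images.schunter.org"  # parent subresource from the example above


def grandchild(i):
    """Host name of the grandchild that carries bit i (hypothetical helper)."""
    return f"b{i}.{PARENT}"


def hosts_to_except(uid):
    """Hosts that get an exception: an existing exception encodes a 0 bit."""
    return {grandchild(i) for i in range(BITS) if not (uid >> i) & 1}


def dnt_seen_by(host, exceptions):
    """Toy user agent: DNT:0 where an exception exists, DNT:1 otherwise."""
    return "0" if host in exceptions else "1"


def reassemble(exceptions):
    """What the parent rebuilds from the 32 one-bit reports."""
    uid = 0
    for i in range(BITS):
        bit = int(dnt_seen_by(grandchild(i), exceptions))
        uid |= bit << i
    return uid


# First visit: pick a random identifier and store the matching exceptions.
uid = secrets.randbits(BITS)
stored = hosts_to_except(uid)

# Later visit: the identifier is recovered from the DNT values alone.
assert reassemble(stored) == uid
```

Note that the sketch never calls a confirm-style API: the identifier is read back purely from which grandchildren see DNT:0.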
>> Note, this does not need the confirm call, though it could. Restricting the
>> confirm call does not fix the risk because the same information can be
>> obtained via postMessage.
>> This is complicated, but it is just JavaScript. Once it is done it will be
>> easy to reproduce. It gives subresources the ability to generate UIDs even
>> when they are blocked from using cookies, e.g. on Safari. There are already
>> other more complicated methods for doing this in the wild, one of the
>> reasons for Apple's ITP in iOS 11.
>> Mike
>> -----Original Message-----
>> From: Matthias Schunter (Intel Corporation) [mailto:mts-std@schunter.org] 
>> Sent: 22 August 2017 07:44
>> To: Michael O'Neill <michael.oneill@btinternet.com>; public-tracking@w3.org
>> Cc: 'Roy T. Fielding' <fielding@gbiv.com>
>> Subject: Re: confirm and fingerprinting issues
>> Hi Mike,
>> thanks a lot for the analysis of fingerprinting.
>> If I understand correctly, a sub-resource (say images.schunter.org) can
>> obtain an exception for its "tracker7289437923.images.schunter.org"
>> where tracker7289437923 is unique to a user for this subdomain. Since
>> tracker7289437923 is unique, your concern is that by learning that there
>> is a UGE for tracker7289437923, the site knows what user is visiting.
>> I believe that this is not a severe fingerprinting risk for the
>> following reason:
>> Assume that the web-site has registered a table of UGEs
>>   tracker7289437923 	Joe
>>   tracker728laksdjh	Jim
>>   trackerk823982089	Helen
>>   ....
>> In theory, obtaining a line from this table allows fingerprinting.
>> However, our "confirm" API only allows verifying whether a single line
>> exists. I.e. I could indeed confirm whether I am talking to a given user:
>> - if confirm("tracker7289437923.images.schunter.org") is true, then I am
>> talking to Joe.
>> However, using the scheme to fingerprint larger numbers of users seems
>> not really feasible: One needs to call the confirm() API once for each
>> subdomain that corresponds to each potential user:
>>   tracker7289437923 	
>>   tracker728laksdjh	
>>   trackerk823982089	
>>   ....
>> Ensuring this was the reason (AFAIR) why David Singer insisted that
>> confirm must be called with the exact parameters of the store() call.
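A minimal sketch of the counting argument in this message, assuming a confirm-style check that only answers for an exact host name. The helper names are hypothetical and the table is the three-row example from the text; the point is only that identifying a visitor this way costs one call per candidate row.

```python
def make_confirm(stored_exceptions):
    """Build a toy exact-match confirm() plus a counter of how often it is called."""
    calls = {"n": 0}

    def confirm(host):
        calls["n"] += 1
        return host in stored_exceptions

    return confirm, calls


def identify(table, confirm):
    """Probe one candidate subdomain per known user until one is confirmed."""
    for label, user in table.items():
        if confirm(f"{label}.images.schunter.org"):
            return user
    return None


# Table of UGEs registered by the site (labels taken from the example above).
table = {
    "tracker7289437923": "Joe",
    "tracker728laksdjh": "Jim",
    "trackerk823982089": "Helen",
}

# The current visitor is Helen, so only her subdomain has an exception.
confirm, calls = make_confirm({"trackerk823982089.images.schunter.org"})
assert identify(table, confirm) == "Helen"
assert calls["n"] == 3  # worst case: one confirm() call per row in the table
```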
>> What do you think? If we agree that there is still a larger risk, we
>> should investigate your potential resolution (which I have not checked
>> in detail yet; since I am not 100% sure I see the risk).
>> Any feedback is welcome!
>> matthias
>> On 21.08.2017 21:19, Michael O'Neill wrote:
>>> I think the web-wide issue is fine with Roy's sentence:
>>> For each of the targets in a web-wide exception, a user agent must not store
>>> the duplets and must reject the promise with a DOMException named
>>> "SecurityError" unless the target domain matches both the document.domain of
>>> the script's responsible document and the document.domain of the top-level
>>> browsing context's active document [HTML5]. This effectively limits the API
>>> for web-wide exceptions to the single target domain of the caller.
>>> This limits web-wide consent to the top-level browsing context which was how
>>> it always was supposed to be.
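Roy's sentence can be read as a simple guard, sketched here under assumptions: "matches" is taken to mean case-insensitive string equality, the SecurityError class is a stand-in for the DOMException of that name, and the function and parameter names are hypothetical rather than anything defined by the specification.

```python
class SecurityError(Exception):
    """Stand-in for a DOMException named "SecurityError"."""


def store_web_wide(targets, script_document_domain, top_level_document_domain, store):
    """Store web-wide exceptions only if every target equals both document domains."""
    script = script_document_domain.lower()
    top = top_level_document_domain.lower()
    # Check everything first so that nothing is stored when any target fails.
    for target in targets:
        if not (target.lower() == script == top):
            raise SecurityError(target)
    store.update(t.lower() for t in targets)


# A script running in the top-level document may store for its own domain.
store = set()
store_web_wide(["example.com"], "example.com", "example.com", store)
assert store == {"example.com"}

# The same call from an iframe on another domain is rejected and stores nothing.
iframe_store = set()
try:
    store_web_wide(["subresource.com"], "subresource.com", "example.com", iframe_store)
    rejected = False
except SecurityError:
    rejected = True
assert rejected and not iframe_store
```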
>>> But as the text is now, a subresource browsing context (aka an iframe) can
>>> still specify a site-specific exception for itself and its own set of
>>> targets. This could be a danger because it allows a third-party subresource
>>> to invisibly create arbitrary exceptions for itself, which it can then use
>>> to fingerprint the user agent. It would do this by creating a set of
>>> subresource iframes and establishing UGEs for a random set of them.
>>> For example, subresource.com loads 32 child iframes b0.subresource.com,
>>> b1.subresource.com, ..., b31.subresource.com.
>>> When it exists as a subresource on top-level site example.com for user Alice,
>>> it creates a UGE for targets bX.subresource.com, bY.subresource.com, ...,
>>> bZ.subresource.com, i.e. a random 32-bit pattern unique to Alice.
>>> When Alice later revisits example.com, DNT:0 will be sent in requests for the
>>> subset of targets specified in the UGE. These subresources can then
>>> communicate back to the parent subresource the value of DNT they have
>>> received, using the postMessage API. Thus subresource.com can recognise
>>> Alice without having to place a third-party cookie. It cannot do this for
>>> sites other than example.com, but it is still a privacy risk.
>>> We do not have a use case for a subresource-initiated site-specific UGE, so
>>> why do we need it? The easiest way to fix this is simply to adopt Roy's
>>> wording for all UGEs, not just web-wide ones.
>>> For the other issue, making the confirm call (now called
>>> Navigator.trackingExceptionExists) capable of confirming exceptions for
>>> cookie rule subdomains as Navigator.storeTrackingException does, I suggest
>>> the following derived from Roy's definition of "site" for
>>> storeTrackingException, with a lone "*" illegal:
>>> site
>>> The referring domain scope where an exception should be confirmed:
>>> If site is undefined, null, or the empty string, the referring domain scope
>>> defaults to the [site domain].
>>> Otherwise, the referring domain scope is defined by a domain found in site
>>> that is treated in the same way as the domain parameter to cookies
>>> [RFC6265], allowing subdomains to be included with the prefix "*.". The
>>> value can be set to a fully-qualified right-hand segment of the document
>>> host name, up to one level below TLD. If such a domain scope cannot be
>>> parsed then the user agent must reject the promise with the DOMException
>>> named "SecurityError".
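One possible reading of this proposed "site" definition, as a hedged sketch. The function name, its parameters and the SecurityError class are hypothetical. "Up to one level below TLD" is approximated by requiring at least two labels; a real user agent would presumably need a public suffix list to handle suffixes such as co.uk.

```python
class SecurityError(Exception):
    """Stand-in for a DOMException named "SecurityError"."""


def confirm_scope(site, document_host, site_domain):
    """Return (domain, include_subdomains) for the referring domain scope."""
    # Undefined, null or empty: fall back to the site domain.
    if site is None or site == "":
        return site_domain.lower(), False

    include_subdomains = site.startswith("*.")
    domain = (site[2:] if include_subdomains else site).lower()
    labels = domain.split(".")

    # Rejects a lone "*", empty labels, stray wildcards and bare TLDs.
    if len(labels) < 2 or any(not label or "*" in label for label in labels):
        raise SecurityError(site)

    # The scope must be a right-hand segment of the document host name.
    host = document_host.lower()
    if host != domain and not host.endswith("." + domain):
        raise SecurityError(site)

    return domain, include_subdomains


assert confirm_scope(None, "www.example.com", "example.com") == ("example.com", False)
assert confirm_scope("*.example.com", "www.example.com", "example.com") == ("example.com", True)
```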
>>> Comments?
>>> Mike
Received on Tuesday, 22 August 2017 10:59:56 UTC
