RE: Font Based Fingerprinting Papers

Hi Pete,

The storage policy proposal is for an API which can set limits on storage (all storage, including cookies). Sites can set limits by including a response header; if they do not, defaults apply.

The defaults allow short-duration, first-party-only storage access, which lasts as long as the user is interacting with the site (with some hysteresis). After the timeout, all first-party origin-accessible storage is deleted.
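
To make that concrete, here is a minimal sketch of how a browser might interpret such a header. The header name "Storage-Policy" and its directive syntax are my invention for illustration, not taken from the draft:

```javascript
// Hypothetical response header, e.g.:
//   Storage-Policy: max-age=7200; scope=first-party
// Header name and directives are illustrative only; absent the
// header, the browser falls back to the proposal's defaults.

const DEFAULTS = { maxAgeSeconds: 3600, scope: "first-party" };

function parseStoragePolicy(headerValue) {
  if (!headerValue) return { ...DEFAULTS };
  const policy = { ...DEFAULTS };
  for (const part of headerValue.split(";")) {
    const [key, value] = part.split("=").map((s) => s.trim());
    if (key === "max-age") policy.maxAgeSeconds = Number(value);
    if (key === "scope") policy.scope = value;
  }
  return policy;
}
```

Once the interaction timeout (max-age, with hysteresis) expires, the browser would delete everything the first-party origin can access.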

The idea is to enable session-layer state persistence over reasonable time scales, for analytics and the like. For example, a site can determine unique-visitor counts, page-visit behaviour, traffic sources, etc. without being able to collect behavioural data about individuals over periods longer than a normal session, say about an hour or less.
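
A minimal sketch of what session-window analytics could look like on the server side (the class, names, and the one-hour window are illustrative, matching the proposal's default duration):

```javascript
// Session-scoped analytics without long-lived identifiers: a visitor
// is "unique" only within the session window. Once the token expires
// there is no way to link them to a later visit.

const SESSION_WINDOW_MS = 60 * 60 * 1000; // ~1 hour, as in the defaults

class SessionCounter {
  constructor() { this.sessions = new Map(); } // token -> last seen (ms)

  // Record a page hit; returns true if this is a new unique session.
  hit(token, now = Date.now()) {
    // Drop expired sessions so old tokens cannot be re-linked.
    for (const [t, seen] of this.sessions) {
      if (now - seen > SESSION_WINDOW_MS) this.sessions.delete(t);
    }
    const isNew = !this.sessions.has(token);
    this.sessions.set(token, now);
    return isNew;
  }

  activeSessions() { return this.sessions.size; }
}
```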

If the user agrees to a longer duration, or allows some subresources access to their own storage, the top level updates the response header (there would also be a page refresh). The user will be alerted, i.e. prompted, if the response header raises the limits above the defaults.

It would be very easy to detect sites that set higher limits without asking, and browsers could enforce the defaults - maybe not at first, but eventually. In Europe, or places with similar data-protection/privacy laws, untrustworthy sites will face legal action even if browsers do not enforce the limits.

The point I was making about canvas/font-enumeration-based fingerprinting is that it may be used in the future to replace third-party, state-based cookie syncing. It might be already, though I have seen no evidence of it yet.

But there is a lot of evidence that third-party cookies are becoming unavailable, especially on iOS, and soon in Firefox as well.

In response, companies collecting web-activity data are persisting the state in first-party cookies. They give the site owner a script which creates/uses a cookie in the top-level origin, which is then communicated in a request back to the collecting company (usually by creating an Image object with URL parameters). Their big problem then is how to correlate the different first-party cookies to the same user, which is where cookie syncing comes in. They could correlate pretty well with source IP + User-Agent string, but perhaps some might want to get better resolution by using a complicated fingerprinting procedure.
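
The pattern looks roughly like this. The collector hostname, parameter names, and helper functions are invented for illustration; the cookie jar is a plain object standing in for document.cookie handling:

```javascript
// Sketch of the first-party cookie + image beacon pattern:
// a script-set cookie in the top-level origin, reported back to the
// collecting company via the URL of a tracking pixel.

function getOrCreateVisitorId(cookieJar, makeId = () => Math.random().toString(36).slice(2)) {
  if (!cookieJar.vid) cookieJar.vid = makeId(); // first-party cookie
  return cookieJar.vid;
}

function buildBeaconUrl(cookieJar, pageUrl) {
  const vid = getOrCreateVisitorId(cookieJar);
  // In a browser this URL would be fetched via `new Image().src = ...`,
  // sending the first-party ID to the collecting company.
  return `https://collector.example/i.gif?vid=${encodeURIComponent(vid)}` +
         `&page=${encodeURIComponent(pageUrl)}`;
}
```

Each site's cookie jar yields a different vid, which is exactly why the collector then needs cookie syncing (or fingerprinting) to tie them to one user.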

Safari ITP 2.1 is dealing with this with the 7-day maximum expiry for first-party cookies created by script, but in my view 7 days is too long, and their technique does not stop regeneration, which already happens a lot. Regeneration means the cookies are effectively immortal anyway.
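
A sketch of the regeneration trick (the stores are plain objects standing in for document.cookie and window.localStorage; the key names are made up):

```javascript
// The identifier is mirrored in a second store, so a script-set
// cookie capped at 7 days is simply re-created on the next visit.

function regenerate(cookieJar, localStore, makeId = () => "new-id") {
  let id = cookieJar.uid || localStore.uid;
  if (!id) id = makeId();          // genuinely new visitor
  cookieJar.uid = id;              // re-set the (expiring) cookie
  localStore.uid = id;             // mirror survives the cookie's expiry
  return id;
}
```

Expiring only the script-set cookie does not help while the mirror survives; expiring all first-party storage together, as the storage-policy proposal does, is what makes regeneration impossible.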

With my proposal, regeneration is impossible as long as everybody sticks to the rules or the browsers enforce the defaults.

Mike

-----Original Message-----
From: Pete Snyder <psnyder@brave.com> 
Sent: 19 April 2019 23:11
To: Mike O'Neill <michael.oneill@baycloud.com>
Cc: Aleecia M McDonald <aleecia@aleecia.com>; public-privacy (W3C mailing list) <public-privacy@w3.org>
Subject: Re: Font Based Fingerprinting Papers

Mike, sorry to push on this, but can you specify: what is your proposal on cookies?  What change would you like to make to a standard?

I’m not against the idea of _also_ trying to reduce the privacy harm of cookies, this is a great goal.  But other than “something”, I’m just not sure what you’re proposing.  Maybe it would be fruitful to start this (distinct) conversation with a proposal, even if it’s just a jumping off point.

I do think you’re incorrect though in thinking the way of solving FP is to address 1p cookies.  FP is only useful / needed when there is no recognized cookie present.  It _can_ be used for cookie syncing, but even if there was no such thing as cookies, there would still be utility (to trackers) in FP.

Distinct issue: I’ve read your proposals, and they both are interesting, but they both seem to require trusting sites _more_, which is tangential to the main problem on the web, which is how to manage privacy with non-trustworthy sites. (Apologies if I’m misunderstanding)

Pete Snyder
{pes,psnyder}@brave.com
Brave Software
Privacy Researcher

> On Apr 20, 2019, at 12:02 AM, Mike O'Neill <michael.oneill@baycloud.com> wrote:
> 
> Fingerprinting could be being used to cookie sync, match 1st party cookies on different sites to the same person. This used to be done with 3rd party cookies but now with ITP etc. they are often not available.
> 
> A few extra bits via fingerprinting, and the time of the page load, maybe is enough to correlate  first party cookies. It only has to be done once per site also, so the computation load is not so important.
> 
> But I think the way to mitigate it is to control 1st party cookies rather than try to stop fingerprinting, like the way Safari's ITP 2.1 is starting to do (7 day expiry for 1st party script placed cookies). 
> 
> Mike West has a proposal that could eventually replace cookies (in many use cases) with a browser generated short duration (1 hour default) token, inaccessible to script. 
> https://github.com/mikewest/http-state-tokens/blob/master/README.md
> 
> I have a maybe complementary one that would enable sites to limit the duration of all browser storage once user interaction stops.
> https://github.com/w3cping/storage-policy
> 
> Mike
> 
> -----Original Message-----
> From: Pete Snyder <psnyder@brave.com> 
> Sent: 19 April 2019 22:07
> To: Aleecia M McDonald <aleecia@aleecia.com>
> Cc: public-privacy (W3C mailing list) <public-privacy@w3.org>
> Subject: Re: Font Based Fingerprinting Papers
> 
> Thanks for sharing this Aleecia.  It’s a good paper! Antoine Vastel (the lead author) did an internship with me last summer at Brave, and I can give him a 👍 recommendation for anyone here looking for a privacy researcher.
> 
> However, I don’t think the finding / argument in the paper quite applies in this case, since (presumably, hopefully) changes to the standard would result in changes to implementors (i.e. the common browser cores).  So the mitigation wouldn’t result in people accidentally winding up in unexpectedly small anonymity sets (since all users of the browser[s] would be shifted to the same change).  Does that match your understanding of the situation?
> 
> P.S. the TL;DR of the paper is that there are a lot of privacy tools that advertise / try to improve web privacy by (for example) blocking a fingerprintable browser characteristic (say, unique details of how Chrome does Canvas on a specific piece of hardware).  The authors find that a lot of these tools actually make users more identifiable, because they make narrow changes.  Before installing the tool, the user was in the anonymity set of people using that version of Chrome on that hardware.  After installing the tool, they may have blocked access to the canvas FP vector (some privacy benefit), but they’ve shot themselves in the foot because they’re in the much smaller anonymity set of people using the given tool.  The argument is a bit more involved than that, but that’s the 9/10ths high level of it.
> 
> But, again, I don’t think it applies here, because if all users of an implementation picked up the same mitigation / protection, the anonymity set would strictly increase (i.e. the user would be more private).
> 
> 
> Pete Snyder
> {pes,psnyder}@brave.com
> Brave Software
> Privacy Researcher
> 
>> On Apr 19, 2019, at 10:45 PM, Aleecia M McDonald <aleecia@aleecia.com> wrote:
>> 
>> Marginally relevant paper from Aug 2018 @ USENIX: https://www.usenix.org/system/files/conference/usenixsecurity18/sec18-vastel.pdf
>> 
>> Tl;dr — techniques to obfuscate fingerprinting often harm more than help. 
>> 
>> (Also contains citations to papers quantifying fingerprinting in the wild but I lack time to chase them down)
>> 
>> Title: Fp-Scanner: The Privacy Implications of Browser Fingerprint Inconsistencies 
>> Authors: Antoine Vastel, Pierre Laperdrix, Walter Rudametkin, Romain Rouvoy
>> 
>>  Aleecia
>> 
>> 
>> 
> 
> 
> 

Received on Saturday, 20 April 2019 17:25:40 UTC