[whatwg/fetch] Remove Cache-Control and Expires headers from the CORS-safelisted response headers to prevent user tracking (#1128)

Hello everyone!
My research team and I discovered that the **Cache-Control** and **Expires** headers can be abused to calculate the exact time at which a resource was put in the browser cache. Because the precision is down to the second, this timestamp can be used to identify the user. Removing both headers from the CORS-safelisted response headers effectively prevents the attack.

Our paper detailing the attack will be published at PETS and can be found here: [LINK](https://hal.inria.fr/hal-03017222/document). I'll summarize our main findings below.

### Attack in a nutshell
Some resources available online have a fixed expiry date or a fixed expiry duration. By knowing how long a resource is supposed to stay fresh, it is possible to infer the value of the **Date** header stored in the browser cache, even if **Date** is blacklisted and not available to the server.
The formulas that we use to infer when a resource was put in the cache are the following:

- **Date** = fixed expiry date − **Cache-Control:max-age**
- **Date** = **Expires** − fixed expiry duration
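
As an illustration, the two formulas above can be sketched in Python. This is a minimal sketch, not code from the paper: the function names and the example values are hypothetical, and it assumes the attacker already knows the resource's fixed expiry date or fixed expiry duration.

```python
import re
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def cached_at_from_max_age(cache_control: str, fixed_expiry: datetime) -> datetime:
    """Date = fixed expiry date - Cache-Control:max-age (hypothetical helper)."""
    match = re.search(r"max-age=(\d+)", cache_control, re.IGNORECASE)
    if match is None:
        raise ValueError("no max-age directive found")
    return fixed_expiry - timedelta(seconds=int(match.group(1)))


def cached_at_from_expires(expires: str, fixed_duration: timedelta) -> datetime:
    """Date = Expires - fixed expiry duration (hypothetical helper)."""
    return parsedate_to_datetime(expires) - fixed_duration


# Assumed example: a resource that always expires at midnight UTC.
midnight = datetime(2020, 12, 17, 0, 0, 0, tzinfo=timezone.utc)
print(cached_at_from_max_age("public, max-age=48000", midnight))

# Assumed example: a resource that is always fresh for exactly one hour.
print(cached_at_from_expires("Wed, 16 Dec 2020 11:40:41 GMT", timedelta(hours=1)))
```

In both cases the result is the moment the response was generated, which is the value the **Date** header would have revealed.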

### Why it works
When a resource is put into the cache, the browser also stores the complete set of HTTP response headers along with it. When loading a resource, the browser looks at either the Cache-Control or the Expires header to know if a resource must be downloaded again. The problem is that storing these headers in the cache completely fixes their values in time. At any moment, if a resource is fetched from the cache, it will always be accompanied by these headers with their original values. Current browsers never update them to reflect how much time has passed since the initial request. Because of this, any website can get access to these safe-listed headers and compute the original caching date of a resource.
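
The behavior described above can be modeled with a toy cache. This is a simplified sketch under stated assumptions, not how any real browser cache is implemented: `ToyHttpCache`, `origin_with_fixed_expiry`, and the timestamps are hypothetical, and time is represented as plain integer seconds.

```python
from dataclasses import dataclass

FIXED_EXPIRY = 10_000  # assumed: the resource always expires at t = 10000 s


def max_age_of(headers: dict) -> int:
    """Read the max-age value from a simple 'max-age=N' Cache-Control header."""
    return int(headers["Cache-Control"].split("=")[1])


def origin_with_fixed_expiry(now: int) -> dict:
    """Hypothetical server whose resource has a fixed expiry date."""
    return {"Cache-Control": f"max-age={FIXED_EXPIRY - now}"}


@dataclass(frozen=True)
class CacheEntry:
    stored_at: int  # known to the cache only, never exposed to the page
    headers: dict   # stored verbatim at insertion time


class ToyHttpCache:
    def __init__(self) -> None:
        self._entries: dict[str, CacheEntry] = {}

    def fetch(self, url: str, now: int) -> dict:
        entry = self._entries.get(url)
        if entry is not None and now < entry.stored_at + max_age_of(entry.headers):
            # Cache hit: the stored headers are replayed unchanged.
            return dict(entry.headers)
        # Miss or stale entry: contact the origin and store its headers.
        headers = origin_with_fixed_expiry(now)
        self._entries[url] = CacheEntry(stored_at=now, headers=headers)
        return dict(headers)


cache = ToyHttpCache()
first = cache.fetch("https://example.com/lib.js", now=1_000)
later = cache.fetch("https://example.com/lib.js", now=1_500)

# The headers seen 500 s later are identical, so the original caching
# time can be recovered from them.
inferred_cached_at = FIXED_EXPIRY - max_age_of(later)
print(inferred_cached_at)
```

Because `max-age` was frozen at insertion time, subtracting it from the known expiry date yields the time of the first request rather than the time of the later one.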

### Defense solution
In terms of defense, if **Cache-Control** and **Expires** headers are not returned in the response, an attacker cannot infer when the resource was loaded in the cache. Regarding current browser defenses:

1. Browsers implementing double- or triple-keyed caches are protected from cross-site history sniffing (it is then no longer possible to use resources with **Access-Control-Allow-Origin** set to "*").
2. No browser is protected against same-site history sniffing. For example, if an attacker wants to respawn cookies on her own website, she can keep a map of when users loaded specific resources to do just that.
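
To make the proposed change concrete, here is a rough sketch of safelist filtering as applied to a cross-origin CORS response. The function name and the example headers are hypothetical; the "current" set reflects my reading of the CORS-safelisted response-header names in the Fetch Standard at the time of writing, and the sketch ignores `Access-Control-Expose-Headers`, which lets a server expose additional headers.

```python
# CORS-safelisted response-header names (lowercased), as assumed here.
CURRENT_SAFELIST = frozenset({
    "cache-control",
    "content-language",
    "content-length",
    "content-type",
    "expires",
    "last-modified",
    "pragma",
})

# The change proposed in this issue: drop the two headers used by the attack.
PROPOSED_SAFELIST = CURRENT_SAFELIST - {"cache-control", "expires"}


def exposed_headers(response_headers: dict, safelist: frozenset) -> dict:
    """Return only the headers a script may read (hypothetical helper)."""
    return {
        name: value
        for name, value in response_headers.items()
        if name.lower() in safelist
    }


response = {
    "Content-Type": "text/javascript",
    "Cache-Control": "max-age=48000",
    "Expires": "Thu, 17 Dec 2020 00:00:00 GMT",
    "Date": "Wed, 16 Dec 2020 10:40:00 GMT",
}

print(sorted(exposed_headers(response, CURRENT_SAFELIST)))
print(sorted(exposed_headers(response, PROPOSED_SAFELIST)))
```

With the proposed safelist, neither input to the two formulas is readable by the page, so the caching time can no longer be computed from this response.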

I'd be happy to answer any questions you may have. A lot more details can be found in the [paper](https://hal.inria.fr/hal-03017222/document) we wrote.

https://github.com/whatwg/fetch/issues/1128

Received on Wednesday, 16 December 2020 10:40:41 UTC