Brute force JavaScript/canvas font probing script

From: Bjoern Hoehrmann <derhoermi@gmx.net>
Date: Wed, 28 Sep 2011 00:12:35 +0200
To: www-archive@w3.org
Message-ID: <g5f4875i7phdu89kqb51huf69gjde62l34@hive.bjoern.hoehrmann.de>

  I have made a script that goes through a list of font family names,
paints a string onto a canvas in each font, and then calculates the
CRC32 checksum of the pixel data. This is a poor man's font probing
mechanism to emulate plugin APIs where they are not available.
Rendering text to a canvas is pretty much the only plausible use for
font enumeration APIs: once rendered, the text is retained as pixel
data, so it does not break where the font is not available, and
letting users select among their own fonts is fine. Of course there
are small-scale setups like "everybody in the office has this font",
but they don't matter much.
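
The paint-then-checksum step described above could look roughly like
the following. This is a minimal sketch, not the script from this
post: the helper name checksumForFamily, the 24px size, and the
width/height parameters are assumptions made for illustration. The
crc32 function is the common table-driven CRC-32 (reflected polynomial
0xEDB88320); ctx is expected to be a canvas 2D rendering context.

```javascript
// Precompute the CRC-32 lookup table (reflected polynomial 0xEDB88320).
const CRC_TABLE = (() => {
  const table = new Uint32Array(256);
  for (let n = 0; n < 256; n++) {
    let c = n;
    for (let k = 0; k < 8; k++) {
      c = (c & 1) ? (0xEDB88320 ^ (c >>> 1)) : (c >>> 1);
    }
    table[n] = c >>> 0;
  }
  return table;
})();

// CRC-32 over any array-like of byte values (Uint8ClampedArray, Array, ...).
function crc32(bytes) {
  let crc = 0xFFFFFFFF;
  for (let i = 0; i < bytes.length; i++) {
    crc = CRC_TABLE[(crc ^ bytes[i]) & 0xFF] ^ (crc >>> 8);
  }
  return (crc ^ 0xFFFFFFFF) >>> 0; // force an unsigned 32-bit result
}

// Hypothetical helper: paint `text` in `family` and checksum the pixels.
// Family names containing quote characters would need escaping first.
function checksumForFamily(ctx, family, text, width, height) {
  ctx.clearRect(0, 0, width, height);
  ctx.textBaseline = "top";
  ctx.font = "24px '" + family + "'";
  ctx.fillText(text, 0, 0);
  return crc32(ctx.getImageData(0, 0, width, height).data);
}
```

In a browser, ctx would come from something like
document.createElement("canvas").getContext("2d"), and the function
would be called once per family name in the list.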

For a given browser, the results are fully stable: you get the same
checksums every time (not counting updates to the operating system,
graphics drivers, or the browser itself). They do differ wildly across
browsers, though. Interestingly, on my system, Opera produces the same
checksum for Helvetica and Lucida Sans, while Internet Explorer and
Firefox produce different checksums for the two. For some fonts the
checksums are stable across browsers; GulimChe would be an example.
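
To spot cases like the Helvetica and Lucida Sans one, the results can
be grouped by checksum. The sketch below assumes the results are held
as [family, checksum] pairs; that shape and both function names are
made up for illustration. Families in the same group rendered to
identical pixels, which may mean they resolved to the same font, but
the checksum alone does not say why.

```javascript
// Group font families by the checksum their rendering produced.
// `results` is assumed to be an array of [family, checksum] pairs.
function groupByChecksum(results) {
  const groups = new Map();
  for (const [family, checksum] of results) {
    if (!groups.has(checksum)) groups.set(checksum, []);
    groups.get(checksum).push(family);
  }
  return groups;
}

// Return only the groups in which two or more families share a checksum.
function indistinguishable(results) {
  return [...groupByChecksum(results).values()].filter(g => g.length > 1);
}
```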

As it is, the script is rather crude. To make it more efficient,
there should be customized text to render for each font: something
minimal that still keeps the rendering characteristic (mostly with
respect to the fallback font, if there is only one fallback font and
no clever heuristics). I would think the renderings could be made
smaller even beyond that. Further, there is some redundancy in the
list: having some of the Microsoft Core Webfonts but not all of the
most common ones is probably too rare to justify checking for every
one of them. It would be useful to have co-occurrence data from
http://panopticlick.eff.org/ but I suspect they would prefer that you
build your own. It's reasonably fast as it is, though: over 500 fonts
checked in under a second. There are simpler ways to speed it up as
well, like only considering the red pixels for the checksum instead of
all of them.
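
Checksumming one channel instead of all four can be sketched as
below. The function name extractChannel is an assumption; it takes
the RGBA byte array that getImageData returns and keeps every fourth
byte, cutting the checksum input to a quarter. Which channel actually
carries the glyph shapes depends on how the text is painted: black
text on a transparent canvas, for example, would typically leave the
red channel at zero and put the coverage in alpha.

```javascript
// Copy one channel out of interleaved RGBA pixel data.
// channel: 0 = red, 1 = green, 2 = blue, 3 = alpha.
function extractChannel(rgba, channel) {
  const out = new Uint8Array(rgba.length >> 2); // one byte per pixel
  for (let i = 0; i < out.length; i++) {
    out[i] = rgba[4 * i + channel];
  }
  return out;
}
```

The returned array can be passed to whatever checksum function is
already in use.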

An interesting question that could be answered by the "panopticlick"
data would be how many fonts you would have to check for explicitly
without losing many bits of identifying information relative to having
the full list directly. I would think it's fewer than 500, considering
that installing individual fonts so they are available in the browser
is a bit uncommon, but they only released data for the whole list, so
there is no way to know without obtaining similar data first.
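
Given per-user font lists, the comparison could be made by computing
the Shannon entropy of what a limited probe list reveals and comparing
it with the entropy of the full list. The sketch below assumes such
data exists as an array of per-user font-name arrays; no such data set
is available here, and the function name and input shape are
hypothetical.

```javascript
// Shannon entropy, in bits, of the fingerprint obtained by probing only
// the fonts in `probeList`. `profiles` is assumed to be an array with
// one array of installed font names per user.
function entropyBits(profiles, probeList) {
  const probes = [...probeList].sort();
  const counts = new Map();
  for (const installed of profiles) {
    const have = new Set(installed);
    // The fingerprint is the subset of probed fonts this user has.
    const key = probes.filter(f => have.has(f)).join("\u0000");
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  let bits = 0;
  for (const n of counts.values()) {
    const p = n / profiles.length;
    bits -= p * Math.log2(p);
  }
  return bits;
}
```

Calling it once with the full font list and once with a shorter probe
list gives the number of bits lost by probing fewer fonts.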

Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/ 

Received on Tuesday, 27 September 2011 22:13:09 UTC
