Re: Comment on WOFF file format 'origCheckSum' value

From: Jonathan Kew <jfkthame@googlemail.com>
Date: Thu, 29 Apr 2010 16:07:03 +0100
Cc: "public-webfonts-wg@w3.org" <public-webfonts-wg@w3.org>
Message-Id: <AFD2EB1C-8C3F-4412-BA14-02D0C9EAAB43@gmail.com>
To: "Levantovsky, Vladimir" <Vladimir.Levantovsky@MonotypeImaging.com>

Hi Vlad,

Thanks for starting the discussion. A few comments that come to mind:

> Currently, the WOFF table directory contains ‘origCheckSum’ field [1] that, as I understand it, is simply a duplication of the ‘checkSumAdjustment’ value from the original SFNT file’s ‘head’ table.

No, this is the *per-table* checksum from the SFNT file's table directory. This is separate from the overall font checksum that is stored as checkSumAdjustment in the 'head' table.
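To make the distinction concrete, here is a minimal sketch of the two values. It assumes the usual SFNT rules as I understand them: a table checksum is the sum of the table's big-endian 32-bit words (zero-padded to a multiple of four bytes), and checkSumAdjustment is 0xB1B0AFBA minus the checksum of the whole font computed with that field set to zero. The function names are purely illustrative.

```python
import struct


def table_checksum(data: bytes) -> int:
    """Sum the data as big-endian uint32 words, modulo 2**32.

    The data is zero-padded to a multiple of four bytes first.
    """
    padded = data + b"\0" * (-len(data) % 4)
    words = struct.unpack(">%dI" % (len(padded) // 4), padded)
    return sum(words) & 0xFFFFFFFF


def check_sum_adjustment(font_with_zeroed_adjustment: bytes) -> int:
    """Whole-font value stored in 'head' (assumed rule, see lead-in).

    The caller is expected to pass the complete SFNT file with the
    checkSumAdjustment field already set to zero.
    """
    whole = table_checksum(font_with_zeroed_adjustment)
    return (0xB1B0AFBA - whole) & 0xFFFFFFFF
```

The first function is what a table directory entry carries for each table; the second is the single value held in the 'head' table.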

> While I understand the reason why this field was made part of the WOFF table directory it, in my opinion, does little to protect original font data (the original checksum and the font data integrity can be evaluated just by running checksum check after decompressing font tables), and it does nothing to protect the WOFF file data itself (it doesn’t cover data entries that are part of the WOFF Header, Extended Metadata, Private Data and Table Directory itself).

It is not really present to provide "protection", merely to ensure that the UA can reconstruct the original SFNT data (or portions of it), including a valid table directory, without being required to recalculate checksums in order to do this. The design of WOFF was intended to allow simple reconstruction of the original SFNT data without imposing a requirement that the UA do specific validation, except checking that the WOFF structure itself is valid. Of course, if UAs wish to do further validation of the font data itself, they are free to do so -- this is equally applicable for data that arrived via WOFF or for raw SFNT data.
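As a rough illustration of that reconstruction step, the sketch below turns one WOFF table directory entry into the matching SFNT table directory record by copying the stored checksum rather than recomputing it. The field layout (tag, offset, compLength, origLength, origCheckSum as five 32-bit big-endian values, and tag, checkSum, offset, length for SFNT) is an assumption based on the draft format, and the class and function names are hypothetical.

```python
import struct
from typing import NamedTuple


class WoffEntry(NamedTuple):
    tag: bytes           # 4-byte table tag, e.g. b"glyf"
    offset: int          # offset of the table data within the WOFF file
    comp_length: int     # length of the (possibly compressed) data
    orig_length: int     # length of the uncompressed table
    orig_checksum: int   # per-table checksum from the original SFNT


def parse_woff_entry(raw: bytes) -> WoffEntry:
    """Decode one 20-byte WOFF directory entry (assumed layout)."""
    return WoffEntry(*struct.unpack(">4sIIII", raw))


def sfnt_directory_record(entry: WoffEntry, sfnt_offset: int) -> bytes:
    """Build the 16-byte SFNT directory record for this table.

    The checksum is copied from the WOFF entry, so no table data has to
    be read or summed. sfnt_offset is where the caller will place the
    table in the rebuilt SFNT file.
    """
    return struct.pack(">4sIII", entry.tag, entry.orig_checksum,
                       sfnt_offset, entry.orig_length)
```

Nothing here validates the table contents; that remains a separate, optional step.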

> I suggest that the scope of this field should be extended (and if we agree, the field itself may be moved to a different part of the file, e.g. Header). I believe it would be useful to make this field cover both the original font data and the WOFF data, so that when a value is calculated – the  WOFF data is taken into account and, upon running a checksum check, the value of ‘origCheckSum’ would be produced when the value encoded in the WOFF file is summed up with the rest of the WOFF data. This way, we would allow user agents to verify that both the WOFF data and the original SFNT data are intact.

It would be possible to add some kind of checksum for the overall file, if this is seen as important. However, I would be reluctant to *require* that this should be checked by the UA. For one thing, that would make it impossible to retrieve selected portions of the font (because the entire file would be needed for checksumming). Currently, the format is designed to allow a client, if desired, to read individual SFNT tables (or the WOFF metadata) without requiring the entire file to be downloaded. This could be advantageous in the case of large font files, where the UA could examine specific tables in order to decide whether to download and use the rest of the font.
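For example, a client that has already read the table directory could fetch and decode a single table along these lines. This is only a sketch: it assumes the server honours HTTP byte-range requests, that table data is zlib-compressed, and that a table whose stored length equals its original length was stored uncompressed. The helper names are hypothetical.

```python
import zlib


def range_header(offset: int, comp_length: int) -> str:
    """Value for an HTTP Range header covering one table's stored bytes."""
    return "bytes=%d-%d" % (offset, offset + comp_length - 1)


def decode_table(payload: bytes, orig_length: int) -> bytes:
    """Return the original SFNT table data from its stored WOFF bytes."""
    if len(payload) == orig_length:
        # Assumed convention: equal lengths mean the table was stored as-is.
        return payload
    data = zlib.decompress(payload)
    if len(data) != orig_length:
        raise ValueError("decompressed length does not match origLength")
    return data
```

A whole-file checksum that the UA was required to verify would rule out this kind of partial access, because every byte would have to be fetched before any table could be trusted.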

> Also, we should make it clear in the spec that any font data manipulations (such as subsetting) that may affect original checksum values (both on a ‘per table’ basis and overall) are done at the time of content authoring, prior to compressing the resulting font subset as a WOFF file.

Yes, this is correct and could perhaps be made explicit somewhere.

> User agents should not make any attempt to correct the checksums if a mismatch is found, and should simply reject the font file if the checksum doesn’t match.

Whether to reject font files where checksums don't match is a question we can discuss. Making this a requirement means that UAs will be required to do checksum calculation/verification, which imposes a (small) additional processing burden that is otherwise optional.
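To show what the optional check would involve, here is a minimal sketch of a UA-side policy switch. It assumes the standard word-sum checksum, and that the 'head' table's own checksum is computed with its checkSumAdjustment field (taken here to be bytes 8 to 11 of the table) set to zero. The strict flag and the function names are hypothetical, not anything defined by the format.

```python
import struct


def _sum_words(data: bytes) -> int:
    """Big-endian uint32 word sum, zero-padded, modulo 2**32."""
    padded = data + b"\0" * (-len(data) % 4)
    words = struct.unpack(">%dI" % (len(padded) // 4), padded)
    return sum(words) & 0xFFFFFFFF


def table_matches(tag: bytes, data: bytes, orig_checksum: int) -> bool:
    """Compare a decoded table against its stored per-table checksum."""
    if tag == b"head" and len(data) >= 12:
        # Assumption: checkSumAdjustment is excluded from the 'head' sum.
        data = data[:8] + b"\0\0\0\0" + data[12:]
    return _sum_words(data) == orig_checksum


def accept_table(tag: bytes, data: bytes, orig_checksum: int,
                 strict: bool = False) -> bool:
    """With strict=False the checksum is never computed at all."""
    if not strict:
        return True
    return table_matches(tag, data, orig_checksum)
```

With strict left off, the UA does no summing; making rejection mandatory would amount to forcing strict=True for every table.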

Note that as far as I am aware, most platform font APIs do NOT currently validate font table checksums or the overall SFNT checksum, and do NOT reject fonts with incorrect checksums. At one point, we included checksum validation in the sanity-checking that Gecko does with downloaded font data, prior to attempting to activate the font, but this led to user complaints because certain fonts failed to work in Firefox, even though they appeared to work fine elsewhere.

Jonathan
Received on Thursday, 29 April 2010 15:07:50 GMT