Re: Announcing new font compression project from Jonathan Kew on 2012-03-30 (www-font@w3.org from January to March 2012)

From: Jonathan Kew <jonathan@jfkew.plus.com>
Date: Fri, 30 Mar 2012 15:01:51 -0700
To: www-font@w3.org
Message-ID: <4F762D4F.1010602@jfkew.plus.com>

On 3/30/12 2:47 PM, Tab Atkins Jr. wrote:
> On Fri, Mar 30, 2012 at 2:37 PM, John Hudson <tiro@tiro.com> wrote:
>> On 27/03/12 3:08 PM, Raph Levien wrote:
>>> We consider the format to be lossless, in the sense that the _contents_ of
>>> the font file are preserved 100%. That said, the decompressed font is not
>>> bit-identical to the source font, as there are many irrelevant details such
>>> as padding and redundant ways of encoding the same data (for example, it's
>>> perfectly valid, but inefficient to repeat flag bytes in a simple glyph,
>>> instead of using the repeat code). A significant amount of the compression
>>> is due to stripping these out.
>> I wonder how this compares to the standard of losslessness required by the
>> WOFF spec?
> It's very close.
>
> Taking a virgin file and round-tripping it through the codec gives you
> a file that's rendering-identical, but not bit-identical.  If you
> roundtrip the result, though, the result from the second pass is
> bit-identical to the result from the first.  So it's still possible to
> do checksum/hash-based signing of font files with this format; they
> just have to do a single round-trip through the format first to get to
> a stable state.

I think there would be merit in separating, both in discussion and in
implementation, the two logical stages being performed here.

First, there's an "OpenType normalization and optimization" step, which 
would (for example) replace repeated flag bytes with the repeat code, 
use the optimal format for various subtables where there are several 
possible formats with different packing characteristics, etc. The result 
of this process is a valid OpenType font that is rendering-identical to 
the original, and could be used by any OpenType-supporting system as is; 
it would be a reasonable post-processing operation for any OpenType font 
production system, if that system does not itself generate optimized files.
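
To make the flag-byte example concrete, here is a minimal sketch in Python of that one normalization. It assumes the simple-glyph flag encoding of the 'glyf' table, where bit 3 (0x08) marks a flag byte as repeated and the following byte gives the number of additional repeats. The function names are hypothetical, and this is only an illustration of the idea, not the code used by any actual tool.

```python
REPEAT_FLAG = 0x08  # bit 3 of a simple-glyph flag byte: "next byte is a repeat count"


def unpack_flags(data, n_points):
    """Expand an encoded flag stream into one logical flag per point."""
    flags = []
    pos = 0
    while len(flags) < n_points:
        flag = data[pos]
        pos += 1
        count = 1
        if flag & REPEAT_FLAG:
            count += data[pos]  # number of *additional* repeats
            pos += 1
        flags.extend([flag & ~REPEAT_FLAG & 0xFF] * count)
    return flags


def pack_flags(flags):
    """Encode logical flags, using the repeat code for runs of 3 or more."""
    out = bytearray()
    i = 0
    while i < len(flags):
        flag = flags[i] & ~REPEAT_FLAG & 0xFF
        run = 1
        # The repeat count is a single byte, so a run covers at most 256 points.
        while (i + run < len(flags) and run < 256
               and (flags[i + run] & ~REPEAT_FLAG & 0xFF) == flag):
            run += 1
        if run >= 3:
            out += bytes([flag | REPEAT_FLAG, run - 1])
            i += run
        else:
            out.append(flag)  # a run of 1 or 2 is no shorter with the repeat code
            i += 1
    return bytes(out)


# A valid but inefficient stream: every flag byte written out in full.
naive = bytes([0x01] * 5 + [0x03] + [0x01] * 2)
optimized = pack_flags(unpack_flags(naive, 8))
assert len(optimized) < len(naive)
assert unpack_flags(optimized, 8) == unpack_flags(naive, 8)
```

Both streams decode to the same per-point flags, so the glyph is unchanged; only the byte-level encoding differs.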

Second, and separately, there's the actual compression and 
repackaging, which takes the optimized OpenType file and turns it into a 
WOFF2 (or whatever) file, from which a bitwise-identical optimized 
OpenType file can be recovered by the decompression process.
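
The separation might be sketched roughly as follows, in Python. Everything here is a stand-in: normalize() is a toy that only canonicalizes trailing padding, and zlib takes the place of whatever compression the real format uses. The point is only the contract between the two stages: the first is idempotent, and the second is bit-exact with respect to its input.

```python
import hashlib
import zlib


def normalize(font_bytes):
    """Stage 1 (toy stand-in): canonicalize padding to a 4-byte boundary.

    A real optimizer would also re-encode flags, pick subtable formats, etc.
    """
    body = font_bytes.rstrip(b"\x00")
    return body + b"\x00" * (-len(body) % 4)


def compress(optimized_bytes):
    """Stage 2 (stand-in): a lossless byte-level codec."""
    return zlib.compress(optimized_bytes, 9)


def decompress(packed_bytes):
    return zlib.decompress(packed_bytes)


original = b"toy-font-data" + b"\x00" * 7   # arbitrary padding from some tool
optimized = normalize(original)

# Stage 1 may change the bytes, but running it again changes nothing more.
assert normalize(optimized) == optimized

# Stage 2 recovers the optimized file bit for bit, so a hash or signature
# computed over the optimized file still verifies after decompression.
recovered = decompress(compress(optimized))
assert recovered == optimized
assert hashlib.sha256(recovered).digest() == hashlib.sha256(optimized).digest()
```

With this split, any checksum or signature would be computed on the output of the first stage, which is an ordinary font file in its own right.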

JK

Received on Friday, 30 March 2012 22:02:23 UTC