gzip vs. mtx compression ratios

I sat down a little today and played around with font compression.  I
tested compression ratios of gzip vs. MTX, the MicroType Express
algorithm used as part of the Microsoft EOT format.  The spreadsheet
below contains charts and a summary of the data I collected:

https://spreadsheets.google.com/ccc?key=rKT_wNzraVrkXQcKSWb-jTA&hl=en

I used Microsoft's WEFT tool to create unsubsetted EOT versions of each
font and compared each one with the gzip-compressed version of the same
font file.
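
For anyone who wants to repeat the comparison, here is a minimal sketch
of how the "N% smaller" figures could be computed.  The gzip level and
the function names are assumptions for illustration, not a record of the
exact commands I ran; the EOT file itself still has to come from WEFT.

```python
import gzip
from pathlib import Path


def gzip_size(data: bytes, level: int = 9) -> int:
    """Size in bytes of `data` after gzip compression.

    Level 9 is an assumption; the level is not stated in the post.
    """
    return len(gzip.compress(data, compresslevel=level))


def percent_smaller(candidate_size: int, baseline_size: int) -> float:
    """How much smaller the candidate is than the baseline, in percent.

    A positive result means the candidate (e.g. the EOT file) is smaller
    than the baseline (e.g. the gzipped font file).
    """
    return 100.0 * (baseline_size - candidate_size) / baseline_size


def compare_font(ttf_path: str, eot_path: str) -> float:
    """Percent by which an MTX-compressed EOT beats the gzipped .ttf.

    Both paths are hypothetical inputs supplied by the caller.
    """
    baseline = gzip_size(Path(ttf_path).read_bytes())
    candidate = Path(eot_path).stat().st_size
    return percent_smaller(candidate, baseline)
```

Calling compare_font("font.ttf", "font.eot") with a matching pair of
files would give one data point of the kind shown in the spreadsheet.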

As Vlad noted before, the MTX-compressed versions of standard webfonts
such as Times New Roman, Arial, Georgia and Verdana are 20-28% smaller
than the same fonts compressed with gzip.  However, fonts in the
ClearType font set that Microsoft commissioned for Vista are only 11-14%
smaller.  These
fonts use less hinting than traditional TrueType fonts and rely on
ClearType screen rendering to render glyphs clearly.  This matters
because it points to a trend towards more lightly hinted TrueType
fonts, which would mean the hint-related compression that MTX provides
won't be needed as much going forward.  Even so, MTX still does a
better job on these fonts than plain gzip compression.

For large CJK fonts, however, where compression is most needed, MTX
doesn't provide much beyond straight gzip compression.  For Meiryo, EOT
files are 17% smaller, but for other CJK fonts the range was only 2-12%.
In fact, general-purpose bzip2 compression beat font-specific MTX for
several of these fonts.  These numbers are also probably distorted in
favor of MTX because the fonts were in TrueType collection files (.ttc)
rather than straight .ttf files, so there are extra glyphs in the gzip
file that aren't in the EOT version.
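
The gzip and bzip2 side of this comparison is easy to reproduce.  Below
is a rough sketch, assuming maximum compression level for both codecs
(the levels are my assumption here, and the helper names are made up for
illustration).  The MTX size is not computed; it would have to be read
from the EOT file that WEFT produces.

```python
import bz2
import gzip


def compressed_sizes(data: bytes) -> dict:
    """Raw, gzip and bzip2 sizes in bytes for one blob of font data."""
    return {
        "raw": len(data),
        # Level 9 for both codecs is an assumption, not a measured setting.
        "gzip": len(gzip.compress(data, compresslevel=9)),
        "bzip2": len(bz2.compress(data, compresslevel=9)),
    }


def smaller_codec(sizes: dict) -> str:
    """Name of the general-purpose codec with the smaller output.

    Ties go to gzip, since it is the baseline in this comparison.
    """
    return "bzip2" if sizes["bzip2"] < sizes["gzip"] else "gzip"
```

Feeding it the bytes of a .ttf or .ttc file gives the two general
compression numbers to set against the size of the EOT file.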

Since the WEFT tool doesn't currently handle PostScript CFF fonts
(.otf), I tested the MTX compression of these fonts by merging CFF
data from other .otf fonts into a TrueType font, then comparing the
differences in the resulting compressed versions.  The MTX compression
seemed to be around 5% better than straight gzip.  Although this doesn't
account for compressed metrics, I think this is pretty close to the
compression that a version of MTX modified to compress CFF glyph data
would see.

In summary, MTX seems to compress older TrueType fonts well, but less so
for more modern lightly-hinted fonts.  It is a little bit better than gzip
for CJK fonts but lags behind general bzip2 compression in many
instances.  For .otf fonts it's only slightly better than gzip
compression.

Regards,

John Daggett
Mozilla Japan

Received on Monday, 29 June 2009 07:06:26 UTC