- From: Richard Fink <rfink@readableweb.com>
- Date: Wed, 28 Mar 2012 12:20:07 -0400
- To: "'Raph Levien'" <raph@google.com>, <www-font@w3.org>, "'Dave Crossland'" <dcrossland@google.com>, "'Levantovsky, Vladimir'" <Vladimir.Levantovsky@MonotypeImaging.com>
- Message-ID: <002d01cd0cfe$a4aadce0$ee0096a0$@com>
Raph said:

>"Many of the ideas, and some particulars of the glyf table compression, are based on
>Monotype Imaging's MicroType Express format, which is now available under
>open-source and proprietary friendly licensing terms,"

Kudos to Vlad and Monotype for seeing the light, finally - that there was something to be gained by letting MTX run loose and settling for whatever brownie points (among other opportunities) that might come along with doing that.

It is good that MTX is no longer tied to the Windows platform or, better said, that it's clear that it no longer is, if it ever actually was. The fog has cleared.

I have been sitting on the CPP for EOTFAST for a couple of years, never having made up my mind what to do with it and then just losing track. (As per my arrangement with co-author Philip Taylor, it was my call where and when and how.) There is also a version 2 of EOTFAST that was never released, which uses a Perl library to add some additional features and error-checking.

For what it's worth, I'll make the effort to get it up on the EOTFAST site very soon for whoever wants to make use of it and/or post it elsewhere with related code. I'll post a few notices in prominent spots when I've done so.

BTW - thanks to Twardoch for pointing out there's a Windows-independent implementation of MTX available:

>There already is an open-source implementation of EOT with MicroType Express compression,
>it's in the Java source of Google's sfntly library.

Raph also said:

>The growth in adoption of web fonts over the past two years has been stunning

I forget the source, but one estimate - which struck me as reasonable - is that 8% of sites are now using one or more web fonts. That IS stunning.

And now, the incredibly quick rise of the Mobile Web has bumped us all back to the days before broadband and the miserly counting of bytes-per-page. Any and all tools that address that situation are most welcome.

Good Luck.

Rich

From: Raph Levien [mailto:raph@google.com]
Sent: Tuesday, March 27, 2012 6:08 PM
To: www-font@w3.org
Subject: Announcing new font compression project

Greetings, web font enthusiasts.

The growth in adoption of web fonts over the past two years has been stunning. One of the reasons holding people back from using web fonts is concern over file size and the delay in text rendering until the font is fully loaded. We believe that better compression of font files will make web fonts even more appealing for designers, and make the user experience better. We also believe that lossless compression is quite practical, and is important because it will be completely transparent to designers and users alike, with no degradation or concerns over reliability and testing.

We have been researching a new lossless compression format, and are now releasing it as open source and asking for a public discussion. The code name for the project is "WOFF Ultra Condensed", and the hope is for it to be considered by the W3C as a future evolution of the WOFF standard.

To give a flavor of the kind of improvements to expect, running compression over all the fonts in the Google Web Fonts project yields a mean of 26.9% gain compared to WOFF. Large CJK fonts benefit particularly well - as one dramatic example, the Nanum Myeongjo font is 48.5% smaller than the corresponding WOFF. More experiments will follow.

The code and documentation of the draft wire format are here:

http://code.google.com/p/font-compression-reference/
http://wiki.font-compression-reference.googlecode.com/git/img/WOFFUltraCondensed.pdf
http://wiki.font-compression-reference.googlecode.com/git/img/WOFFUltraCondensedfileformat.pdf

The intent of this proposal is to preserve everything that has made WOFF great and successful, just providing better compression.
The initial WOFF header, including the metadata features, is completely unchanged from WOFF, with the exception of the signature.

There's more documentation inside the project, but here is a brief overview of what's going on inside that makes these levels of compression possible:

First, the entropy coding is LZMA, which offers significant gains compared with zlib (gzip). Second, there is preprocessing that removes much of the redundancy in the TrueType format (which was designed for quick random access rather than maximal packing into a stream). Third, the directory header is packed using Huffman coding and a dictionary of common table values, saving over 200 bytes (particularly important for small subsets). There is also a provision for combining multiple tables into a single entropy coding stream, which can save both the CPU time and file size overhead of having many small streams.

We consider the format to be lossless, in the sense that the _contents_ of the font file are preserved 100%. That said, the decompressed font is not bit-identical to the source font, as there are many irrelevant details such as padding and redundant ways of encoding the same data (for example, it's perfectly valid, but inefficient, to repeat flag bytes in a simple glyph, instead of using the repeat code). A significant amount of the compression is due to stripping these out. One way of thinking about the losslessness guarantee is that running a valid font through compression and decompression should yield exactly the same TTX representation as the original font. Further, we plan to build an extensive test suite to validate this assertion.

In this proposal, we've tried to strike a balance between complexity and aggressiveness of compression. The biggest gains by far come from better compression of the glyf table (and eliminating the loca table altogether), so basically this proposal squeezes this table to the maximum.
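
The flag-byte redundancy mentioned above can be made concrete. What follows is a minimal Python sketch, not code from the project: it assumes only the TrueType convention that bit 3 (0x08) of a simple-glyph flag byte means "the next byte is a repeat count", and the names `pack_flags` and `unpack_flags` are hypothetical, chosen for illustration. It shows that a flag stream written out byte by byte and one written with the repeat code decode to the same logical flags, which is roughly the sense in which normalizing the encoding loses nothing.

```python
REPEAT = 0x08  # TrueType simple-glyph flag bit: next byte is a repeat count


def unpack_flags(data, num_points):
    """Expand an encoded flag stream into one logical flag per point."""
    flags = []
    i = 0
    while len(flags) < num_points:
        flag = data[i]
        i += 1
        count = 1
        if flag & REPEAT:
            count += data[i]  # the count byte gives the *additional* copies
            i += 1
        # Keep the logical flag only; the repeat bit is an encoding detail.
        flags.extend([flag & ~REPEAT & 0xFF] * count)
    return flags[:num_points]


def pack_flags(flags):
    """Encode logical flags, using the repeat code for runs of equal flags."""
    out = bytearray()
    i = 0
    while i < len(flags):
        run = 1
        # A single count byte can express at most 255 additional copies.
        while i + run < len(flags) and flags[i + run] == flags[i] and run < 256:
            run += 1
        if run > 1:
            out += bytes([flags[i] | REPEAT, run - 1])
        else:
            out.append(flags[i])
        i += run
    return bytes(out)


# Two valid encodings of the same six on-curve points (flag value 0x01).
verbose = bytes([0x01] * 6)          # flag byte written six times
compact = bytes([0x01 | REPEAT, 5])  # one flag byte plus a repeat count of 5

assert unpack_flags(verbose, 6) == unpack_flags(compact, 6)
assert pack_flags(unpack_flags(verbose, 6)) == compact
```

In this sketch the verbose stream takes six bytes and the compact one takes two, yet both describe the same glyph data; a decompressor that always emits the compact form would produce a font that is not bit-identical to a verbose source but has the same contents.
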
We estimate that somewhere between 0.5% and 1% each can be gained by (1) eliminating lsb's from the hmtx table, and (2) compressing the cmap using a technique similar to CFF. The source code includes compression algorithms for both of these, but we can't be 100% sure about the gains because we haven't written the corresponding decompression code. A big concern is overall spec complexity: We want to make it practical for people to implement, test for conformance, etc. We'd really love to hear people's thoughts on this, in particular, whether it's worth going after every last bit of possible compression.

This is an open source project, and we encourage participation from the whole community. I'd also like to thank a number of people who have contributed so far: the compression code is based on sfntly (by the Google Internationalization team), the decompression code is built on top of OTS (the OpenType Sanitizer), and a number of pleasant discussions with Vlad Levantovsky and John Daggett have helped improve it. Many of the ideas, and some particulars of the glyf table compression, are based on Monotype Imaging's MicroType Express format, which is now available under open-source and proprietary friendly licensing terms, see http://monotypeimaging.com/aboutus/mtx-license.aspx. Also thanks to Kenichi Ishibashi for doing an integration into Chromium so we can test it in real browsers (this will also be released soon).

We're looking forward to the discussion!

Raph Levien
Engineer, Google Web Fonts

Received on Wednesday, 28 March 2012 16:19:56 UTC