Re: TrueType Collections

Having now spoken to Ken Lunde and David Lemon, I will offer some thoughts here on behalf of Adobe:

Adobe would like to see TTC/OTC have "equal citizenship" and support wherever fonts are used. Although we haven't yet seen many such fonts in real-world use, we think the increased availability of tools (such as those recently developed and released by Adobe) will bolster awareness, development, and use. It's a format Adobe intends to support and build upon for numerous practical reasons, so we think supporting it on the web is, as a matter of principle, the right thing to do.

Having said that, we can't disagree with the prevailing WG opinion that TTC/OTC is, today, an awkward format for web delivery, and that the engineering effort to add support seems out of balance with the benefits. The question that remains for us is whether it's better to invest that engineering effort now, while we have the chance. Is the WG's consensus that deferring support today does not pass up the best chance to get TTC/OTC into WOFF? In other words, what are the chances of getting TTC/OTC support on the web in one, two, or three years?

-Christopher


On Feb 10, 2014, at 10:37 AM, Raph Levien <raph@google.com> wrote:

> Hi WG folks,
> 
>    One of the remaining technical questions to consider is whether to add support for functionality equivalent to TrueType Collections to the format. In reviewing the pros and cons, I think there's a pretty strong case for _not_ including TTCs, and I think it would be useful to set down my thoughts.
> 
>    First, on the pro side, I wanted to analyze the use cases. The main engineering question is how much file-size saving is possible from serving multiple fonts with some shared tables. I've heard two use cases that seem potentially compelling (it is of course possible I'm missing others). The first is multiple styles of a complex-script font, with all styles sharing a GSUB table, and the second is Han unification in CJK.
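> 
>    To make the question concrete: the saving a collection offers is the difference between shipping every table of every font and shipping each distinct table once. A minimal Python sketch of that arithmetic (the {tag: bytes} representation here is invented for illustration; real code would read the sfnt directories):
> 
>        def shared_table_savings(fonts):
>            """Estimate byte savings from sharing identical tables across
>            the fonts of a collection. `fonts` is a list of
>            {tag: table_bytes} dicts.
>            Returns (size_if_duplicated, size_if_shared)."""
>            duplicated = sum(len(d) for f in fonts for d in f.values())
>            distinct = {d for f in fonts for d in f.values()}
>            return duplicated, sum(len(d) for d in distinct)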
> 
>    A font family has to be carefully designed for all styles to have the same GSUB. In particular, glyph numbering has to be consistent across the styles (which means that cmap can be shared as well). I believe that in general it doesn't make sense for multiple styles to share GPOS, as, in high-quality designs, mark positioning will be adjusted for the weights. I looked at a number of complex-script fonts and found that only in Noto Sans Devanagari was the relative size of the GSUB table significant (about 34k out of 125k). However, in the existing design, the regular and bold weights are not glyph-compatible - the font would need to be re-engineered to take advantage of such an optimization. In the other Indic scripts I looked at, the GSUB share is smaller (Noto Sans Kannada is 5k out of 78k), and in other complex scripts _much_ smaller (Droid Sans Naskh is 2k out of 89k, and Thai is 294 bytes out of 21k).
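> 
>    For anyone who wants to reproduce these measurements: per-table sizes can be read straight from the sfnt directory. A hypothetical sketch using the fontTools library (the numbers above were gathered separately, not with this exact code):
> 
>        from fontTools.ttLib import TTFont
> 
>        def gsub_fraction(path):
>            """GSUB's share of total table bytes, per the sfnt directory."""
>            font = TTFont(path, lazy=True)
>            sizes = {tag: e.length for tag, e in font.reader.tables.items()}
>            return sizes.get("GSUB", 0), sum(sizes.values())
> 
>        def glyph_compatible(path_a, path_b):
>            """Styles can share GSUB/cmap only if glyph orders match."""
>            return (TTFont(path_a, lazy=True).getGlyphOrder() ==
>                    TTFont(path_b, lazy=True).getGlyphOrder())
> 
>    The second function is the quick test for whether a family would need the re-engineering described above.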
> 
>    The other use case is packaging CJK fonts specialized for different locales (simplified Chinese, traditional Chinese, and Japanese) in the same font file. Two observations here: first, in web use it is unusual for a single page to require multiple CJK appearances of the same font (exceptions do exist, for example dictionaries). Second, the OpenType variant mechanism is a more modern approach to the same problem, and it degrades much more gracefully - if a browser doesn't support it, you still see completely valid CJK.
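> 
>    For reference, one common realization of that single-font mechanism is the 'locl' (locale-specific forms) feature, which hangs off GSUB's script/language-system records. A hypothetical fontTools sketch listing what a CJK font registers:
> 
>        from fontTools.ttLib import TTFont
> 
>        def list_language_systems(path):
>            """Print the script/language systems in GSUB, and whether a
>            'locl' feature is registered at all."""
>            gsub = TTFont(path, lazy=True)["GSUB"].table
>            for rec in gsub.ScriptList.ScriptRecord:
>                langs = [lr.LangSysTag for lr in rec.Script.LangSysRecord]
>                print(rec.ScriptTag, "->", langs or "(default only)")
>            print("locl present:", any(f.FeatureTag == "locl"
>                                       for f in gsub.FeatureList.FeatureRecord))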
> 
>    So my conclusion is that the use cases are valid but not, in the end, compelling - in practice, you'd see significant savings on only a tiny fraction of web pages.
> 
>    On the "con" side, I was concerned about spec complexity and security implications. A lesser concern was format compatibility (we have prototype implementations). It would be nice not to break compatibility, but that said, if there were a real advantage to changing the format, it would be worthwhile.
> 
>    The existing draft basically treats the compressed data as a stream. In a minimal-memory-footprint environment, this allows a font file to be decompressed in a stream-based, incremental fashion, for the most part. The exception is filling in the checksum values, which requires going back and modifying the header after all tables have been processed. For many applications, however, the checksums can be considered optional.
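> 
>    For concreteness, the checksum in question is the standard OpenType one: a wrap-around sum of big-endian uint32 words over the zero-padded table. It can only be computed once the whole table has been seen, which is exactly what breaks pure streaming. A minimal Python sketch:
> 
>        import struct
> 
>        def calc_table_checksum(data):
>            """OpenType table checksum: sum of big-endian uint32 words,
>            modulo 2**32, over the table zero-padded to a 4-byte
>            boundary."""
>            data += b"\0" * (-len(data) % 4)
>            words = struct.unpack(">%dL" % (len(data) // 4), data)
>            return sum(words) & 0xFFFFFFFF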
> 
>    (One point I observed while digging into this - not directly relevant to the TTC question, but perhaps interesting - is that to enable minimal-memory-footprint streaming, we'd have to require that the loca table follow glyf. This seems reasonable enough that I intend to add it as a requirement for compressors in the spec.)
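> 
>    The constraint is trivial for a compressor to check. A sketch, reading "follows" as "immediately follows" - the useful form, since it lets loca be emitted as soon as glyf has been reconstructed:
> 
>        def loca_follows_glyf(table_order):
>            """True if the ordering constraint holds for a list of table
>            tags. Only TrueType-flavored fonts (glyf/loca) are
>            constrained."""
>            if "glyf" not in table_order or "loca" not in table_order:
>                return True
>            i = table_order.index("glyf")
>            return i + 1 < len(table_order) and table_order[i + 1] == "loca"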
> 
>    A compressed file with shared tables, by contrast, couldn't be represented as a sequence of tables, each with a size (as in the present format). Rather, the most natural representation would be (offset, length) references into the uncompressed block. A straightforward implementation would just decompress the entire block, then extract tables using these references. Of course, most actual files would reduce to the streamable case, but having a separate code path to detect that and use more efficient processing sounds (a) more complex and (b) riskier in terms of opening up potential security problems. Already, OpenType Sanitizer does extensive checking to validate that tables don't overlap, etc. Such checking is unnecessary in the stream case (though sizes still need to be validated).
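> 
>    A sketch of the kind of range checking this would add (hypothetical, not OTS's actual code). Note that with shared tables, identical ranges are fine - that is what sharing means - but partial overlaps are not:
> 
>        def validate_table_refs(refs, block_len):
>            """Check (offset, length) references into the decompressed
>            block: every range in bounds, no partial overlaps. Exact
>            duplicates are allowed, since they represent sharing."""
>            if any(off < 0 or length <= 0 or off + length > block_len
>                   for off, length in refs):
>                return False
>            ordered = sorted(set(refs))
>            return all(a[0] + a[1] <= b[0]
>                       for a, b in zip(ordered, ordered[1:]))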
> 
>    My conclusion, then, is that the costs in complexity and potential security risk are nontrivial, and that we should not try to standardize a method for font-file collections with shared tables as part of WOFF.
> 
>    Very happy to hear discussion, especially if I've missed something.
> 
> Raph
> 

Received on Tuesday, 25 February 2014 22:03:33 UTC