RE: CFF table processing for WOFF2? from Levantovsky, Vladimir on 2015-04-27 (public-webfonts-wg@w3.org from April 2015)

From: Levantovsky, Vladimir <Vladimir.Levantovsky@monotype.com>
Date: Mon, 27 Apr 2015 19:22:41 +0000
To: Jungshik Shin (신정식, 申政湜) <jungshik@google.com>, Behdad Esfahbod <behdad@google.com>
CC: Ken Lunde <lunde@adobe.com>, Jonathan Kew <jfkthame@gmail.com>, "WOFF Working Group" <public-webfonts-wg@w3.org>
Message-ID: <f48d9af787bf4e56b0ac9e539013d955@wob-maildb-04.agfamonotype.org>
Hello Jungshik,

It seems that the permissions for this document are limited – can you please share the document so that all group members can get access and see the content?

Thank you,
Vlad


From: Jungshik Shin (신정식, 申政湜) [mailto:jungshik@google.com]
Sent: Friday, April 24, 2015 8:23 PM
To: Behdad Esfahbod
Cc: Ken Lunde; Levantovsky, Vladimir; Jonathan Kew; WOFF Working Group
Subject: Re: CFF table processing for WOFF2?

Hi all,

As Behdad wrote, I tried a few different combinations. Below is what I did and what I found. The summary table is at:

https://docs.google.com/a/chromium.org/spreadsheets/d/1jvJ3xUOg6rySiAtiXDvAf7Du43XytEBFJCr5YO12F7c/edit?usp=sharing




1. woff compression
2. woff2 compression
3. desubroutinization followed by woff compression
4. desubroutinization followed by woff2 compression
5. desubroutinization followed by resubroutinization and woff2 compression

Desubroutinization was done by Behdad's version of fonttools. Resubroutinization in #5 was done with a custom tool made by Behdad's intern.


I tried the above 5 methods (or a subset of them) on a few different fonts:

a. NotoSansKR-Regular.otf (1.002)
b. NotoSansKR-Regular : two different subsets
c. NotoSansTC-Regular (1.002)
d. NotoSansTC-Regular : two different subsets

In all cases, #4 (desub + woff2) was the smallest, followed by #5 and #2. The size of #4 ranged from 88.6% to 96.5% of #2, depending on the character repertoire. NotoSansKR with no Hanja ('b') got the largest reduction, while NotoSansTC (no subset) got the smallest (96.5%).



Even woff compression works better with desub'd otf input in most cases.
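
The "X% of #2" figures above are relative sizes: the size of one variant divided by the size of the plain woff2 file. A minimal sketch of that arithmetic follows; the function name and the byte counts are made up for illustration and are not taken from the spreadsheet.

```python
def relative_size(candidate_bytes: int, baseline_bytes: int) -> float:
    """Return the candidate size as a percentage of the baseline size."""
    if baseline_bytes <= 0:
        raise ValueError("baseline size must be positive")
    return 100.0 * candidate_bytes / baseline_bytes


# Hypothetical byte counts, chosen only to illustrate the calculation.
woff2_bytes = 1_000_000        # method #2: plain woff2
desub_woff2_bytes = 886_000    # method #4: desubroutinized, then woff2

pct = relative_size(desub_woff2_bytes, woff2_bytes)
print(f"#4 is {pct:.1f}% of #2, a saving of {100 - pct:.1f}%")
```

A lower percentage means a larger reduction, so 88.6% is a bigger win than 96.5%.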

@Ken, I also want to try desub-resub with AFDKO (tx and other tools), but haven't managed to. You wrote the following.

 It is very easy to create sets of subroutinized and unsubroutinized CFFs with which to experiment. First, pluck the 'CFF ' table from one of the (subroutinized) Noto Sans CJK fonts to get the former one. Then use "% tx -t1 CFF cidfont.ps ; tx -cff cidfont.ps CFF.unsub" to create the latter one.

 I haven't looked into it yet. What should I do if I want to resubroutinize the CFF table?

Anyway, I'll apply several different combinations to the Source series fonts and report back the results.

Thank you,

Jungshik




2015-04-24 20:52 GMT+02:00 Behdad Esfahbod <behdad@google.com>:
Thanks Ken.  Jungshik already tried some of these (original, desubroutinized, desubroutinized and resubroutinized using our internal subroutinizer [that I'll open source soon] / otf, woff, woff2) on a couple of subsets of Noto.  Jungshik, can you please share your findings?

b

On Fri, Apr 24, 2015 at 8:40 AM, Ken Lunde <lunde@adobe.com> wrote:
Behdad,

I suggest that you start by using the "Source" fonts that we make available on GitHub, specifically the Source Han Sans (though you can instead use Noto Sans CJK because their CFFs are identical in terms of the actual glyph data), Source Sans Pro, Source Serif Pro, and Source Code Pro families. This should be enough to provide preliminary results across a variety of glyph sets to determine whether further investigation, which would include a larger corpus of OpenType/CFF fonts, has any merit.

Regards...

-- Ken

> On Apr 24, 2015, at 1:03 AM, Behdad Esfahbod <behdad@google.com> wrote:
>
> Thanks Ken and Vlad.
>
> The first step would be to get a good corpus of CFF fonts.  After that, we have all the bits and pieces to try.  There are quite a few combinations (~10 to 20), but that's doable in a couple of weeks I would say, assuming I can get help from Jungshik and Rod.
>
> So, Ken, Vlad, which one of you can contribute a CFF corpus for testing purposes for this project?
>
> Cheers,
>
> behdad
>
> On Thu, Apr 23, 2015 at 6:47 AM, Levantovsky, Vladimir <Vladimir.Levantovsky@monotype.com> wrote:
> I think that this is definitely something we need to investigate, and we need to do it soon (as in "now"). The WOFF2 spec is still a working draft, so it is not unreasonable to expect it to be changed (sometimes dramatically, as was the case with e.g. layout feature support in CSS Fonts), and changes like this one won't really put anything in jeopardy - the existing WOFF2 fonts will work fine while the spec and CTS evolve, and the implementations will eventually be updated.
>
> If there are significant gains to be realized due to CFF preprocessing we ought to consider it but, as Jonathan mentioned, the final decision will depend on the tradeoffs between potential benefits of reducing the compressed size vs. possible increase of the uncompressed font size.
>
> Behdad, how long do you think it would take for you to get at least a rough estimate of the compression gains and the CFF size increase due to de-subroutinization?
>
> Thank you,
> Vlad
>
>
> -----Original Message-----
> From: Jonathan Kew [mailto:jfkthame@gmail.com]
> Sent: Thursday, April 23, 2015 2:16 AM
> To: Behdad Esfahbod; WOFF Working Group
> Cc: Ken Lunde; Jungshik Shin
> Subject: Re: CFF table processing for WOFF2?
>
> On 23/4/15 02:03, Behdad Esfahbod wrote:
> > Hi,
> >
> > Is the working group open to adding processed CFF table to WOFF2 or is
> > it too late?  Maybe we can do that in a compatible way?
> >
> > There's already rumour on the net (which we have anecdotally
> > confirmed) that CFF fonts compress better in WOFF2 if desubroutinized.
> > It's unintuitive but makes some sense: if Brotli is great at capturing
> > redundancy, it should perform at least as well as the subroutinization.
>
> This is an interesting possibility, but I do have a concern... unless the decoder can "re-subroutinize" the font (which seems like it would add substantial complexity to the decoder) this has the potential to significantly increase the in-memory footprint of the decoded font. For memory-constrained devices, that might be a poor tradeoff.
>
> (I have no actual statistics to support or deny this. Both the potential savings of compressed size and the resulting change to the uncompressed size would be interesting to know...)
>
> >
> > I have only one transform on top of that in mind right now: drop the
> > offsets to charstrings.  At reconstruction time, split charstring array
> > on endchar.
> >
> > If there is interest I can prototype it and report savings.
> >
> > Cheers,
> >
> > behdad
>
>
>
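
The transform Behdad proposes at the start of the thread (drop the charstring offsets, then split the charstring array on endchar at reconstruction time) can be sketched as below. This is an illustration under stated assumptions, not code from the thread or from any WOFF2 implementation: the function names are hypothetical, the input is assumed to be concatenated, desubroutinized CFF (version 1) Type 2 charstrings that each end in endchar, and stem counting is simplified. Note that a plain byte search for 14 would not work, because the same byte value can occur inside an operand or a hint mask, so the splitter has to tokenize.

```python
from itertools import accumulate

STEM_OPS = {1, 3, 18, 23}   # hstem, vstem, hstemhm, vstemhm
MASK_OPS = {19, 20}         # hintmask, cntrmask
ESCAPE = 12                 # prefix of two-byte operators
ENDCHAR = 14


def split_charstrings(blob: bytes) -> list:
    """Split concatenated Type 2 charstrings after each endchar operator."""
    out = []
    start = i = 0
    nargs = 0    # operands seen since the last operator
    nstems = 0   # stem hints declared so far in the current charstring
    while i < len(blob):
        b0 = blob[i]
        if b0 == 28 or b0 >= 32:          # operand: skip its encoded bytes
            if b0 == 28:
                i += 3                    # 16-bit integer
            elif b0 == 255:
                i += 5                    # 16.16 fixed-point number
            elif b0 >= 247:
                i += 2                    # two-byte integer
            else:
                i += 1                    # one-byte integer
            nargs += 1
            continue
        if b0 == ESCAPE:                  # two-byte operator
            i += 2
        elif b0 in STEM_OPS:              # two operands per stem (+ optional width)
            nstems += nargs // 2
            i += 1
        elif b0 in MASK_OPS:              # pending operands are implicit vstems
            nstems += nargs // 2
            i += 1 + (nstems + 7) // 8    # operator byte plus mask bytes
        else:
            i += 1
            if b0 == ENDCHAR:
                out.append(blob[start:i])
                start, nstems = i, 0
        nargs = 0                         # simplification: every operator clears the stack
    return out


def index_offsets(charstrings) -> list:
    """Rebuild the 1-based offset array of a CFF INDEX from item lengths."""
    return list(accumulate((len(c) for c in charstrings), initial=1))


# Two tiny hand-made charstrings; both contain byte 14 in a non-operator role.
cs1 = bytes([28, 0, 14, 139, 21, 14])              # operand 14, 0, rmoveto, endchar
cs2 = bytes([139, 150, 18, 139, 150, 19, 14, 14])  # hstemhm, hintmask + mask, endchar
parts = split_charstrings(cs1 + cs2)
print(parts == [cs1, cs2], index_offsets(parts))
```

The offsets come back for free from the lengths of the split pieces, which is why they would not need to be stored. The cost is that the decoder must tokenize every charstring, including tracking the stem count to know how many mask bytes follow hintmask and cntrmask.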
Received on Monday, 27 April 2015 19:24:13 UTC