RE: Redundant and unnecessary data (was): Transforming hmtx table

Thank you David and John for your thoughtful replies.
Now, with my WG chair hat back on (☺), I think this could be a great topic for our upcoming F2F discussion, and I am sure that once we nail down the CTS planning issues we will find time to discuss this and the future work items we might want to address in the post-WOFF2 chapter.

And yes, I agree that there are quite a few gray areas where WOFF2 pre-processing may be extended but we need to clearly differentiate between the following two possible variants:

-          Removing anything redundant that can be reconstructed at the decoding step is fine;

-          Removing anything that is deemed unnecessary and losing that data altogether is not (IMO, I for one would find it very hard to sell WOFF2 as a webfont compression technology if it made its own decisions about what is needed and what is not). We do not have the right to modify the data, especially if someone might consider the results to fall clearly into the “derivative works” category.
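The first, lossless kind of removal can be made concrete with the hmtx case from the thread subject: a glyph's left side bearing frequently equals the xMin of its bounding box in glyf, so the encoder can omit it and the decoder can rebuild it exactly. The sketch below is illustrative only, with a simplified layout assumed for clarity; it is not the actual WOFF2 transformed-hmtx format.

```python
# Illustrative sketch: elide redundant left side bearings (lsb) that equal
# the glyph's xMin, then reconstruct them exactly at decode time.
# The encoding layout here is a simplified assumption, not the WOFF2 format.

def encode_hmtx(metrics, x_mins):
    """Keep an explicit lsb only where it differs from the glyph's xMin."""
    flags, lsbs = [], []
    for (advance, lsb), x_min in zip(metrics, x_mins):
        if lsb == x_min:
            flags.append(True)       # redundant: reconstructible from glyf
        else:
            flags.append(False)
            lsbs.append(lsb)         # must be carried explicitly
    advances = [advance for advance, _ in metrics]
    return advances, flags, lsbs

def decode_hmtx(advances, flags, lsbs, x_mins):
    """Rebuild the full (advance, lsb) list, bit-for-bit equivalent."""
    it = iter(lsbs)
    return [
        (advance, x_min if flag else next(it))
        for advance, flag, x_min in zip(advances, flags, x_mins)
    ]

# Round trip: nothing is lost, even though the encoded form is smaller.
metrics = [(500, 10), (600, -2), (550, 25)]
x_mins = [10, -2, 20]                # third glyph: lsb differs from xMin
encoded = encode_hmtx(metrics, x_mins)
assert decode_hmtx(*encoded, x_mins) == metrics
```

The point of the round-trip assertion is exactly the distinction above: the encoder drops data only when the decoder can recover it, so the reconstructed font is functionally identical even though the compressed stream is not a binary match.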

Cheers,
Vlad


From: David Kuettel [mailto:kuettel@google.com]
Sent: Friday, April 24, 2015 4:33 PM
To: John Hudson
Cc: Levantovsky, Vladimir; Behdad Esfahbod; WOFF Working Group
Subject: Re: Redundant and unnecessary data (was): Transforming hmtx table

I would be strongly in favor of option 2 (preprocessing) as well, both for the reasons that Vlad mentioned and because I think that we as an industry would be able to explore and iterate much faster by decoupling this from the standardization track.

Using fonttools (https://github.com/behdad/fonttools/), we have been able to explore a lot on this front, especially in optimizing fonts (reducing the file size) for a given target platform.

The platforms are evolving quickly as well (e.g. higher-resolution screens, DirectWrite, OpenType support, unicode-range, etc.), which affects the set of optimizations that one might choose to apply.  As such, I would see it as quite challenging to build a specification that is both backwards compatible and future proof, and then for everyone to agree on it. :)

In regards to exploring this under the Webfonts WG charter after WOFF 2.0, or as a separate parallel effort, that sounds great!  There is a lot that we could all try, share and learn from together.

On Fri, Apr 24, 2015 at 1:27 PM, John Hudson <john@tiro.ca> wrote:
Vlad wrote:

> I think we need to clearly distinguish two cases:
> 1) taking a font file as input into the WOFF2 compressor, squeezing as much redundancy out of it as possible, and, on the other end, having a decompressor produce a fully functional and 100% equivalent (although not binary-matching) font file to use; and
> 2) taking a general-purpose font file and transforming it into a different, stripped-down version of the font file where certain info that was deemed unnecessary for the webfont use case has been stripped (thus producing a font file that I would consider an input font file for WOFF2).

> I believe that Adam's proposal describes option 2), a preprocessing step in which a font file would be optimized for web use but remains, for all intents and purposes, a generic input font file as far as WOFF2 is concerned. I would argue that 2) is outside the scope of WOFF2, since any font vendor can pre-process and optimize their fonts for web use as part of the font production process (we certainly do it at Monotype). Yet it is not for the WOFF2 encoder to decide what portions of the font data can be lost and never need to be recovered - while the WOFF2 process is not going to produce a binary match, we do strive to preserve every bit of font functionality presented to WOFF2 as input.

I entirely agree that the second case, which is what Adam was proposing, is out of scope of the WOFF2 spec. My question was whether it would be out of scope for the Webfonts WG if the charter were extended beyond WOFF2.

I also agree that the two cases need to be clearly distinguished. But I am not sure that they are, or that there might not be grey areas between the kind of optimisation that might be done to prepare a font to be made into a webfont and the kind of optimisation that might be done while making it into a webfont. The cached device metrics tables would seem to me to be in this grey area, in that -- in the environments in which webfonts are deployed -- stripping those tables would not affect the functional equivalence of the decompressed font file.
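To make the grey area concrete, the kind of preprocessing being discussed might look like the sketch below, which drops the cached device-metric tables (hdmx, LTSH, VDMX) from a font's table set before it is ever handed to a WOFF2 encoder. The dict-of-tables representation and function name are assumptions for illustration, not part of any spec or tool.

```python
# Illustrative preprocessing step, outside WOFF2 proper: drop cached
# device-metric tables that modern rendering environments recompute.
# Unlike a lossless transform, the stripped tables are gone for good --
# whether that is acceptable is exactly the grey area under discussion.

DEVICE_METRIC_TABLES = {"hdmx", "LTSH", "VDMX"}

def strip_device_metrics(tables):
    """Return a new table map without the cached device-metric tables."""
    return {
        tag: data
        for tag, data in tables.items()
        if tag not in DEVICE_METRIC_TABLES
    }

# A toy font as a map of table tag -> raw bytes (placeholder contents).
font = {"glyf": b"...", "hmtx": b"...", "hdmx": b"...", "VDMX": b"..."}
stripped = strip_device_metrics(font)
assert set(stripped) == {"glyf", "hmtx"}   # hdmx/VDMX cannot be recovered
```

The asymmetry with the lossless case is the whole point: no decoder can reconstruct hdmx or VDMX from what remains, which is why this step belongs to font production (or a separate preprocessing effort) rather than to the WOFF2 encoder itself.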

It is obviously within our remit to make decisions about that grey area, to push things in it one way or the other. But having lived with non-standard glyph processing and line layout behaviours for twenty years, I can't help wondering what happens to the things that get pushed into the 'higher level protocol' category. :)

J.

Received on Monday, 27 April 2015 13:36:01 UTC