Re: Redundant and unnecessary data (was): Transforming hmtx table from Adam Twardoch (List) on 2015-04-27 (public-webfonts-wg@w3.org from April 2015)

From: Adam Twardoch (List) <list.adam@twardoch.com>
Date: Mon, 27 Apr 2015 16:33:39 +0200
To: "Levantovsky, Vladimir" <Vladimir.Levantovsky@monotype.com>
Cc: John Hudson <john@tiro.ca>, David Kuettel <kuettel@google.com>, Behdad Esfahbod <behdad@google.com>, WOFF Working Group <public-webfonts-wg@w3.org>
Message-Id: <38CFC970-427F-42F1-B55D-6E589BACB8EC@twardoch.com>

The current model as implemented in browsers is such that WOFF[2] gets decoded into SFNT and then the browser does "something" with it in order to render it. That "something" heavily depends on the browser and the platform it runs on. The browser may sometimes use a built-in library to do the layout (e.g. Firefox with HarfBuzz or the Graphite library), or pass the layout step to one of the platform's subsystems (DW, Uniscribe, CoreText etc.). The browser may also perform the rendering itself (e.g. Firefox with the SVG table) or pass the data over to the platform.

In addition, browsers employ libraries such as OTS (the OpenType Sanitizer) to prune the SFNT.

This is all quite sketchy, not really documented or described clearly. 

But I think what's even more problematic is that standalone OpenType attempts to maintain backwards compatibility with quite ancient platforms (.ttfs I make today do render in Windows 3.11!). That by itself is a good thing because standalone OT fonts need to be deployable on tons of platforms. 

But once converted to WOFF2, the font clearly limits its scope of deployability. It's no longer "standalone" but instead is more like "embedded". Some information is irrelevant. 

Yet the OT spec is quite liberal in declaring what's relevant in the fonts. Not many things get deprecated. Also, we sometimes lack information *why* some things are necessary. 

When browser vendors implement font support, they're offered the "full menu" of OT, including the Sunday menu and the lunch menu even though it's Thursday evening. So they sometimes build dependencies on things they ideally "shouldn't".

Today, quite a few vendors build fonts so that they "don't get rejected by ots". I think it's absurd.

It should be the font vendors who essentially "say" what they really want to put into the fonts, i.e. which aspects of OT they care about.

Then, OS owners could make a counter-proposal where they would say "well, but we also need this information for some tech reasons, e.g. for printing".

These two lists could be compared and the community could then decide "ah, that stuff needs to land in the fonts ultimately but only as a stub, or it could be auto-calculated". 

And only then could the browser implementers write some code that puts in this missing data if it's really needed.

Finally, the requirement to put this data inside every font could be lifted -- at least within the Web context. 
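
To make the list comparison above a bit more concrete, here is a rough sketch in Python. It is only an illustration: the function name classify_tables, the bucket names and the example tag sets are all made up for this purpose, and nobody has actually agreed on such lists.

```python
def classify_tables(font_tables, vendor_wants, platform_needs):
    """Sort sfnt table tags into three buckets.

    All three arguments are sets of table tags. The sets used in the
    example below are hypothetical, not an agreed-upon recommendation.
    """
    buckets = {"keep": [], "stub_or_autocalc": [], "droppable": []}
    for tag in sorted(font_tables):
        if tag in vendor_wants:
            # The font vendor explicitly cares about this data.
            buckets["keep"].append(tag)
        elif tag in platform_needs:
            # Only the platform needs it: a candidate for a stub or for
            # values that the browser could calculate automatically.
            buckets["stub_or_autocalc"].append(tag)
        else:
            # Nobody asked for it in the Web context.
            buckets["droppable"].append(tag)
    return buckets


# Hypothetical inputs, for illustration only.
font_tables = {"cmap", "glyf", "hmtx", "hdmx", "OS/2"}
vendor_wants = {"cmap", "glyf", "hmtx"}
platform_needs = {"OS/2"}

result = classify_tables(font_tables, vendor_wants, platform_needs)
print(result)
```

The point of the three buckets is that the middle one is exactly the data that would "land in the fonts ultimately" without the vendor having to author it.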

I agree however that this sounds like something outside the charter of this WG. 

A.

Sent from my mobile phone.

> On 24.04.2015, at 21:27, Levantovsky, Vladimir <Vladimir.Levantovsky@monotype.com> wrote:
> 
> Hi John, all,
> 
> With my WG chair hat off, just as a regular group participant expressing my personal opinion:
> 
> I think we need to clearly distinguish two cases:
> 1) taking a font file as an input into a WOFF2 compressor, squeezing as much redundancy out of it as possible and, on the other end, having a decompressor produce a fully functional and 100% equivalent (although not binary matching) font file to use, and
> 2) taking a general purpose font file and transforming it into a different, stripped down version of the font file where certain info that was deemed unnecessary for the webfont use case has been stripped (thus producing a font file that I would consider to be an input font file for WOFF2).
> 
> I believe that Adam's proposal describes the option 2) preprocessing step, where a font file would be optimized for web use but remains for all intents and purposes a generic input font file as far as WOFF2 is concerned. I would argue that 2) is outside of WOFF2 scope since any font vendor can pre-process and optimize their fonts for web use as part of the font production process (we certainly do it at Monotype). Yet, it is not for the WOFF2 encoder to decide what portions of the font data can be lost and never need to be recovered: while the WOFF2 process is not going to produce a binary match, we do strive to preserve every bit of font functionality presented to WOFF2 as input.
> 
> Thank you,
> Vlad
> 
> 
> -----Original Message-----
> From: John Hudson [mailto:john@tiro.ca] 
> Sent: Friday, April 24, 2015 2:53 PM
> To: David Kuettel; Levantovsky, Vladimir
> Cc: Behdad Esfahbod; WOFF Working Group
> Subject: Redundant and unnecessary data (was): Transforming hmtx table
> 
>> On 24/04/15 11:25, David Kuettel wrote:
>> 
>> There is a section on the related hdmx table, which Raph recommended 
>> either just compressing or stripping altogether.
> 
> I believe this is getting into the area of pre-Brotli optimisation recommendations, along the lines that Adam Twardoch suggested during the Portland meeting in 2013. There is a lot of data in a typical sfnt font that is either redundant or unnecessary for webfont environments. It makes sense to strip such data, but probably not to do so at the MTX-like Brotli processing stage (unless we were to define two different modes of Brotli compression: normal and super).
> 
> I still like Adam's idea of something like the approach taken to PDF document standards: definitions of sfnt data sets for different purposes, beginning with a webfont optimisation recommendation. As I recall, this was judged outside the existing mandate of the Webfonts WG, but the informal interest group approach doesn't seem to have gained any traction.
> 
> Ideas?
> 
> 
> JH
Received on Monday, 27 April 2015 14:34:09 UTC