Re: What the font optimizer does

This is very nicely explained.

I clarified the report to indicate the required and optional steps, 
noted that the study used simulated annealing (with a link to your email 
in the references), and noted the possibility of other glyph sorting 
approaches in the "future work" part of the conclusions.

Could I suggest that you drop a lightly tidied version of this text into 
a file called README.md and then close
https://github.com/litherum/StreamableFonts/issues/4

On 2020-10-12 22:54, Myles C. Maxfield wrote:
> Today we resolved that I should write up a bit about what the font 
> optimizer does.
>
> First: I want to draw a distinction between what is required vs what’s 
> optional. The only requirement for the range-request method is that 1) 
> outlines are at the end of the file, and 2) outlines are independent 
> among different glyphs.
>
> (2 isn’t even a hard requirement, but it’s a working assumption I’m 
> under for now. It’s easy to imagine a system that wouldn’t require 
> independent glyph outlines.)
>
> Any piece of font production software that can satisfy these two 
> requirements would produce a font that is compatible with the range 
> request method.
>
> However, it’s possible to do better: we can sort the font file, to 
> keep glyphs that are commonly used close together within the file. 
> There are many approaches to doing this. I’ve tried about a half-dozen 
> methods, and the one that I’ve observed to do the best is a 
> simulated annealing algorithm. This is the approach the optimizer I 
> wrote uses. But, to be clear: this approach isn’t necessary or 
> required; if someone gets good/better results using another algorithm, 
> they can absolutely use that other algorithm.
>
> Simulated annealing is a sequential algorithm. The algorithm produces 
> a series of state transitions, and the last state in the series is 
> (supposed to be) the best state. For our purposes, each state is a 
> particular ordering of glyphs in the file. A transition between two 
> states could have a few different forms:
>
> 1. Swapping two glyphs
> 2. Rotating a subsequence of glyphs
> 3. Reversing a subsequence of glyphs
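For concreteness, the three transition operators above could be sketched in Python like this (a minimal illustration of my own; the names and list representation are not from the optimizer's source):

```python
import random

def swap(order):
    """Transition 1: swap two randomly chosen glyphs."""
    new = list(order)
    i, j = random.sample(range(len(new)), 2)
    new[i], new[j] = new[j], new[i]
    return new

def rotate(order):
    """Transition 2: rotate a random subsequence left by one position."""
    new = list(order)
    i, j = sorted(random.sample(range(len(new) + 1), 2))
    new[i:j] = new[i + 1:j] + new[i:i + 1]
    return new

def reverse(order):
    """Transition 3: reverse a random subsequence in place."""
    new = list(order)
    i, j = sorted(random.sample(range(len(new) + 1), 2))
    new[i:j] = reversed(new[i:j])
    return new
```

Each operator returns a new ordering that is still a permutation of the same glyphs, which is the invariant any transition must preserve.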
>
> Simulated annealing also requires a fitness function to judge each 
> state by. The fitness function I use is “the number of bytes in the 
> font file we can avoid downloading.” Note that this doesn’t include 
> the sigmoid fitness function from the evaluation report - I wrote this 
> code before that function existed. Picking this fitness function might 
> be related to the fact that the evaluation report shows the range 
> request method doing better on byte savings than on the overall 
> sigmoid fitness score.
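That fitness function could be sketched as follows, under a simplified range-coalescing model that is my own assumption (the real optimizer's computation, and its coalescing behavior, may differ):

```python
def bytes_saved(order, glyph_sizes, pages, gap_threshold=1000):
    """Fitness sketch: total outline bytes a corpus of pages avoids
    downloading. Assumes each page fetches its glyphs' outlines via
    byte-range requests, and that ranges separated by less than
    gap_threshold bytes get coalesced, so the gap bytes are downloaded
    anyway. The model and threshold are illustrative assumptions."""
    # Byte offset of each glyph's outline under this ordering.
    offsets, pos = {}, 0
    for g in order:
        offsets[g] = pos
        pos += glyph_sizes[g]
    total = pos  # size of all outline data
    saved = 0
    for page in pages:
        # (start, end) byte ranges for this page's glyphs, in file order.
        ranges = sorted((offsets[g], offsets[g] + glyph_sizes[g])
                        for g in page)
        downloaded = 0
        cur_start, cur_end = ranges[0]
        for start, end in ranges[1:]:
            if start - cur_end < gap_threshold:
                cur_end = max(cur_end, end)  # coalesce: gap is fetched too
            else:
                downloaded += cur_end - cur_start
                cur_start, cur_end = start, end
        downloaded += cur_end - cur_start
        saved += total - downloaded
    return saved
```

The key property is that the score depends on the ordering: when a page's glyphs sit close together, fewer unneeded bytes fall inside the coalesced ranges.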
>
> So the idea is you start with some initial state, then you make a 
> random state transition, and then you decide to “move to” that state 
> or not based on a probability, and then repeat from whichever state 
> you’re at now. If the new state is significantly better than the old 
> state, the probability of moving to that state is higher, and 
> vice-versa. The probability function (the entire curve) is supposed to 
> decrease over time, so in the early states you have more of a chance of 
> jumping to a worse state, and at the end you have less of a chance of 
> jumping to a worse state. This is because, in the beginning, you want 
> to jump around the search space more to try to find lots of different 
> options (sort of like when humans do “brainstorming” or “spitballing”), 
> but over time you want to home in on the best option you’ve found so 
> far and refine it by making fewer and smaller changes (like when 
> humans do “convergence”).
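The loop described above is the standard simulated-annealing skeleton. A minimal sketch, using the classic Metropolis acceptance rule and a geometric cooling schedule (both are illustrative textbook choices, not necessarily what this optimizer does):

```python
import math
import random

def anneal(initial, transition, fitness, steps=10000, t0=1.0, cooling=0.999):
    """Generic simulated-annealing loop: Metropolis acceptance with
    geometric cooling. Illustrative only, not the optimizer's exact rule."""
    state, score = initial, fitness(initial)
    best, best_score = state, score
    t = t0
    for _ in range(steps):
        candidate = transition(state)
        cand_score = fitness(candidate)
        delta = cand_score - score  # higher fitness is better
        # Always move to a better state; move to a worse one with a
        # probability that shrinks as the temperature cools.
        if delta >= 0 or random.random() < math.exp(delta / t):
            state, score = candidate, cand_score
            if score > best_score:
                best, best_score = state, score
        t *= cooling
    return best, best_score
```

Accepting only strictly better states (i.e. dropping the `math.exp` branch) turns this into plain hill climbing, which is one of the variants mentioned later in the email.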
>
> The optimizer I wrote runs a few of these annealing sequences at the 
> same time (it runs on the GPU, which is a big parallel processor, so 
> running a few in parallel costs little extra). The best performing 
> initial ordering (“seed”) seems to be a simple sort by frequency count 
> of each glyph, but there are a few others that do pretty well (like 
> ordering by highest bigram score, etc. - I can get into more detail 
> about this if you’re interested).
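The frequency-count seed is simple enough to sketch in a few lines (illustrative; `glyph_counts` mapping glyph ids to corpus occurrence counts is my assumed input shape):

```python
def frequency_seed(glyph_counts):
    """Seed ordering: sort glyphs by how often they occur in the corpus,
    most frequent first, so commonly needed outlines cluster together
    near the start of the outline data."""
    return sorted(glyph_counts, key=glyph_counts.get, reverse=True)
```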
>
> I also measured that swapping two glyphs seems to yield the best 
> results (rather than rotating or reversing), so that’s the state 
> transition operator that’s used.
>
> I also played around with various probability functions and cooling 
> functions, but found that the best performing one seemed to be the 
> super simple “if a state is better, move there, otherwise don’t.” So 
> that’s the function the optimizer uses. In addition, I couldn’t find 
> any benefit to cooling over time, so the function doesn’t change over 
> time.
>
> (These findings in the previous paragraph were surprising to me, but 
> after doing some more investigation, there is some rough intuition for 
> why they’re true. Given the fitness function I was using, if you pick 
> random orderings, pretty much all of them have almost identical 
> fitnesses. This means that the search space is sort of a vast 
> landscape of mediocrity: there don’t appear to be many lobes or hills 
> or valleys or mountains. So, all you can do are sort of 
> micro-optimizations. There aren’t many “chains” where making one 
> optimization opens up a whole new region for more optimization, which 
> open yet new regions for optimization. The entire space is pretty flat.)
>
> ((Note that the above might change if I switch fitness function and/or 
> forms of state transitions. Again, the optimizer I came up with isn’t 
> required, or even necessarily a /good/ optimizer - it’s just one 
> proof-of-concept that tries to do pretty well.))
>
> I hope that all made sense. If not, please don’t hesitate to ask 
> questions,
> Myles

-- 
Chris Lilley
@svgeesus
Technical Director @ W3C
W3C Strategy Team, Core Web Design
W3C Architecture & Technology Team, Core Web & Media

Received on Wednesday, 14 October 2020 15:51:14 UTC