Progress on Simulation

Over the past couple of weeks since we last met I've been working toward
getting a final version of the simulation running. Here's what I've gotten done:

   - We generated a new version of the data set internally. This new data
   set is significantly larger (~150 million sequences vs. 150,000 sequences in
   the current data set) and has better language coverage outside of Latin
   than the previous data set. I'm planning on sending out the new data set
   later today.
   - I got set up to run simulations for range request:
      - Generated optimized versions of each font in the library using
      Myles's optimizer tool.
      - Found and fixed issues in the font optimizer script (
      https://github.com/litherum/StreamableFonts/commit/7c78f3cebfa79165a962dfc56962538a1aa1fede
      and https://github.com/litherum/StreamableFonts/pull/3) which
      prevented it from correctly re-ordering glyphs in the font.
      - The current optimizer implementation drops hints in the output
      fonts, so I dropped hints from all of the non-optimized fonts so that the
      simulations make a fair comparison between range request and the other
      methods.
      - The optimizer doesn't work for variable fonts at the moment due to
      an issue with flattened composite glyphs not being compatible with gvar.
      So I also dropped all variable tables from the fonts in the library.
      Variable fonts are only a small part of the library, so I don't expect
      this to have much effect on the results.
      - Lastly, the optimizer currently re-orders the notdef glyph (glyph
      0), which should not be moved. I don't believe this significantly
      impacts the results of the simulation, but it should be fixed
      eventually.
      - When I send out the new data set I'll include the optimized version
      of the library that I generated.
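For reference, the table stripping described above boils down to something
like the sketch below. This is a minimal illustration, not the actual script:
the table tags follow the OpenType spec, and the function accepts anything
dict-like by table tag (e.g. a fontTools TTFont), so the fontTools dependency
is an assumption rather than a requirement of the sketch.

```python
# Tables to drop; tags per the OpenType spec. Note the trailing space
# in "cvt " -- OpenType table tags are always four characters.
HINT_TABLES = ["fpgm", "prep", "cvt "]
VARIATION_TABLES = ["fvar", "gvar", "avar", "HVAR", "MVAR", "STAT"]

def strip_tables(font):
    """Remove hinting and variation tables from a font object.

    `font` can be anything supporting `in` and `del` keyed by table
    tag, e.g. a fontTools TTFont. A complete hint strip would also
    clear per-glyph TrueType instructions; that is omitted here.
    """
    for tag in HINT_TABLES + VARIATION_TABLES:
        if tag in font:
            del font[tag]
    return font
```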
   - I updated the analyzer to gracefully handle errors during simulations
   of individual sequences. If a method fails for a particular sequence, the
   results for all methods on that sequence are dropped from the output.
   Additionally, the indices of the failed sequences are stored and written to
   a file at the end of the simulation to aid in debugging. There's currently
   an issue where a small number of patch subset simulations fail since the
   harfbuzz subsetter does not yet support GSUB/GPOS re-packing to fix
   overflowed offsets. Code for failure handling is here:
   https://github.com/googlefonts/PFE-analysis/tree/script_update. I'm
   going to try to get that merged into the main repo today.
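The failure handling described above amounts to the following pattern (a
simplified sketch, not the analyzer's actual code; the callable-per-method
interface here is made up for illustration):

```python
def simulate_all(sequences, methods):
    """Run every method on every sequence.

    `methods` maps a method name to a callable taking one sequence.
    If any method raises on a sequence, the partial results for that
    sequence are discarded for all methods, and its index is recorded
    so it can be written out at the end for debugging.
    """
    results = {name: [] for name in methods}
    failed_indices = []
    for i, seq in enumerate(sequences):
        per_seq = {}
        try:
            for name, method in methods.items():
                per_seq[name] = method(seq)
        except Exception:
            failed_indices.append(i)
            continue  # drop this sequence from every method's output
        for name, value in per_seq.items():
            results[name].append(value)
    return results, failed_indices
```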
   - Since range request simulations need to be run against the optimized
   versions of the fonts, while all of the other methods run against the
   non-optimized versions, we currently end up with two separate result
   files. I wrote a tool which merges the two result files back into a
   single file. That code can also be found in the script_update branch:
   https://github.com/googlefonts/PFE-analysis/tree/script_update.
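Conceptually, the merge is just combining two per-method result mappings
whose method names don't overlap (the real result files are more structured
than this; the plain-dict representation below is only for illustration):

```python
def merge_results(results_a, results_b):
    """Merge two result mappings (method name -> per-sequence results)
    produced by separate simulation runs into one.

    The two runs must cover disjoint sets of methods; a collision
    indicates the same method was simulated twice.
    """
    overlap = set(results_a) & set(results_b)
    if overlap:
        raise ValueError(
            "methods present in both result files: %s" % sorted(overlap))
    merged = dict(results_a)
    merged.update(results_b)
    return merged
```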
   - I developed a new method for summarizing the results of the analysis (
   https://github.com/w3c/PFE-analysis/blob/master/tools/summarize_results.py#L180)
   that produces a succinct comparison between methods. I'm going to send out
   a separate email with a detailed writeup on that later today.
   - Finally, I'm currently re-running the simulations using the new data
   set and the optimized font library. Hopefully those runs will finish over
   the weekend and I'll have some results to present at our Monday meeting.

Received on Friday, 11 September 2020 19:33:12 UTC