Incremental transfer subset + binary patch POC from Roderick Sheeter on 2018-08-20 (public-webfonts-wg@w3.org from August 2018)

From: Roderick Sheeter <rsheeter@google.com>
Date: Mon, 20 Aug 2018 13:48:59 -0700
To: WebFonts WG <public-webfonts-wg@w3.org>
Message-ID: <CABscrrE4eL-6UNhTsd5SgA2aSAYO__bAeJ7LZW+kNc--RgFezA@mail.gmail.com>

Good afternoon,

I have some good news for incremental transfer: the Google Compression team
that brought us Brotli has toys that may help, specifically Brotli Shared
Dictionary, and in the future, Brotli Patch Mode. These tools allow us to
compute smaller patches than VCDIFF.

Specifically, Brotli allows us to use the current state as a dictionary,
so the compressed target can refer back to the current state. To give a
simplified example: instead of storing unchanged bytes, the patch just
records that you need N bytes from the dictionary at offset M. Patch mode
will add enhancements to help compress things like identical offset shifts,
as you might get when compiling code with added or removed functions. This
may also be a win for fonts.
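
To make the "N bytes from dictionary at offset M" idea concrete, here is a
minimal sketch of a toy patch format in Python. It is not Brotli and not
VCDIFF; it only illustrates the copy-versus-literal idea. The function names
make_patch and apply_patch and the op tuples are made up for illustration,
and the matching is done with the standard library's difflib.

```python
from difflib import SequenceMatcher


def make_patch(current: bytes, desired: bytes) -> list:
    """Describe `desired` as copies out of `current` plus literal bytes.

    Toy format (hypothetical): ("copy", offset, length) or ("literal", data).
    """
    ops = []
    matcher = SequenceMatcher(None, current, desired, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Unchanged run: store only where it lives in the dictionary.
            ops.append(("copy", i1, i2 - i1))
        elif j2 > j1:
            # "replace" or "insert": these bytes are new, store them as-is.
            ops.append(("literal", desired[j1:j2]))
        # "delete" contributes nothing to the desired output.
    return ops


def apply_patch(current: bytes, ops: list) -> bytes:
    """Rebuild the desired bytes from the current bytes and the ops."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            out += current[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)


current = b"glyph-A glyph-B glyph-C"
desired = b"glyph-A glyph-B glyph-X glyph-C"
patch = make_patch(current, desired)
print(patch)
print(apply_patch(current, patch) == desired)
```

A real patch format would also encode the ops compactly; the point here is
only that unchanged runs cost an (offset, length) pair rather than the bytes
themselves.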

So, if the client tells us what codepoints it has and what it needs, then
we can:

1) Compute current state.
hb-subset is fast enough to plausibly do this "live", or precomputation
could be used. We should permit the server to respond by patching to any
set of codepoints it likes, not necessarily exactly what was requested.
2) Compute desired state.
3) Compute patch to get from current=>desired using a public standardized
patch algorithm.
4) Send patch to client.
The client can then obtain the augmented font by applying the patch.
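
The four steps above can be sketched end to end in Python. This is a rough
illustration under stated assumptions, not the demo's implementation:
toy_subset is a hypothetical stand-in for a real subsetter such as hb-subset,
TOY_FONT is fake glyph data, and the patch step swaps in zlib's preset
dictionary support (DEFLATE with zdict) in place of Brotli Shared Dictionary,
since that is what the Python standard library offers. DEFLATE only looks
back 32 KB, so this would not scale to real fonts.

```python
import hashlib
import zlib

# Fake "font": 64 bytes of incompressible data per codepoint (A-Z).
TOY_FONT = {cp: hashlib.sha512(str(cp).encode()).digest()
            for cp in range(0x41, 0x5B)}


def toy_subset(codepoints: set) -> bytes:
    """Hypothetical stand-in for a subsetter: glyph data in codepoint order."""
    return b"".join(TOY_FONT[cp] for cp in sorted(codepoints))


def make_patch(current: bytes, desired: bytes) -> bytes:
    """Compress `desired` using `current` as the preset dictionary."""
    comp = zlib.compressobj(level=9, zdict=current)
    return comp.compress(desired) + comp.flush()


def apply_patch(current: bytes, patch: bytes) -> bytes:
    """Client side: decompress the patch against the same dictionary."""
    decomp = zlib.decompressobj(zdict=current)
    return decomp.decompress(patch) + decomp.flush()


def handle_request(have: set, need: set):
    """Server side. Assumes the client already holds a non-empty subset."""
    current = toy_subset(have)         # 1) compute current state
    covered = have | need              # server may choose a larger set
    desired = toy_subset(covered)      # 2) compute desired state
    patch = make_patch(current, desired)  # 3) compute patch
    return patch, covered              # 4) send patch to client


have = set(range(0x41, 0x51))          # client has A-P
need = {0x51, 0x52}                    # client needs Q and R
client_font = toy_subset(have)
patch, covered = handle_request(have, need)
client_font = apply_patch(client_font, patch)
print(len(patch), len(zlib.compress(client_font, 9)))
```

The client never needs to understand the font format to apply the patch; it
only needs the decompressor and the bytes it already holds.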

I built a proof of concept to let us begin to play around with this,
available at https://fonts.gstatic.com/experimental/incxfer_demo. The demo
allows you to add arbitrary text to an initial block and then transfers a
font patch using either VCDIFF or Brotli Shared Dictionary. To give
context, it also displays the size of a WOFF2 of the exact subset needed
(optimal known delivery strategy) and what Google Fonts would transfer
today. All subsetting is done with hb-subset, which doesn't support layout
yet (coming soon!).

It is my hope that this demonstrates that:

1) We can specify incremental transfer in a way that minimizes client
implementation difficulty.
    a. An HTTP interaction, no new protocol.
    b. A generic patch algorithm works fine.
2) The client can avoid needing changes when the font spec changes; it just
needs an implementation of the patch algorithm.
If the client keeps Brotli up to date then it'll have one.
3) We don't need the client to add any new libraries, just new versions of
existing ones.
The client does still need code changes to wire everything together. If
patching things doesn't fit the client's model, this may be a good chunk of
work.

One of the main things we could lose out on in the demo is the WOFF2 glyf
transformation. However, one could subset and then apply the WOFF2 glyf
transformation (for both current and desired). The client would receive the
patch, apply it, and then undo the transform.

Cheers, Rod S.

Received on Monday, 20 August 2018 20:49:39 UTC