RE: Incremental transfer subset + binary patch POC

Thank you Rod, this is really cool!
I can’t claim I fully understand [yet] the concept behind the Brotli Patch Mode, but the POC demo makes the benefits of incremental transfer quite obvious.

One possible additional selling point might be to calculate and show the cumulative size of all “optimal” woff2 subsets combined after each incremental content update. E.g., we start with the demo text and add “HELLO WORLD!” to it, followed by “Проверка” (“Testing” in Russian): GF today would end up sending two subsets (Latin 21.2 KB + Cyrillic 13.8 KB = 35 KB), incremental updates would produce three patches totaling 8 KB of font data (5.9 KB + 1.3 KB + 0.8 KB), while three “optimal” dynamic subsets would result in transferring 19.9 KB of data (5.9 KB + 6.7 KB + 7.3 KB).
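To make the comparison easy to rerun as the text changes, the running totals could be tallied with something like the sketch below. The sizes are just the figures quoted above; the labels and the `strategies` name are made up for illustration.

```python
# Sizes in KB, copied from the example in this message.
strategies = {
    "GF today (Latin + Cyrillic subsets)": [21.2, 13.8],
    "incremental patches": [5.9, 1.3, 0.8],
    "optimal woff2 subset per update": [5.9, 6.7, 7.3],
}

# Round to one decimal place to hide floating-point noise in the sums.
totals = {name: round(sum(sizes), 1) for name, sizes in strategies.items()}

for name, total in totals.items():
    print(f"{name}: {total} KB")
```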

Great work!
Thank you,
Vlad


From: Roderick Sheeter [mailto:rsheeter@google.com]
Sent: Monday, August 20, 2018 4:49 PM
To: WebFonts WG
Subject: Incremental transfer subset + binary patch POC

Good afternoon,

I have some good news for incremental transfer: the Google Compression team that brought us Brotli has toys that may help, specifically Brotli Shared Dictionary, and in the future, Brotli Patch Mode. These tools allow us to compute smaller patches than VCDIFF.

Specifically, Brotli allows us to use the current state as a dictionary, so the compressed target can refer back to the current state. To give a simplified example: instead of storing unchanged bytes, just store that you need N bytes from the dictionary at offset M. Patch mode will have enhancements to help compress things like identical offset shifts, as you might get when compiling code with added/removed functions or similar. This may also be a win for fonts.
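As a rough illustration of the "current state as dictionary" idea, here is a minimal sketch that uses zlib's preset dictionary (`zdict`) from the Python standard library as a stand-in. This is not Brotli Shared Dictionary: zlib only looks back 32 KiB into the dictionary, and the function names and sample data are invented for the example.

```python
import random
import zlib

def make_patch(current: bytes, desired: bytes) -> bytes:
    """Compress `desired`, letting the compressor refer back to `current`."""
    comp = zlib.compressobj(level=9, zdict=current)
    return comp.compress(desired) + comp.flush()

def apply_patch(current: bytes, patch: bytes) -> bytes:
    """Rebuild `desired` from the bytes already held plus the patch."""
    decomp = zlib.decompressobj(zdict=current)
    return decomp.decompress(patch) + decomp.flush()

# Made-up data: 4 KiB of pseudo-random "font" bytes plus a small addition.
rng = random.Random(0)
current = bytes(rng.getrandbits(8) for _ in range(4096))
desired = current + b"bytes for newly added glyphs"

patch = make_patch(current, desired)
standalone = zlib.compress(desired, 9)
print(len(patch), "byte patch vs", len(standalone), "bytes sent standalone")
```

Because the unchanged bytes are encoded as back-references into the dictionary, the patch only has to carry the new bytes plus a little bookkeeping.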

So, if the client tells us what codepoints it has and what it needs, then we can:

1) Compute current state.
Hb-subset is fast enough to plausibly do this "live", or precomputation could be used. We should permit the server to respond by patching to any set of codepoints it likes, not exactly what was requested.
2) Compute desired state.
3) Compute patch to get from current=>desired using a public standardized patch algorithm.
4) Send patch to client.
The client can then obtain the augmented font by applying the patch.
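The four steps above can be sketched end to end as follows. Everything here is a toy under stated assumptions: `TOY_FONT` and `toy_subset` are hypothetical stand-ins for a real font and hb-subset, zlib's preset dictionary stands in for the patch algorithm, and the client is assumed to already hold a non-empty subset.

```python
import zlib

# Hypothetical stand-in for a real font: codepoint -> opaque "glyph" bytes.
TOY_FONT = {cp: (chr(cp) * 40).encode("utf-8") for cp in range(0x20, 0x7F)}

def toy_subset(codepoints) -> bytes:
    """Stand-in for hb-subset: serialize only the requested codepoints."""
    return b"".join(
        cp.to_bytes(4, "big") + TOY_FONT[cp]
        for cp in sorted(set(codepoints))
        if cp in TOY_FONT
    )

def handle_request(has, needs) -> bytes:
    """Server side: the client reports what it has and what it needs."""
    current = toy_subset(has)                        # 1) compute current state
    desired = toy_subset(set(has) | set(needs))      # 2) compute desired state
    comp = zlib.compressobj(level=9, zdict=current)  # 3) compute patch
    return comp.compress(desired) + comp.flush()     # 4) send patch to client

def client_apply(current_font: bytes, patch: bytes) -> bytes:
    """Client side: apply the patch to the font bytes already held."""
    decomp = zlib.decompressobj(zdict=current_font)
    return decomp.decompress(patch) + decomp.flush()

has = [ord(c) for c in "Hello"]
needs = [ord(c) for c in "World!"]
client_font = toy_subset(has)
patch = handle_request(has, needs)
augmented = client_apply(client_font, patch)
print(len(client_font), "bytes held,", len(patch), "byte patch,",
      len(augmented), "bytes after patching")
```

Note that the client never parses the font in this flow; it only needs the patch algorithm, which is the point of items 1 and 2 in the list further down. A server that prefers to patch to a larger codepoint set than requested would only change step 2.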

I built a proof of concept to let us begin to play around with this, available at https://fonts.gstatic.com/experimental/incxfer_demo. The demo allows you to add arbitrary text to an initial block and then transfers a font patch using either VCDIFF or Brotli Shared Dictionary. To give context, it also displays the size of a WOFF2 of the exact subset needed (optimal known delivery strategy) and what Google Fonts would transfer today. All subsetting is done with hb-subset, which doesn't support layout yet (coming soon!).

It is my hope that this demonstrates that:

1) We can specify incremental transfer in a way that minimizes client implementation difficulty.
    a. An HTTP interaction, no new protocol.
    b. A generic patch algorithm works fine.
2) The client can avoid needing changes when the font spec changes; it just needs an implementation of the patch algorithm.
If the client keeps Brotli up to date then it'll have one.
3) We don't need the client to add any new libraries, just new versions of existing ones.
The client does still need code changes to wire everything together. If patching things doesn't fit the client's model, this may be a good chunk of work.

One of the main things we could lose out on in the demo is WOFF2 glyf transformation. However, one could subset then apply the woff2 glyf transformation (for both current and desired). The client would receive the patch, apply it, and then undo the transform.

Cheers, Rod S.

Received on Thursday, 23 August 2018 20:31:49 UTC