Re: Incremental transfer subset + binary patch POC from Roderick Sheeter on 2018-08-24 (public-webfonts-wg@w3.org from August 2018)

From: Roderick Sheeter <rsheeter@google.com>
Date: Fri, 24 Aug 2018 14:45:44 -0700
To: "Levantovsky, Vladimir" <Vladimir.Levantovsky@monotype.com>
Cc: WebFonts WG <public-webfonts-wg@w3.org>
Message-ID: <CABscrrGtiF06skSfaQH10MYAUQFAhhX_Z6LyLQ7nZFybaJwjmQ@mail.gmail.com>
WRT optimal, the way it is now is meant to show the absolute best we could
do: if we somehow knew a priori what content the user would view (e.g.
their page walk and contents thereof) we could cut a single "perfect"
subset that covers all that content. The demo is meant to show that a patch
series is a better approximation of that optimal than other current options.

To give another example, the finer we slice Korean or Japanese (
https://developers.googleblog.com/2018/04/google-fonts-launches-korean-support.html)
the closer we approximate what incremental transfer would do. Slicing doesn't
work for anything that makes heavy use of layout: most notably, if we cut
Arabic or Indic into a bunch of pieces it's utterly broken. Incremental
Transfer would "just work" for these cases.

Cheers, Rod S.

On Thu, Aug 23, 2018 at 1:31 PM Levantovsky, Vladimir <
Vladimir.Levantovsky@monotype.com> wrote:

> Thank you Rod, this is really cool!
>
> I can’t claim I fully understand [yet] the concept behind the Brotli Patch
> Mode, but the POC demo makes the benefits of incremental transfer quite
> obvious.
>
>
>
> One possible additional selling point might also be to calculate and show
> the cumulative size of all “optimal” woff2 subsets combined, after each
> incremental content update. E.g., we start with the demo text and add
> “HELLO WORLD!” to it, followed by “Проверка” (“Testing” in Russian) – GF
> today would end up sending two subsets (Latin 21.2 KB + Cyrillic 13.8 KB =
> 35 KB), incremental updates would produce three patches yielding (5.9 KB +
> 1.3 KB + 0.8 KB =) 8 KB of font data, while three “optimal” dynamic subsets
> would result in transferring (5.9 KB + 6.7 KB + 7.3 KB =) 19.9 KB of data.
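
The cumulative totals in the paragraph above can be re-derived with a few lines of Python. This is only a sketch: the sizes are copied from the figures quoted in the message, while the strategy labels and the helper name `cumulative_kb` are made up for illustration.

```python
# Sizes in KB, copied from the figures quoted in the message above.
strategies = {
    "unicode-range subsets (Latin + Cyrillic)": [21.2, 13.8],
    "incremental patches": [5.9, 1.3, 0.8],
    "re-sent optimal subsets": [5.9, 6.7, 7.3],
}


def cumulative_kb(sizes):
    """Total transferred data in KB, rounded to one decimal place."""
    return round(sum(sizes), 1)


totals = {name: cumulative_kb(sizes) for name, sizes in strategies.items()}

for name, total in totals.items():
    print(f"{name}: {total} KB")
```
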
>
>
>
> Great work!
>
> Thank you,
>
> Vlad
>
>
>
>
>
> *From:* Roderick Sheeter [mailto:rsheeter@google.com]
> *Sent:* Monday, August 20, 2018 4:49 PM
> *To:* WebFonts WG
> *Subject:* Incremental transfer subset + binary patch POC
>
>
>
> Good afternoon,
>
>
>
> I have some good news for incremental transfer: the Google Compression
> team that brought us Brotli has toys that may help, specifically Brotli
> Shared Dictionary, and in the future, Brotli Patch Mode. These tools allow
> us to compute smaller patches than VCDIFF.
>
>
>
> Specifically, Brotli allows us to use the current state as a dictionary.
> This allows the compressed target to refer to the current state. To give a
> simplified example, instead of storing unchanged bytes, just store that you
> need N bytes from the dictionary at offset M. Patch mode will have enhancements
> to help compress things like identical offset shifts as you might get when
> compiling code with added/removed functions or similar. This may also be a
> win for fonts.
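
To make the "N bytes from dictionary at offset M" idea concrete, here is a toy sketch in Python. It is not Brotli's actual format or API: the `("copy", offset, length)` / `("insert", data)` operations and the function names `make_patch` and `apply_patch` are assumptions made up for illustration, and the matching is done with the standard library's `difflib` rather than a real compressor.

```python
from difflib import SequenceMatcher


def make_patch(dictionary: bytes, target: bytes):
    """Encode target as copy-from-dictionary and literal-insert operations."""
    ops = []
    matcher = SequenceMatcher(None, dictionary, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Unchanged run: record only (offset M, length N) into the dictionary.
            ops.append(("copy", i1, i2 - i1))
        elif tag in ("replace", "insert"):
            # Bytes the dictionary does not supply must be shipped literally.
            ops.append(("insert", target[j1:j2]))
        # "delete" needs no op: those dictionary bytes are simply never copied.
    return ops


def apply_patch(dictionary: bytes, ops) -> bytes:
    """Rebuild the target from the dictionary (current state) plus the ops."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            out += dictionary[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)


# Placeholder byte strings standing in for the current and desired font data.
current = b"glyf:A glyf:B glyf:C "
desired = b"glyf:A glyf:B glyf:C glyf:D glyf:E "
patch = make_patch(current, desired)
restored = apply_patch(current, patch)
```

The bytes shared with the current state cost only a small copy instruction; only the new bytes travel in full.
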
>
>
>
> So, if the client tells us what codepoints it has and what it needs, then
> we can:
>
>
>
> 1) Compute current state.
>
> Hb-subset is fast enough to plausibly do this "live", or precomputation
> could be used. We should permit the server to respond by patching to any
> set of codepoints it likes, not exactly what was requested.
>
> 2) Compute desired state.
>
> 3) Compute patch to get from current=>desired using a public standardized
> patch algorithm.
>
> 4) Send patch to client.
>
> The client can then obtain the augmented font by applying the patch.
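
The four steps above can be sketched end to end in Python. Several things here are assumptions for illustration only: `toy_glyph` and `toy_subset` are hypothetical stand-ins for real font data and hb-subset, and zlib's preset dictionary (`zdict`) stands in for Brotli Shared Dictionary because it ships with the Python standard library. DEFLATE can only refer back 32 KB, so this shows the shape of the exchange and nothing about real patch sizes.

```python
import hashlib
import zlib


def toy_glyph(cp: int) -> bytes:
    # Hypothetical stand-in for glyph data: deterministic bytes per codepoint.
    return hashlib.sha256(cp.to_bytes(4, "big")).digest()


def toy_subset(codepoints) -> bytes:
    # Hypothetical stand-in for hb-subset: serialize glyphs in codepoint order.
    return b"".join(cp.to_bytes(4, "big") + toy_glyph(cp) for cp in sorted(codepoints))


def server_make_patch(has, needs) -> bytes:
    current = toy_subset(has)                    # 1) compute current state
    desired = toy_subset(set(has) | set(needs))  # 2) compute desired state
    # 3) compute the patch: compress desired with current as the dictionary
    comp = zlib.compressobj(zdict=current)
    return comp.compress(desired) + comp.flush()  # 4) bytes sent to the client


def client_apply_patch(current: bytes, patch: bytes) -> bytes:
    # The client's copy of the font must match the server's recomputed one exactly.
    decomp = zlib.decompressobj(zdict=current)
    return decomp.decompress(patch) + decomp.flush()


has = {ord(c) for c in "abc"}   # codepoints the client already has
needs = {ord(c) for c in "de"}  # codepoints the client asks for
client_font = toy_subset(has)
patch = server_make_patch(has, needs)
client_font = client_apply_patch(client_font, patch)
```

Because the server decides what the desired state is, it could just as well patch to a larger set of codepoints than was requested; the client applies whatever patch it receives.
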
>
>
>
> I built a proof of concept to let us begin to play around with this,
> available at <https://fonts.gstatic.com/experimental/incxfer_demo>.
> The demo allows you to add arbitrary text to an initial block and then
> transfers a font patch using either VCDIFF or Brotli Shared Dictionary. To
> give context, it also displays the size of a WOFF2 of the exact subset
> needed (optimal known delivery strategy) and what Google Fonts would
> transfer today. All subsetting is done with hb-subset, which doesn't
> support layout yet (coming soon!).
>
>
>
> It is my hope that this demonstrates that:
>
>
>
> 1) We can specify incremental transfer in a way that minimizes client
> implementation difficulty.
>
>     a. An HTTP interaction, no new protocol.
>
>     b. A generic patch algorithm works fine.
>
> 2) The client can avoid needing changes when the font spec changes; it
> just needs an implementation of the patch algorithm.
>
> If the client keeps Brotli up to date then it'll have one.
>
> 3) We don't need the client to add any new libraries, just new versions of
> existing ones.
>
> The client does still need code changes to wire everything together. If
> patching things doesn't fit the client's model, this may be a good chunk of
> work.
>
>
>
> One of the main things we could lose out on in the demo is WOFF2 glyf
> transformation. However, one could subset then apply the woff2 glyf
> transformation (for both current and desired). The client would receive the
> patch, apply it, and then undo the transform.
>
>
>
> Cheers, Rod S.
>
Received on Friday, 24 August 2018 21:46:24 UTC