Re: Incremental transfer subset + binary patch POC from Roderick Sheeter on 2018-08-27 (public-webfonts-wg@w3.org from August 2018)

From: Roderick Sheeter <rsheeter@google.com>
Date: Mon, 27 Aug 2018 09:47:11 -0700
To: "Levantovsky, Vladimir" <Vladimir.Levantovsky@monotype.com>
Cc: WebFonts WG <public-webfonts-wg@w3.org>
Message-ID: <CABscrrFJ2nn23P=PA_3aunfc47hOnQbMmGtPPuQhVvoqnEVhFA@mail.gmail.com>

Gotcha. I can add the series of perfect subsets you'd use absent knowledge
of the page walk.

On Mon, Aug 27, 2018 at 9:31 AM Levantovsky, Vladimir <
Vladimir.Levantovsky@monotype.com> wrote:

> I think we are both in agreement on what “optimal” means in this demo, and
> this is, in essence, how Monotype’s dynamic subsetting works for every
> content update.
>
> My point was that with three consecutive content updates (as is the case
> in the example I described in my previous email) we would have to produce
> three different “optimal” dynamic subsets. While they cumulatively end up
> transferring less data than the current GF solution, they still transfer
> more than the patch series, so the numbers [that showcase the benefits of
> incremental updates] would speak for themselves. (Right now, when an
> “optimal” subset is shown as smaller than the cumulative patches, it may
> not be obvious that incremental updates are still the better solution
> overall.)
>
> To show this, the demo would need to be updated to retain the “optimal”
> subset size for each content change and show the sizes as a cumulative
> transfer, similar to what you now show for the GF solution where e.g. the
> Latin and Cyrillic character sets are sent as two data blocks.
>
>
>
> Makes sense?
>
> Vlad
>
>
>
>
>
> *From:* Roderick Sheeter [mailto:rsheeter@google.com]
> *Sent:* Friday, August 24, 2018 5:46 PM
> *To:* Levantovsky, Vladimir
> *Cc:* WebFonts WG
> *Subject:* Re: Incremental transfer subset + binary patch POC
>
>
>
> WRT optimal, the way it is now is meant to show the absolute best we could
> do: if we somehow knew a priori what content the user would view (e.g.
> their page walk and contents thereof) we could cut a single "perfect"
> subset that covers all that content. The demo is meant to show that a patch
> series is a better approximation of that optimal than other current options.
>
>
>
> To give another example, the finer we slice Korean or Japanese (
> https://developers.googleblog.com/2018/04/google-fonts-launches-korean-support.html)
> the closer we approximate what incremental transfer would do. This doesn't
> work for anything that makes heavy use of layout: most notably, if we cut
> Arabic or Indic into a bunch of pieces, it's utterly broken. Incremental
> Transfer would "just work" for these cases.
>
>
>
> Cheers, Rod S.
>
>
>
> On Thu, Aug 23, 2018 at 1:31 PM Levantovsky, Vladimir <
> Vladimir.Levantovsky@monotype.com> wrote:
>
> Thank you Rod, this is really cool!
>
> I can’t claim I fully understand [yet] the concept behind the Brotli Patch
> Mode, but the POC demo makes the benefits of incremental transfer quite
> obvious.
>
>
>
> One possible additional selling point might also be to calculate and show
> the cumulative size of all “optimal” woff2 subsets combined, after each
> incremental content update. E.g., we start with the demo text and add
> “HELLO WORLD!” to it, followed by “Проверка” (“Testing” in Russian) – GF
> today would end up sending two subsets (Latin 21.2 KB + Cyrillic 13.8 KB =
> 35KB), incremental updates would produce three patches yielding (5.9KB +
> 1.3KB + 0.8KB =) 8KB of font data, while three “optimal” dynamic subsets
> would result in transferring (5.9KB + 6.7KB + 7.3KB =) 19.9 KB of data.
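
A minimal sketch of the arithmetic above, using only the sizes quoted in this message. The variable names are illustrative and not taken from the demo; the running totals show what a cumulative display per content update might look like.

```python
from itertools import accumulate

# Sizes in KB, copied from the example above (one entry per transfer).
transfers_kb = {
    "google_fonts": [21.2, 13.8],          # Latin + Cyrillic subsets
    "patches": [5.9, 1.3, 0.8],            # one patch per content update
    "optimal_subsets": [5.9, 6.7, 7.3],    # one fresh subset per update
}

# Running total after each transfer, rounded to 0.1 KB to hide float noise.
running_kb = {
    name: [round(total, 1) for total in accumulate(sizes)]
    for name, sizes in transfers_kb.items()
}

# The final cumulative transfer for each strategy is the last running total.
totals_kb = {name: running[-1] for name, running in running_kb.items()}

for name, running in running_kb.items():
    print(f"{name}: running={running} total={totals_kb[name]} KB")
```

The patch series ends up at roughly 8 KB against 19.9 KB for three successive "optimal" subsets, which is the comparison a cumulative display would make visible.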
>
>
>
> Great work!
>
> Thank you,
>
> Vlad
>
>
>
>
>
> *From:* Roderick Sheeter [mailto:rsheeter@google.com]
> *Sent:* Monday, August 20, 2018 4:49 PM
> *To:* WebFonts WG
> *Subject:* Incremental transfer subset + binary patch POC
>
>
>
> Good afternoon,
>
>
>
> I have some good news for incremental transfer: the Google Compression
> team that brought us Brotli has toys that may help, specifically Brotli
> Shared Dictionary, and in the future, Brotli Patch Mode. These tools allow
> us to compute smaller patches than VCDIFF.
>
>
>
> Specifically, Brotli allows us to use the current state as a dictionary.
> This allows the compressed target to refer to the current state. To give a
> simplified example, instead of storing unchanged bytes, just store that you
> need N bytes from the dictionary at offset M. Patch mode will have
> enhancements to help compress things like identical offset shifts, as you
> might get when compiling code with added/removed functions or similar. This
> may also be a win for fonts.
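
To make the "current state as a dictionary" idea concrete, here is a rough sketch. It does not use Brotli Shared Dictionary; as a stand-in it uses the preset-dictionary support in Python's zlib module (the `zdict` argument), which works on the same principle but is limited by DEFLATE's 32 KiB window. The `make_patch` and `apply_patch` names and the pseudo-random "font" bytes are assumptions for illustration only.

```python
import hashlib
import zlib


def make_patch(current: bytes, desired: bytes) -> bytes:
    """Compress `desired` with `current` preloaded as the dictionary."""
    comp = zlib.compressobj(level=9, zdict=current)
    return comp.compress(desired) + comp.flush()


def apply_patch(current: bytes, patch: bytes) -> bytes:
    """Rebuild `desired`; the client must hold the same `current` bytes."""
    decomp = zlib.decompressobj(zdict=current)
    return decomp.decompress(patch) + decomp.flush()


# Hypothetical stand-in for font data: 3200 bytes that do not compress well
# on their own, so any saving has to come from references to the dictionary.
current = b"".join(hashlib.sha256(bytes([i])).digest() for i in range(100))

# The desired state keeps every old byte and adds a little new data mid-file.
desired = current[:1600] + b"bytes for newly added glyphs" + current[1600:]

patch = make_patch(current, desired)
standalone = zlib.compress(desired, 9)  # same target without a dictionary

print(len(patch), "bytes as a patch vs", len(standalone), "bytes standalone")
```

Unchanged runs are encoded as back-references into the dictionary rather than stored again, so the patch is much smaller than compressing the desired state from scratch.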
>
>
>
> So, if the client tells us what codepoints it has and what it needs, then
> we can:
>
>
>
> 1) Compute current state.
>
> Hb-subset is fast enough to plausibly do this "live", or precomputation
> could be used. We should permit the server to respond by patching to any
> set of codepoints it likes, not exactly what was requested.
>
> 2) Compute desired state.
>
> 3) Compute patch to get from current=>desired using a public standardized
> patch algorithm.
>
> 4) Send patch to client.
>
> The client can then obtain the augmented font by applying the patch.
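
The four steps above could be sketched roughly as follows. Everything here is a toy under stated assumptions: `TOY_FONT`, `subset`, `handle_request` and `apply_patch` are hypothetical names, the "subsetter" just concatenates per-codepoint records rather than calling hb-subset, and zlib's preset dictionary stands in for the standardized patch algorithm. The sketch also assumes the client already holds a non-empty subset.

```python
import zlib

# Hypothetical toy "font": codepoint -> glyph bytes (Latin through Cyrillic).
TOY_FONT = {cp: bytes([cp % 251]) * 32 for cp in range(0x20, 0x500)}


def subset(codepoints):
    """Toy subsetter: deterministic serialization of the requested glyphs."""
    return b"".join(
        cp.to_bytes(4, "big") + TOY_FONT[cp]
        for cp in sorted(codepoints)
        if cp in TOY_FONT
    )


def handle_request(have, need):
    """Server side: return the codepoint set patched to, plus the patch."""
    current = subset(have)                      # 1) compute current state
    target = set(have) | set(need)              # server may widen this set
    desired = subset(target)                    # 2) compute desired state
    comp = zlib.compressobj(level=9, zdict=current)
    patch = comp.compress(desired) + comp.flush()   # 3) current => desired
    return target, patch                        # 4) send patch to client


def apply_patch(current, patch):
    """Client side: apply the patch to the bytes it already holds."""
    decomp = zlib.decompressobj(zdict=current)
    return decomp.decompress(patch) + decomp.flush()


have = {ord(c) for c in "HELLO WORLD!"}
need = {ord(c) for c in "Проверка"}
target, patch = handle_request(have, need)
augmented = apply_patch(subset(have), patch)
```

Because both sides derive the current state from the same codepoint set, the client needs nothing font-specific to apply the patch, only the generic patch algorithm.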
>
>
>
> I built a proof of concept to let us begin to play around with this,
> available at https://fonts.gstatic.com/experimental/incxfer_demo.
> The demo allows you to add arbitrary text to an initial block and then
> transfers a font patch using either VCDIFF or Brotli Shared Dictionary. To
> give context, it also displays the size of a WOFF2 of the exact subset
> needed (optimal known delivery strategy) and what Google Fonts would
> transfer today. All subsetting is done with hb-subset, which doesn't
> support layout yet (coming soon!).
>
>
>
> It is my hope that this demonstrates that:
>
>
>
> 1) We can specify incremental transfer in a way that minimizes client
> implementation difficulty.
>
>     a. An HTTP interaction, no new protocol.
>
>     b. A generic patch algorithm works fine.
>
> 2) The client can avoid needing changes when the font spec changes; it
> just needs an implementation of the patch algorithm.
>
> If the client keeps Brotli up to date then it'll have one.
>
> 3) We don't need the client to add any new libraries, just new versions of
> existing ones.
>
> The client does still need code changes to wire everything together. If
> patching things doesn't fit the client's model, this may be a good chunk of
> work.
>
>
>
> One of the main things we could lose out on in the demo is WOFF2 glyf
> transformation. However, one could subset then apply the woff2 glyf
> transformation (for both current and desired). The client would receive the
> patch, apply it, and then undo the transform.
>
>
>
> Cheers, Rod S.
>
>
Received on Monday, 27 August 2018 16:47:52 UTC