Re: Glyph Closure Scaling

Thanks for running this analysis over the Google Fonts collection. A few
thoughts:

   - Makes sense to me that it's mostly Arabic/Indic fonts that appear to
   have poor scaling.
   - You did your analysis by looking at glyph counts. I wonder if the
   results would change much if it instead counted total glyph bytes in the
   closure.
   - For the closure did you use the default layout feature selection, or
   --layout_features=*?
   - Something I think is missing from this analysis is a comparison to
   what would happen if we instead requested just the glyphs we needed,
   as determined by shaping. That analysis could look something
   like this:
      - Get sample text that uses codepoints from a font.
      - Compute the glyph closure on those codepoints.
      - Compute the exact set of glyphs needed for that specific sequence
      of text (should be smaller than the full closure).
      - Compare the difference between those two. That delta represents the
      unnecessary glyph data that would be sent if we requested codepoints vs
      glyph ids.
      - I suspect for things like Arabic and Indic, that using either
      method you'll end up with a majority of the font for any medium to large
      size piece of text.
   - For Indic, even if we don't get much benefit from PFE within the
   individual scripts, it's still valuable to use. We have several families
   for which we have to serve each script as a separate family (for example:
   https://fonts.google.com/?query=baloo). Ideally we'd like to serve this
   as a single font which contains all the scripts and is progressively
   enriched so that users need only download the data for the script(s) they
   need. Usual techniques like unicode-range won't work since there are shared
   codepoints between the scripts and you can get broken rendering if scripts
   are mixed (unicode-range selects the shared codepoint from the wrong file).
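The comparison proposed in the fourth bullet above could be sketched roughly
as follows. This is only an illustration under stated assumptions: the
function and class names (closure_overhead, Overhead) are hypothetical, the
glyph IDs and byte sizes in the example are made-up toy values rather than
data from any real font, and it assumes the closure glyph set, the shaped
glyph set, and a per-glyph byte size have already been obtained by some other
means (for example from a subsetter and a shaper). It reports the delta both
in glyph counts and in glyph bytes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Overhead:
    """Delta between a codepoint-based closure and the shaped glyph set."""
    closure_glyphs: int
    shaped_glyphs: int
    extra_glyphs: int
    closure_bytes: int
    shaped_bytes: int
    extra_bytes: int

    @property
    def extra_byte_fraction(self) -> float:
        # Share of the closure's glyph bytes that shaping never used.
        if self.closure_bytes == 0:
            return 0.0
        return self.extra_bytes / self.closure_bytes


def closure_overhead(closure_gids, shaped_gids, glyph_bytes) -> Overhead:
    """Compare the glyph closure against the glyphs shaping actually used.

    closure_gids: glyph IDs in the closure of the text's codepoints.
    shaped_gids:  glyph IDs the shaper emitted for the sample text.
    glyph_bytes:  mapping from glyph ID to that glyph's outline size in bytes.
    """
    closure = set(closure_gids)
    shaped = set(shaped_gids)

    # Every shaped glyph should be reachable from the text's codepoints,
    # so it should also appear in the closure.
    missing = shaped - closure
    if missing:
        raise ValueError(f"shaped glyphs not in closure: {sorted(missing)}")

    def total_bytes(gids):
        return sum(glyph_bytes[g] for g in gids)

    extra = closure - shaped
    return Overhead(
        closure_glyphs=len(closure),
        shaped_glyphs=len(shaped),
        extra_glyphs=len(extra),
        closure_bytes=total_bytes(closure),
        shaped_bytes=total_bytes(shaped),
        extra_bytes=total_bytes(extra),
    )


if __name__ == "__main__":
    # Toy values, purely illustrative.
    sizes = {0: 10, 1: 120, 2: 300, 3: 80, 4: 250, 5: 40}
    result = closure_overhead({0, 1, 2, 3, 4, 5}, [0, 1, 3, 1], sizes)
    print(result)
    print(f"unused share of closure bytes: {result.extra_byte_fraction:.1%}")
```

Running something like this per font over a set of sample texts would show
whether the byte-based delta tells a different story than the count-based one.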


On Wed, Aug 7, 2019 at 12:25 PM Myles C. Maxfield <mmaxfield@apple.com>
wrote:

>
>
> On Aug 7, 2019, at 12:03 PM, Myles C. Maxfield <mmaxfield@apple.com>
> wrote:
>
> Here’s the data from Google Fonts:
>
> <Screen Shot 2019-08-06 at 10.24.13 PM.png>
>
> <Screen Shot 2019-08-06 at 10.03.43 PM.png>
>
> <Screen Shot 2019-08-06 at 10.05.23 PM.png>
>
>
> The print is kind of small, but the X axis is pretty interesting.
>
> Here’s the same chart, but zoomed in so that the right edge of the graph
> is at 1 megabyte (which would still be a sizable webfont)
>
>
> Things look significantly worse.
>
>
> Looks a bit more difficult than the Windows fonts.
>
> NTR-Regular.ttf                  Telugu             78.736248482
> Lohit-Bengali.ttf                Bengali            78.831622569
> TenaliRamakrishna-Regular.ttf    Telugu, Latin      81.910529391
> Peddana-Regular.ttf              Telugu, Latin      82.017367
> Ramaraja-Regular.ttf             Telugu, Latin      82.060643793
> Ponnala-Regular.ttf              Telugu             82.925248156
> Sitara-Regular.ttf               Devanagari, Latin  83.176943573
> Sitara-Bold.ttf                  Devanagari, Latin  83.176943573
> Sitara-BoldItalic.ttf            Devanagari, Latin  83.186576710
> Sitara-Italic.ttf                Devanagari, Latin  83.186576710
> Amiri-Italic.ttf                 Arabic             83.235126624
> Amiri-BoldItalic.ttf             Arabic             83.304386938
> Amiri-Regular.ttf                Arabic             83.363365799
> Amiri-Bold.ttf                   Arabic             83.39951432
> SreeKrushnadevaraya-Regular.ttf  Telugu             85.420147454
> Suranna-Regular.ttf              Telugu             85.4847986935
> Taprom.ttf                       Khmer              85.498475514
> Angkor-Regular.ttf               Khmer              85.498475514
> Timmana-Regular.ttf              Telugu             85.7927372355
> Chathura-ExtraBold.ttf           Telugu, Latin      86.099921648
> Chathura-Regular.ttf             Telugu, Latin      86.099921648
> Chathura-Bold.ttf                Telugu, Latin      86.099921648
> Chathura-Thin.ttf                Telugu, Latin      86.099921648
> Chathura-Light.ttf               Telugu, Latin      86.099921648
> Bokor-Regular.ttf                Khmer              86.153956081
> Moul.ttf                         Khmer              86.153956081
> Siemreap.ttf                     Khmer              86.153956081
> Dangrek.ttf                      Khmer              86.153956081
> Metal.ttf                        Khmer              86.153956081
> Moulpali.ttf                     Khmer              86.153956081
> Content-Bold.ttf                 Khmer              86.153956081
> Content-Regular.ttf              Khmer              86.153956081
> Freehand.ttf                     Khmer              86.153956081
> Siemreap.ttf                     Khmer              86.153956081
> Koulen.ttf                       Khmer              86.153956081
> Preahvihear.ttf                  Khmer              86.272334086
> Bayon-Regular.ttf                Khmer              86.272334086
> Chenla.ttf                       Khmer              86.272334086
> OdorMeanChey.ttf                 Khmer              86.272334086
> Mallanna-Regular.ttf             Telugu             86.441653183
> Mandali-Regular.ttf              Telugu             86.442915173
> Dhurjati-Regular.ttf             Telugu             86.442915173
> Ramabhadra-Regular.ttf           Telugu             86.443724933
>
> On Aug 6, 2019, at 11:58 AM, Levantovsky, Vladimir <
> Vladimir.Levantovsky@monotype.com> wrote:
>
> For a glyphID-based model - the first request could simply be the "whole
> font file" with glyph data zeroed out (which compresses to almost nothing).
> The subsequent request would patch that with the glyphs that are actually
> in use.
>
> -----Original Message-----
> From: mmaxfield@apple.com <mmaxfield@apple.com>
> Sent: Tuesday, August 6, 2019 12:31 PM
> To: Jonathan Kew <jfkthame@gmail.com>
> Cc: public-webfonts-wg@w3.org
> Subject: Re: Glyph Closure Scaling
>
>
>
> On Aug 6, 2019, at 2:34 AM, Jonathan Kew <jfkthame@gmail.com> wrote:
>
> On 05/08/2019 22:03, Myles C. Maxfield wrote:
>
> I was envisioning the range request model would send an early request for
> everything in the font other than the outlines. Percentage-wise, this works
> great for big fonts.
>
>
> That's still two separate requests, isn't it? The client needs to make one
> request to get the font header (which it can assume fits within a
> predetermined reasonable max size); that will tell it how much it needs to
> request in order to get everything up to the outlines.
>
>
> I was envisioning the early request wouldn’t be a range request. Instead,
> it would be a regular request for the whole file, and the browser would
> parse the bytes as they arrive, and close the connection (or stop
> requesting or whatever) when the glyph data is reached inside the file.
> This only makes sense if the glyph data is all at the end of the file.
>
> This idea is based loosely on how <video> streaming works, so I should
> investigate how they solve this particular problem. That being said, I
> don’t think this approach has to work in any one particular way. We can
> (and even should!) try a bunch of different related strategies and see
> which one works the best in practice.
>
>
> So there are two complete round-trips to the server *before* it can begin
> to shape text and determine what glyph ranges it needs to request.
>
> On a sufficiently low-latency connection that might be fine, but I'm
> concerned that it could amount to many milliseconds in plenty of real-world
> cases.
>
> JK
>

Received on Thursday, 8 August 2019 21:58:46 UTC