Re: Average Font Size in the Simulation

> On Oct 7, 2020, at 5:28 AM, Chris Lilley <chris@w3.org> wrote:
> 
> 
> On 2020-10-07 08:30, Myles C. Maxfield wrote:
>>> On Oct 6, 2020, at 6:15 PM, Garret Rieger <grieger@google.com> wrote:
>>> 
>>> During the last call there was a question about what the typical font size was for each language grouping in the simulation. I gathered up the sizes of the fonts used in each language grouping and computed the average file size (as a woff2)
>> Are the fonts actually woff2 files?
> 
> My understanding is that woff2 of the whole font was the 100% baseline against which any savings (or losses!) from PFE were measured. If I have that wrong, please correct me.

It looks like all the fonts in the Google Fonts git repo <https://github.com/google/fonts> are .ttf files, not woff2 files.

Since we’re using “whole font” as the baseline instead of “unicode range,” it stands to reason we’re gathering evidence relative to what most normal websites are doing today: e.g. throwing a font up on a server and referencing it from a CSS file. I don’t think 100% of web fonts in use are woff2 files.

If we also wanted to gather metrics about how well we’re doing compared to what Google Fonts does today (i.e. unicode-range subsetting, with every font sent as WOFF2) then that would be interesting in addition to what we’re gathering now, but not instead of it.

> 
> My understanding is also that for patch subset, the initial font download is a woff2 (thus, using Brotli) and that binary patches sent subsequently are also Brotli compressed and make use of the compression dictionary for the initial font download.
> 
> And for glyph byterange, the initial download is a woff2

The initial download is truetype or opentype, not WOFF2. It probably could/will be WOFF2 whenever this thing gets finalized and available to website authors, but more research is required on how to make range requests work with compression. I’ve already started this research, and I’m confident that it’s not too difficult (Brotli already has the concept of independent blocks inside it).

> which is halted once the start of the glyf table (or the charstrings part of the CFF table) is detected; and that the byteranges specified over HTTP relate to the uncompressed font (because HTTP compression is transparent).

Yes, the uncompressed truetype / opentype font, not the uncompressed woff2 font.
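
To make "byteranges relative to the uncompressed font" concrete, here is a minimal sketch, not the actual client code. It reads the sfnt table directory of an uncompressed truetype / opentype file to find where a table such as glyf starts, and forms an HTTP Range header value from an offset and length in that file. The function names find_table and range_header are hypothetical, chosen only for illustration.

```python
import struct


def find_table(font: bytes, tag: bytes):
    """Return (offset, length) of table `tag` in an uncompressed sfnt font.

    Hypothetical helper for illustration; offsets are relative to the
    start of the uncompressed file, which is what the byte ranges refer to.
    """
    # sfnt header: uint32 version, uint16 numTables, then three uint16
    # binary-search fields (12 bytes in total).
    num_tables = struct.unpack_from(">H", font, 4)[0]
    for i in range(num_tables):
        # Each 16-byte table record: tag, checksum, offset, length.
        rec_tag, _checksum, offset, length = struct.unpack_from(
            ">4sIII", font, 12 + 16 * i
        )
        if rec_tag == tag:
            return offset, length
    raise KeyError(tag)


def range_header(start: int, length: int) -> str:
    """HTTP Range header value for `length` bytes starting at `start`.

    The end position in a Range header is inclusive.
    """
    return f"bytes={start}-{start + length - 1}"
```

A client following this sketch would stop the initial download once it has read up to the offset that find_table(font, b"glyf") reports, and would request glyph data afterwards with range_header values computed against the same uncompressed file.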

> And the byterange downloads do not benefit from the Brotli compression dictionary but each one can use (a separate) Brotli compression step (or zlib compression step) as determined by the server HTTP setup. Yes?

Yes.
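
A rough illustration of the difference, using zlib from the Python standard library as a stand-in (an assumption for the sake of a runnable example; the patch subset case uses Brotli, and zlib's preset dictionary is only an analogy for compressing against the initial download). One function compresses each range on its own, as a per-response HTTP compression step would; the other compresses against data the receiver already holds. All function names here are hypothetical.

```python
import zlib


def compress_independently(chunk: bytes) -> bytes:
    """Compress one byte range by itself, with no shared state."""
    return zlib.compress(chunk, 9)


def compress_with_dictionary(chunk: bytes, dictionary: bytes) -> bytes:
    """Compress a chunk against a preset dictionary.

    The dictionary stands in for data the receiver already has,
    for example an initial font download.
    """
    compressor = zlib.compressobj(level=9, zdict=dictionary)
    return compressor.compress(chunk) + compressor.flush()


def decompress_with_dictionary(blob: bytes, dictionary: bytes) -> bytes:
    """Inverse of compress_with_dictionary; needs the same dictionary."""
    decompressor = zlib.decompressobj(zdict=dictionary)
    return decompressor.decompress(blob) + decompressor.flush()
```

When the chunk shares content with the dictionary, the dictionary-based output is typically smaller, which is the benefit the independently compressed byterange responses do not get.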

> 
>>> weighted by the number of occurrences of each font within that group. Here are the results:
>>> 
>>> Latin, Cyrillic, Greek, and Thai: 31 kb
>>> Arabic and Indic: 47 kb
>>> CJK: 1002 kb
>> It would be great if we could get a breakdown with box-and-whiskers plots rather than just averages.
> Yes, I think that would be a lot more helpful.
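
For reference, a minimal sketch of both computations, assuming each language grouping is available as a list of per-font file sizes plus a parallel list of occurrence counts (that data layout is an assumption; the simulation's actual format is not described in this thread). It computes the occurrence-weighted average and the five numbers a box-and-whiskers plot is drawn from.

```python
from statistics import quantiles


def weighted_mean(sizes, counts):
    """Average file size, weighted by how often each font occurs."""
    total = sum(counts)
    return sum(s * c for s, c in zip(sizes, counts)) / total


def five_number_summary(sizes, counts):
    """Return (min, Q1, median, Q3, max) over occurrence-weighted sizes."""
    # Repeat each size once per occurrence so the quartiles are weighted
    # the same way as the average.
    expanded = sorted(s for s, c in zip(sizes, counts) for _ in range(c))
    q1, median, q3 = quantiles(expanded, n=4, method="inclusive")
    return expanded[0], q1, median, q3, expanded[-1]
```

Expanding by count is simple but uses memory proportional to the total number of occurrences; for very large counts a cumulative-weight approach would be a better fit.
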
> 
> -- 
> Chris Lilley
> @svgeesus
> Technical Director @ W3C
> W3C Strategy Team, Core Web Design
> W3C Architecture & Technology Team, Core Web & Media
> 
> 

Received on Wednesday, 7 October 2020 20:50:38 UTC