- From: Roderick Sheeter <rsheeter@google.com>
- Date: Tue, 29 Sep 2020 15:59:00 -0700
- To: Garret Rieger <grieger@google.com>
- Cc: Chris Lilley <chris@w3.org>, "w3c-webfonts-wg (public-webfonts-wg@w3.org)" <public-webfonts-wg@w3.org>
- Message-ID: <CABscrrEopXA2-_gO1k0-GiuyLDjxj0JhNNKAQM6ua9togYJz6g@mail.gmail.com>
Maybe this is too speculative for an eval report, but if we manage to get
32-bit glyph IDs into a future rev of the font format then PFE can deliver a
true pan-Unicode font. Noto in one font!

On Thu, Sep 17, 2020 at 6:30 PM Garret Rieger <grieger@google.com> wrote:

> Thanks, this is looking good so far. Just a couple of thoughts I had:
>
> It might be worth mentioning pan-Unicode fonts (such as Noto) as a use
> case which is not currently well supported by existing font transfer
> methods. To date, for web usage we have to deliver Noto as a bunch of
> separate families and leave it up to the developer to explicitly pick
> which ones they might need. For many types of applications it is
> difficult to know ahead of time what languages may show up in the
> content, so making that choice is hard. For example:
>
> - A forum where users may be posting in many different languages.
> - A mapping application which will need to render text in a wide array
>   of scripts depending on what part of the world you're viewing.
>
> PFE can enable the easy and efficient use of a pan-Unicode font,
> something that's not possible today.
>
> Section 2.7: this isn't filled out yet, but here are some examples that
> we run into on Google Fonts that could be used to demonstrate issues in
> trying to subset fonts to improve performance:
>
> - With Indic scripts there are some shared characters between the
>   scripts. If you have a font which supports two or more of these
>   scripts and want to present it as a single family and use
>   unicode-range to deliver each script in its own subset, you run into
>   trouble. The shared characters need to be duplicated in each subset.
>   However, the way unicode-range works is that a shared character will
>   be rendered from only one of the subsets, based on the priority of the
>   ranges. This can result in the shared character being rendered from a
>   different subset than the surrounding characters. As a result shaping
>   doesn't work correctly and you end up with poor rendering. We've had
>   to work around this problem by releasing Indic scripts as separate
>   families, which is non-optimal for end users (for example see:
>   https://fonts.google.com/?query=Baloo)
> - Another example of this problem is with Latin punctuation in Latin
>   and Cyrillic fonts. Say we have a single family with Latin and
>   Cyrillic characters. We want to have Cyrillic and Latin in their own
>   subsets so we don't waste bytes downloading Cyrillic on Latin-only
>   pages and vice versa. However, if you have Cyrillic text that uses the
>   period "." from the Latin subset, then kerning rules between the
>   Cyrillic characters and the "." no longer work. Not quite as
>   disastrous as the Indic example, but it still results in imperfect
>   rendering.
>
> Section 3.4
>
> Not sure what level of detail you want to go into on the specifics of
> the byterange approach, but there are a couple of points that might be
> worth mentioning:
>
> - For byte range to work, fonts must be preprocessed to flatten
>   composite glyphs into the resulting outlines, and the CFF table must
>   be desubroutinized. Also, the glyf/CFF table needs to be moved to the
>   end of the font if it's not already there.
> - Another source of efficiency loss is from being unable to leverage
>   compression across requests. In a single woff2 font file, redundant
>   data between glyphs compresses out. If under byterange those glyphs
>   are transferred in separate requests, the redundant data is
>   retransmitted.
>
> Section 3.9
>
> - We have a defined wire protocol for Subset and Patch
>   (https://docs.google.com/document/d/1DJ6VkUEZS2kvYZemIoX4fjCgqFXAtlRjpMlkOvblSts/edit)
>   and that protocol is used during the simulations so that we're
>   correctly accounting for protocol overhead, which can be substantial
>   in some cases, for example CJK augmentation requests that need to
>   specify a large number of codepoints. However, as noted in the doc,
>   this protocol is only meant to be a stand-in for size estimation. We
>   will want to rewrite the protocol for standardization.
>
> On Mon, Sep 14, 2020 at 8:48 AM Chris Lilley <chris@w3.org> wrote:
>
>> Hi folks,
>>
>> I just committed a first draft of the Evaluation Report.
>>
>> I mainly concentrated on easy-to-understand introductory material,
>> explaining the problem to be solved and why it is important, bearing
>> in mind that the primary audience is not necessarily familiar with
>> fonts in general.
>>
>> Also, as a group we do not have a final analysis or any conclusions
>> decided, so those sections are simply blank.
>>
>> But it gives an outline of how the report could look, so we can
>> discuss the overall structure and approach at least.
>>
>> https://w3c.github.io/PFE-analysis/report/evaluation-report.html
>>
>> Reading over it just before the call, I think it actually needs a
>> whole new section that explains what OpenType is, sfnt table
>> structure, and how rendering a single glyph can depend on data
>> scattered over several tables. But I wanted to discuss that on the
>> call before starting to add it. I'm thinking again of an introductory
>> and probably diagram-heavy exposition. And this will explain in turn
>> why the byterange approach has to concentrate on a single table rather
>> than lots of little tables.
>>
>> --
>> Chris Lilley
>> @svgeesus
>> Technical Director @ W3C
>> W3C Strategy Team, Core Web Design
>> W3C Architecture & Technology Team, Core Web & Media
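
Illustrative sketch (not from the thread) of the Section 2.7 shared-character
problem quoted above. It is a toy model, not how any browser is implemented:
the subset names, the ranges, and the "first listed subset wins" priority
rule are all assumptions made for the example. The danda (U+0964) stands in
for a character shared between Indic scripts; it lives in the Devanagari
block, and the hypothetical Gujarati subset duplicates it.

```python
from itertools import groupby

# Hypothetical subsets of a single family, listed in priority order.
# Assumption: the first subset whose ranges cover a codepoint wins.
DANDA = 0x0964  # punctuation in the Devanagari block, shared across scripts

SUBSETS = [
    ("devanagari", [(0x0900, 0x097F)]),
    ("gujarati", [(0x0A80, 0x0AFF), (DANDA, DANDA)]),  # danda duplicated here
]

# Gujarati word followed by a danda.
SAMPLE = "\u0a97\u0ac1\u0a9c\u0ab0\u0abe\u0aa4\u0ac0\u0964"


def pick_subset(codepoint):
    """Return the name of the highest-priority subset covering codepoint."""
    for name, ranges in SUBSETS:
        if any(lo <= codepoint <= hi for lo, hi in ranges):
            return name
    return None


def font_runs(text):
    """Split text into consecutive runs that resolve to the same subset."""
    keyed = groupby(text, key=lambda ch: pick_subset(ord(ch)))
    return [(name, "".join(chars)) for name, chars in keyed]


if __name__ == "__main__":
    for name, run in font_runs(SAMPLE):
        print(name, [f"U+{ord(ch):04X}" for ch in run])
```

Even though the Gujarati subset carries its own copy of the danda, the
higher-priority Devanagari subset claims it, so the text is split into two
runs drawn from different subsets. Shaping and kerning rules cannot apply
across that boundary, which is the failure described in the quoted message.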
Received on Tuesday, 29 September 2020 22:59:25 UTC