RE: PFE challenges to consider

From a pragmatic point of view (and I mean that in the best possible sense, because I like the pragmatic approach in general) it may well be a good solution, but we also need to be conscious of the potentially significant redundancy it entails. For example, some fonts support multiple stylistic sets (if I remember correctly, the font “Gabriella” has eight of them) and, realistically, only one would be used on a page, so the built-in redundancy of an indiscriminate “glyph closure” could be very significant.
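To put a rough number on that redundancy, one could compare the closure computed over a subsetter's default feature list with a closure over every feature in the font. A minimal sketch with fontTools, assuming a hypothetical local copy "Gabriella.ttf" and an arbitrary sample string (a rough probe, not a definitive measurement):

    from fontTools import subset
    from fontTools.ttLib import TTFont

    def closure_size(path, text, features):
        # Subset a fresh copy of the font to the glyph closure of `text`
        # over `features` and report how many glyphs are retained.
        font = TTFont(path)
        options = subset.Options()
        options.layout_features = features
        subsetter = subset.Subsetter(options=options)
        subsetter.populate(text=text)
        subsetter.subset(font)
        return len(font.getGlyphOrder())

    sample = "The quick brown fox jumps over the lazy dog"
    default_size = closure_size("Gabriella.ttf", sample, subset.Options().layout_features)
    full_size = closure_size("Gabriella.ttf", sample, ["*"])
    # The difference is largely the cost of closing over discretionary features
    # such as ss01..ss08, most of which a given page will never activate.
    print(default_size, full_size)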


From: Garret Rieger <grieger@google.com>
Sent: Wednesday, July 24, 2019 1:43 PM
To: Levantovsky, Vladimir <Vladimir.Levantovsky@monotype.com>
Cc: mmaxfield@apple.com; w3c-webfonts-wg (public-webfonts-wg@w3.org) <public-webfonts-wg@w3.org>
Subject: Re: PFE challenges to consider

Currently font subsetters handle this problem by computing a "glyph closure". This finds all possible glyphs that are reachable from a set of starting glyphs (derived from the input code points) by the application of any sequence of layout features in the font. All glyphs in the closure are retained in the produced subset. Since the subset and patch transfer method uses a font subsetter underneath, the data sent back to the client will include all glyphs that may be needed to render a particular set of code points, regardless of what layout features are activated client side.
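For illustration, here is a minimal sketch of that behaviour using the fontTools subsetter (the font path and input text are hypothetical); closing over all layout features keeps every glyph reachable from the input code points, whether or not the client ever activates the corresponding features:

    from fontTools import subset
    from fontTools.ttLib import TTFont

    font = TTFont("MyFont.ttf")             # hypothetical font file

    options = subset.Options()
    options.layout_features = ["*"]         # close over every layout feature in the font

    subsetter = subset.Subsetter(options=options)
    subsetter.populate(text="final offer")  # code points derived from the page content
    subsetter.subset(font)

    # The retained glyphs are the "glyph closure": base glyphs for the input
    # characters plus everything reachable through GSUB (ligatures such as "fi",
    # small caps, alternates), regardless of which features are activated client side.
    print(sorted(font.getGlyphOrder()))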

On Wed, Jul 24, 2019 at 10:40 AM Levantovsky, Vladimir <Vladimir.Levantovsky@monotype.com<mailto:Vladimir.Levantovsky@monotype.com>> wrote:
Hi Myles, all,

Please see inline.

From: mmaxfield@apple.com<mailto:mmaxfield@apple.com> <mmaxfield@apple.com<mailto:mmaxfield@apple.com>>
Sent: Tuesday, July 23, 2019 5:24 PM
To: Levantovsky, Vladimir <Vladimir.Levantovsky@monotype.com<mailto:Vladimir.Levantovsky@monotype.com>>
Cc: w3c-webfonts-wg (public-webfonts-wg@w3.org<mailto:public-webfonts-wg@w3.org>) <public-webfonts-wg@w3.org<mailto:public-webfonts-wg@w3.org>>
Subject: Re: PFE challenges to consider



On Jul 23, 2019, at 1:58 PM, Levantovsky, Vladimir <Vladimir.Levantovsky@monotype.com<mailto:Vladimir.Levantovsky@monotype.com>> wrote:

Folks,

I’ve been chatting with one of my colleagues (who is an expert in complex scripts) about our progressive font enrichment project, primarily to figure out what fonts we’d need to use as part of our test set for the analysis framework. As I explained to him the two different approaches we are currently considering, and the overall goals of this project, he made a casual remark saying “for best results and highest level of efficiency – make sure you are subsetting the font to the output glyph set, and not just based on input data”.

Can you explain this a bit more thoroughly? What does he mean by “input” and “output”?

<VL> The input is the textual content that is part of the page – character strings, character combinations/sequences, CSS font features applied, etc. The output is the set of glyph IDs that will be rendered to display that textual content, which is determined after shaping and layout take place and all font features are applied. </VL>
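To make the input/output distinction concrete, here is a minimal shaping sketch using uharfbuzz (the HarfBuzz Python bindings); the font path is hypothetical and the "smcp" request stands in for a feature selected via CSS:

    import uharfbuzz as hb

    with open("MyFont.ttf", "rb") as f:     # hypothetical font file
        face = hb.Face(hb.Blob(f.read()))
    font = hb.Font(face)

    buf = hb.Buffer()
    buf.add_str("Office 1/2")               # input: character string from the page
    buf.guess_segment_properties()

    features = {"smcp": True}               # input: a discretionary feature requested via CSS
    hb.shape(font, buf, features)

    # Output: the glyph IDs produced by shaping (ligatures, small caps, fractions,
    # positional forms, ...); this is what actually needs to be present in the subset.
    print([info.codepoint for info in buf.glyph_infos])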

This seemingly innocent remark immediately raised multiple issues we hadn’t considered yet (or, at least, hadn’t verbalized):
- output glyphs can be modified by CSS (think e.g. stylistic sets, small caps, glyph alternates, etc.) – a font subset created to support a particular page has to account for this;

These are implemented by font features.

<VL> Yes, but the discretionary features to be applied are often specified by CSS. So, as a simple example, if an initial Latin font subset is created to include all required lowercase and uppercase glyphs corresponding to the list of codepoints provided by a browser, and the CSS calls for the small caps feature to be applied, we end up with an initial subset that includes lowercase glyphs we do not need and is missing the small-cap glyphs we do need. </VL>
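A minimal sketch of that mismatch with fontTools, assuming a hypothetical font whose small-cap alternates follow the common ".smcp" glyph-naming convention: subsetting by code points with the subsetter's default feature list keeps the lowercase glyphs but drops the small-cap alternates the CSS will ask for.

    from fontTools import subset
    from fontTools.ttLib import TTFont

    font = TTFont("MyFont.ttf")             # hypothetical font file

    # Default options do not close over discretionary features such as "smcp".
    subsetter = subset.Subsetter(options=subset.Options())
    subsetter.populate(text="hello world")  # code points reported by the browser
    subsetter.subset(font)

    glyphs = set(font.getGlyphOrder())
    print("a" in glyphs)       # True: lowercase glyph retained, not needed once small caps apply
    print("a.smcp" in glyphs)  # False: small-cap alternate missing, although the CSS needs it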

- output glyphs can be modified by a particular rendering mode (e.g. ruby markup in Japanese);

Ruby is implemented either by size/width (which means different selected fonts, or variable fonts) or by font features.

<VL> From what I’ve been told (Ken can correct me if I am wrong) – some ruby markup may call for different base glyphs. </VL>

- output glyphs are subject to shaping / layout rules; we may not always know what they are (even if we know all input character combinations) until the shaping is done, which means the first increment of a particular font has to be loaded to at least support shaping.

For dynamic content, this is certainly true. In general, we haven’t solved the “dynamic content” problem yet at all.

<VL> I don’t think this is true only for dynamic content. </VL>

Consider a set of characters which the browser knows are present on a page. Pretend the browser knows all the shaping rules in the font.

<VL> How can a browser possibly know all the shaping rules in the font without having that particular font? Shaping rules are in large part defined by the contents of the GSUB/GPOS/GDEF tables, and even if you have two different fonts covering the exact same script, glyph IDs are likely to be different for the same glyphs, the contents of the layout tables will be different, and the output set of glyph IDs that needs to be encoded in a font subset to display the same text content will differ from one font to another. I don’t see how we can make an assumption that the browser knows all shaping rules in advance. </VL>
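The font-specific nature of those rules is easy to see by listing what the layout tables of a given font actually contain; a minimal sketch with fontTools (hypothetical font path):

    from fontTools.ttLib import TTFont

    font = TTFont("MyFont.ttf")             # hypothetical font file

    if "GSUB" in font:
        gsub = font["GSUB"].table
        tags = sorted({rec.FeatureTag for rec in gsub.FeatureList.FeatureRecord})
        print(tags)                         # the substitution features this particular font defines

    # Glyph IDs are simply indices into this font's glyph order; another font
    # covering the same script will assign different IDs to equivalent glyphs.
    print(font.getGlyphOrder()[:10])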


Consider if the browser could compute the set of every “reachable” glyph that any possible sequence of these characters could reference.

<VL> It cannot, in my opinion. Computing reachable glyphs is the process where shaping / layout rules and other font features are applied – you have to have at least a font subset already delivered that gives you that data. </VL>

I wonder, for normal fonts with normal shaping rules, what the relationship is between the size of the set of characters and the size of the set of possibly reachable glyphs. If the correlation is roughly linear or sublinear, this likely isn’t a big deal, but if a small set of input characters can potentially reference every glyph in the font, that would be unfortunate.

<VL> I don’t think the concept of a “normal font” is even applicable in this case, for the reasons I previously mentioned. </VL>
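One way to put numbers on that question for a given font is to grow the input character set and watch how the closure grows; a minimal sketch with fontTools (hypothetical font path and alphabet), a rough probe rather than a definitive methodology:

    from fontTools import subset
    from fontTools.ttLib import TTFont

    ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789"   # hypothetical input characters

    def closure_size(path, text):
        # Size of the glyph closure of `text` when closing over every feature.
        font = TTFont(path)
        options = subset.Options()
        options.layout_features = ["*"]
        subsetter = subset.Subsetter(options=options)
        subsetter.populate(text=text)
        subsetter.subset(font)
        return len(font.getGlyphOrder())

    # Roughly linear growth suggests the closure is benign; if a handful of
    # characters pulls in most of the font, the indiscriminate closure is costly.
    for n in (1, 5, 10, 20, len(ALPHABET)):
        print(n, closure_size("MyFont.ttf", ALPHABET[:n]))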

I am sure there is more to consider, this is just the tip of the iceberg. As is, these considerations seem to create certain additional challenges for incremental transfer, and also give a bit more weight to the alternative approach Myles has suggested, in which a browser can ask for a basic subset to start with and incrementally update it based on real needs determined by shaping and CSS.

In both approaches, the browser knows a) which characters are present on the page b) which characters are affected by which styles, and c) which specific sequences of characters it needs to be rendering in each font. Therefore, in both approaches, the browser can decide whether or not to consider styling or shaping information in its requests to the server. So, I don’t think this addition helps us make a distinction between the two approaches.

<VL> Cases a) and b) are true; c) is a controversial subject. The browser will know what character combinations are present in the input, but it doesn’t know whether those combinations will be rendered by a sequence of individual glyphs (and which particular glyphs, as is the case for e.g. Arabic), or whether a single glyph needs to be rendered to display a ligature or a syllable. This knowledge can only be obtained after shaping is done, and from that point on the browser will not be dealing with character sequences; it will be dealing with glyph IDs.

In one approach, the browser would have to send back to the font server everything it knows about a particular input, including information about discretionary features, the text spans for which those discretionary features are selected, etc., and basically ask the font server either to do the shaping and determine the optimal font subset to be created, or to create a “superset” that would cover all possible output combinations. And even if this can be done, I am not sure it’s reasonable – too many things can go wrong.

In the other approach, the browser gets an initial subset that provides all necessary metric/layout/shaping data, does the shaping, determines the output set of glyph IDs needed to display the input, and asks for an incremental subset that contains the outline data for those glyph IDs. What it gets back is an optimal subset containing everything that is needed, with nothing missing. </VL>
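On the server side, that second approach maps naturally onto subsetting by glyph ID rather than by code point; a minimal sketch with fontTools (hypothetical font path and glyph IDs reported by the client after shaping), illustrating the idea rather than an actual patch format:

    from fontTools import subset
    from fontTools.ttLib import TTFont

    font = TTFont("MyFont.ttf")             # hypothetical font file

    # Glyph IDs the client determined it needs after shaping the page text.
    needed_gids = [3, 17, 42, 118]          # hypothetical values

    options = subset.Options()
    options.layout_features = []            # layout/shaping data was already delivered
    options.notdef_outline = True

    subsetter = subset.Subsetter(options=options)
    subsetter.populate(gids=needed_gids)    # request by glyph ID, not by character
    subsetter.subset(font)
    font.save("increment.ttf")              # conceptually, the incremental payload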

It probably means that, whichever solution is picked, the client should be asking the server for particular glyphs, rather than particular characters, because the server doesn’t (shouldn’t) know the styles on the page.

<VL> Exactly my point, and this is why I mentioned that, considering the magnitude of possible input variations (languages, font features, …), the approach you proposed [where font data is organized in a particular way and the browser can start by reading what is required for shaping and then amend it with the glyph data] should be given additional consideration. </VL>

You are right, however, that any implementation should be able to have affordances for grappling with this problem, either by considering all the above, or intentionally not considering some of them.


Thoughts?

Thank you,
Vlad

Received on Wednesday, 24 July 2019 18:08:15 UTC