- From: Garret Rieger <grieger@google.com>
- Date: Mon, 1 Dec 2025 17:49:30 -0700
- To: Skef Iterum <siterum@adobe.com>
- Cc: "public-webfonts-wg@w3.org" <public-webfonts-wg@w3.org>
- Message-ID: <CAM=OCWaLvbpaKsofXr47=2kbAAjRwn+-7C7wNyO-M5ECHDf7Tw@mail.gmail.com>
On Tue, Nov 11, 2025 at 7:59 PM Skef Iterum <siterum@adobe.com> wrote:
> At the TPAC WFWG meeting we discussed one difficult aspect of the
> glyph-keyed patch encoding problem: glyphs with more “complex” substitution
> patterns that the encoder punts on and just includes in the initial font. I
> mentioned that this was a significant problem for certain fonts when I was
> working on the IFTB prototype, which raised the question of whether I
> remembered what any of those fonts were.
>
> Short answer: I don’t.
>
> However, I was looking through some old slides and they do mention a
> pattern, which was that big aalt and nalt features tended to cause
> problems. I also recall issues with vertical layout in Japanese. So these
> are the things I recommend looking at first.
>
Sounds good. I can look through the open source Google Fonts collection and
see if I can find fonts that use aalt, nalt, or vertical layout, which
might make good test cases.
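As a first pass at finding candidates, something along these lines might
work; this is just a minimal sketch using fontTools, and the directory path
and the exact tag list (aalt, nalt, plus vert/vrt2 for vertical layout) are
my assumptions:

    from pathlib import Path
    from fontTools.ttLib import TTFont

    # Feature tags of interest: alternate features plus vertical layout.
    TARGET_FEATURES = {"aalt", "nalt", "vert", "vrt2"}

    def feature_tags(font):
        """Collect all GSUB/GPOS feature tags present in the font."""
        tags = set()
        for table_tag in ("GSUB", "GPOS"):
            if table_tag in font and font[table_tag].table.FeatureList:
                for rec in font[table_tag].table.FeatureList.FeatureRecord:
                    tags.add(rec.FeatureTag)
        return tags

    for path in sorted(Path("fonts").rglob("*.[ot]tf")):
        hits = feature_tags(TTFont(path, lazy=True)) & TARGET_FEATURES
        if hits:
            print(path, sorted(hits))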
>
> More generally, I think it would not take too long to build a simple
> evaluator using the tree built with my draft depend branch of HarfBuzz:
> https://github.com/skef/harfbuzz/tree/depend . Maybe that’s not much
> different from just encoding each of the relevant fonts and seeing how
> large the list of punts is, but evaluation code could be more specific
> about what is causing the problems without having to go in and look by
> hand.
>
> I’m also probably in a position where I can run any or all of the fonts in
> Adobe’s library through an encoder, and while I obviously wouldn’t be able
> to share the actual fonts freely, I could characterize the issue and in
> some cases check with the foundry about providing limited access for
> research purposes.
>
This sounds good. My very rough plan is to start from the existing
analysis: set up a small script that runs just the closure analysis portion
of the segmenter on a collection of fonts and reports back the number of
fallback glyphs relative to the total number of glyphs for each font. That
should fairly quickly identify any fonts that are currently handled poorly.
Once we have some examples in hand we can do more in-depth analysis of
those (potentially including the dependency branch) to see where the issues
are coming up. I was going to run that against the Google Fonts collection
initially, and if you're able to also run it against the Adobe collection
that would be very helpful.
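The reporting side could be as simple as the sketch below; the closure
analysis entry point here is a pure placeholder, since the segmenter's real
interface will differ:

    from pathlib import Path

    def closure_fallback_stats(font_path):
        """Hypothetical stand-in for the segmenter's closure analysis
        stage; the real entry point in the encoder will look different.
        Returns (fallback_glyph_count, total_glyph_count)."""
        raise NotImplementedError("wire this up to the segmenter")

    rows = []
    for path in sorted(Path("fonts").rglob("*.[ot]tf")):
        fallback, total = closure_fallback_stats(path)
        rows.append((fallback / total, fallback, total, path))

    # Report the worst-handled fonts (highest fallback ratio) first.
    for ratio, fallback, total, path in sorted(rows, reverse=True):
        print(f"{ratio:7.2%}  {fallback:6d} / {total:6d}  {path}")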
During the last few days of the conference, before I left for vacation, I
implemented an early prototype of a detector that can find, for each
fallback glyph, the set of segments that make up the glyph's composite
condition. For example, for a glyph with a composite condition of (A AND B)
OR (C AND D), it can find the set {A, B, C, D}. With this in hand we have
two options:
- Assign the glyph a superset condition which is a disjunction across
the segments in the set (in the previous example this would be A OR B OR C
OR D). This condition will always match at least whenever the true condition
would (over-matching is functionally fine, just less efficient than the
true condition). There's a rough sketch of this after the list.
- Use the set to reduce the scope of a more complex analysis that finds
the true condition.
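To make the first option concrete, here's a rough sketch of the flattening
step; the condition representation is invented for illustration and isn't
the encoder's actual data structure:

    from dataclasses import dataclass

    # Toy condition AST for illustration only: a Segment leaf matches when
    # that segment's condition is met; And/Or combine sub-conditions.
    @dataclass(frozen=True)
    class Segment:
        name: str

    @dataclass(frozen=True)
    class And:
        children: tuple

    @dataclass(frozen=True)
    class Or:
        children: tuple

    def segment_set(cond):
        """The set of all segments mentioned anywhere in the condition."""
        if isinstance(cond, Segment):
            return {cond}
        return set().union(*(segment_set(c) for c in cond.children))

    def superset_condition(cond):
        """Option 1: a disjunction over the segment set. Matches whenever
        the true condition would, and possibly more often."""
        return Or(tuple(sorted(segment_set(cond), key=lambda s: s.name)))

    # (A AND B) OR (C AND D) flattens to {A, B, C, D} -> A OR B OR C OR D.
    cond = Or((And((Segment("A"), Segment("B"))),
               And((Segment("C"), Segment("D")))))
    print(superset_condition(cond))

Over-matching here just means a patch may occasionally be loaded when it
wasn't strictly needed, which wastes some bytes but never drops a glyph.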
It's still early stages, but in the few fonts I tested it on it was able
to successfully classify all of the fallback glyphs, so that's a pretty
promising start. It still needs some work, and I'll share more details soon
once I have some time to put together a writeup on the approach.
>
> Skef
>
Received on Tuesday, 2 December 2025 00:49:55 UTC