Re: Some More Thoughts on Adding Encoder Requirements

A few quick thoughts on this:


  1.
Is there some reason we couldn't treat any potential "pre-subsetting" of the original font as a conceptually separate step? That is, couldn't we describe the behavior of the IFT-encoded font in relation to the "original font", while acknowledging that one might have some reason to subset a given actual font in a given situation before encoding it? Even if some encoder has that functionality built in, I don't think we have to account for it in the spec.
  2.
The closure requirement section in the current IFTB docs is part of the explanation of how to build an IFTB encoder: "you need to solve this problem when putting glyphs in bins". Merely saying that the general behavior of the font needs to be preserved in relation to something, somehow, is not, in my opinion, close to enough guidance. So one way or another we're going to have to figure out how to add a section with very close to that sort of information, including the "collapse bins, duplicate, or put into bin 0" options and what those decisions need to preserve.
  3.
If I were on the W3C review committee and saw the document we seem to be heading towards, I think I would say something like "If you've made the spec so complicated and/or flexible that you aren't able to concisely explain what the encoding process needs to do to preserve a font's behavior on the client side, maybe you've made the spec too complicated and/or flexible."

Skef
________________________________
From: Garret Rieger <grieger@google.com>
Sent: Tuesday, February 6, 2024 3:06 PM
To: w3c-webfonts-wg (public-webfonts-wg@w3.org) <public-webfonts-wg@w3.org>
Subject: Some More Thoughts on Adding Encoder Requirements




Following discussions at the last WG call, I’ve been thinking about what requirements we may want to place on the encoder/encoded fonts. Here’s what I’ve come up with.


First it helps to look at the requirements that previous IFT proposals have imposed on the server side/encoder side of the technology.


  *   Binned Incremental Font Transfer: this approach uses what’s called the “closure” requirement:

“The set of glyphs contained in the chunks loaded through the GID and feature maps must be a superset of those in the GID closure of the font subset description.”


  *   Patch Subset: this approach uses a rendering equivalence requirement:

“When a subsetted font is used to render text using any combination of the subset codepoints, layout features<https://docs.microsoft.com/en-us/typography/opentype/spec/featuretags#>, or design-variation space<https://docs.microsoft.com/en-us/typography/opentype/spec/otvaroverview#terminology> it must render identically to the original font. This includes rendering with the use of any optional typographic features that a renderer may choose to use from the original font, such as hinting instructions.”
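At its core, the IFTB closure requirement quoted above is a set-superset check. A minimal sketch, where plain Python sets of glyph IDs stand in for the real chunk contents and closure computation (all names here are hypothetical):

```python
def satisfies_closure_requirement(loaded_chunk_glyphs, closure_glyphs):
    # IFTB-style check: the glyphs contained in the chunks loaded through
    # the GID and feature maps must be a superset of the GID closure of
    # the font subset description.
    return set(loaded_chunk_glyphs) >= set(closure_glyphs)

# Chunks provide glyphs {1, 2, 3, 7}; the closure needs {1, 2, 7}: OK.
ok = satisfies_closure_requirement({1, 2, 3, 7}, {1, 2, 7})
# Glyph 7 is in the closure but not in any loaded chunk: violation.
bad = satisfies_closure_requirement({1, 2, 3}, {1, 2, 7})
```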



In thinking it through, I don’t think we’ll be able to use the glyph closure requirement from IFTB in the new IFT specification. The closure requirement is too narrow in the context of the new IFT approach: for it to be correct, glyph IDs, outlines, and non-outline data must all remain stable. However, under the new IFT proposal glyph IDs, outlines, and all other data in the font are allowed to change while still maintaining functional equivalence to the original font.


Ultimately what we want is a requirement that ensures the intermediate subsets are functionally equivalent to some font, which naturally leads to something like the rendering equivalence requirement used in patch subset. However, this also has some issues, which I’ll discuss later.


The next thing we need to define is: equivalence to what? The two previous approaches define equivalence to the “original font”. However, I think this is somewhat problematic:

  *   It’s quite likely that most encoder implementations will want to optionally perform some amount of subsetting on the input font as part of generating the encoding. The current prototype encoder does exactly that. The IFTB draft spec states that any subsetting operations must be done prior to invoking the encoder, but I think forcing those to be discrete operations artificially limits encoder implementations.

  *   In some cases there may not actually be an “original font”. Consider the case where IFT encoding is built directly into a font editor or font compilation tool chain. You could be going directly from font authoring sources to an IFT encoded font.


Instead, we should look at the IFT font and collection of patches as logically being a font, which may be derived from, and equivalent to, some original non-incremental font or a subset of that original font. From the IFT base font and collection of patches we can produce the fully expanded font that it represents by running the extension algorithm<https://garretrieger.github.io/IFT/Overview.html#extending-font-subset> with an input subset definition whose sets match everything. The iteration will eventually terminate, ending with a font that is fully expanded and no longer incremental.
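That full-expansion loop can be sketched as follows. This is a toy model, not the real extension algorithm: a “font” is just a frozenset of glyph IDs, a “patch” is a set of IDs to add, and `apply_patch` stands in for the algorithm’s real patch-application step (which may itself surface further dependent patches):

```python
def fully_expand(base_font, patches, apply_patch):
    # Keep applying patches until none remain; the result is the fully
    # expanded, no-longer-incremental font that the IFT base font plus
    # patch collection logically represents.
    font = base_font
    pending = list(patches)
    while pending:
        font, new_patches = apply_patch(font, pending.pop(0))
        pending.extend(new_patches)  # dependent patches may expose more work
    return font

# Toy patch application: union in the patch's glyphs, no new patches.
def toy_apply(font, patch):
    return font | patch, []

expanded = fully_expand(frozenset({0}), [{1, 2}, {3}], toy_apply)
```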


I suggest that we use this fully expanded font as the basis from which to require equivalence. Specifically, we could require that the font produced after any application of the extension algorithm<https://garretrieger.github.io/IFT/Overview.html#extending-font-subset> is a subset (as defined here<https://garretrieger.github.io/IFT/Overview.html#font-subset-info>) of the fully expanded font.

In addition, we want the IFT augmentation process to be consistent: when utilizing dependent patches, the extension algorithm<https://garretrieger.github.io/IFT/Overview.html#extending-font-subset> allows the order of application of dependent patches to be selected by the implementation. We should require that all possible orderings of patch application produce the same end result. The current definition of the extension algorithm<https://garretrieger.github.io/IFT/Overview.html#extending-font-subset> already implicitly makes this assumption.
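One way to express (and spot-check, in a toy model) that consistency property: every permutation of the patch list must produce the same end result. Again, these names and the frozenset-based font model are stand-ins, not the real encoding:

```python
from itertools import permutations

def is_order_independent(base_font, patches, apply_patch):
    # The consistency assumption: all orderings of patch application
    # yield an identical end result.
    results = set()
    for order in permutations(patches):
        font = base_font
        for patch in order:
            font, extra = apply_patch(font, patch)
            assert not extra  # toy model only: no dependent patches spawned
        results.add(font)
    return len(results) == 1

merge = lambda font, patch: (font | patch, [])  # commutative: order-safe
replace = lambda font, patch: (patch, [])       # "last patch wins": order-dependent
patches = [frozenset({1}), frozenset({2})]

consistent = is_order_independent(frozenset(), patches, merge)
inconsistent = is_order_independent(frozenset(), patches, replace)
```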


Concretely here’s what I’m considering adding to the current draft:


  1.  Define how to fully expand an IFT font (can also re-use this definition in the offline section).

  2.  Add a section which provides guidance and requirements for encoder implementations, specifying that for any valid subset definition, after the application of the extension algorithm<https://garretrieger.github.io/IFT/Overview.html#extending-font-subset> to an IFT font:

     *   The encoding MUST be consistent: the resulting font must be identical (binary equivalence) regardless of the ordering of patch application chosen during the execution of the extension algorithm.

     *   The result SHOULD (strongly encouraged) render equivalently to the fully expanded font when rendering content fully covered by the subset definition.


I’m proposing we use SHOULD for requirement 2b because it will be problematic to define rendering equivalence in a way that we could reasonably conformance-test. The main issue I see is that many existing font renderers can in some cases behave differently (e.g. shaping implementations are definitely not consistent). Testing against all renderers is obviously infeasible, and specifying a single reference renderer to test against is not a good option either. You could also end up in a case where it isn’t possible to satisfy the requirement on two different renderers at the same time (probably unlikely, but not something that can be ruled out). Lastly, as far as I’m aware there’s no canonical specification for how a font is rendered, so we can’t reference that either.


Another option is to use a weaker equivalence requirement based around something like codepoint presence in the cmap, but as that would provide no useful guarantees about the end result of rendering, I’m not sure it’s worthwhile.
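For illustration, that weaker check would reduce to something like the following, with cmaps modeled as plain codepoint-to-glyph-ID dicts (hypothetical names; a real check would read the cmap table from the extended font):

```python
def covers_codepoints(extended_cmap, subset_codepoints):
    # Weak equivalence: every requested codepoint is mapped in the
    # extended font's cmap. Says nothing about which glyphs it maps to
    # or how they render.
    return set(subset_codepoints) <= set(extended_cmap)

cmap = {0x41: 3, 0x42: 7}  # codepoint -> glyph ID
ok = covers_codepoints(cmap, {0x41, 0x42})   # both mapped
gap = covers_codepoints(cmap, {0x41, 0x43})  # U+0043 missing
```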

Received on Tuesday, 6 February 2024 23:51:45 UTC