- From: Garret Rieger <grieger@google.com>
- Date: Tue, 6 Feb 2024 16:06:46 -0700
- To: "w3c-webfonts-wg (public-webfonts-wg@w3.org)" <public-webfonts-wg@w3.org>
- Message-ID: <CAM=OCWYR=827t7agKK+Ffk7UkgyJbBm59Efecb6MMEDgXMJDhg@mail.gmail.com>
Following discussions at the last WG call I’ve been thinking about what requirements we may want to place on the encoder/encoded fonts. Here’s what I’ve come up with.

First it helps to look at the requirements that previous IFT proposals have imposed on the server side/encoder side of the technology.

- Binned Incremental Font Transfer: this approach uses what’s called the “closure” requirement: “The set of glyphs contained in the chunks loaded through the GID and feature maps must be a superset of those in the GID closure of the font subset description.”

- Patch Subset: this approach uses a rendering equivalence requirement: “When a subsetted font is used to render text using any combination of the subset codepoints, layout features <https://docs.microsoft.com/en-us/typography/opentype/spec/featuretags#>, or design-variation space <https://docs.microsoft.com/en-us/typography/opentype/spec/otvaroverview#terminology> it must render identically to the original font. This includes rendering with the use of any optional typographic features that a renderer may choose to use from the original font, such as hinting instructions.”

In thinking it through, I don’t think we’ll be able to use the glyph closure requirement from IFTB in the new IFT specification. In the context of the new IFT approach the closure requirement is too narrow: to be correct it assumes glyph ids, outlines, and non-outline data are stable. However, under the new IFT proposal glyph ids, outlines, and all other data in the font are allowed to change while still maintaining functional equivalence to the original font.

Ultimately what we want is a requirement that ensures the intermediate subsets are functionally equivalent to some font, which naturally leads to something like the rendering equivalence requirement used in Patch Subset. However, this also has some issues, which I’ll discuss later.

The next thing we need to define is: equivalence to what?
The two previous approaches define equivalence to the “original font”. However, I think this is somewhat problematic:

- It’s quite likely that most encoder implementations will want to optionally perform some amount of subsetting on the input font as part of generating the encoding. The current prototype encoder does exactly that. The IFTB draft spec states that any subsetting operations must be done prior to invoking the encoder, but I think forcing those to be discrete operations artificially limits encoder implementations.

- In some cases there may not actually be an “original font”. Consider the case where IFT encoding is built directly into a font editor or font compilation tool chain: you could be going directly from font authoring sources to an IFT encoded font.

Instead, we should look at the IFT font and collection of patches as logically being a font, which may be derived from, and equivalent to, some original non-incremental font or a subset of that original font. From the IFT base font and collection of patches we can produce the fully expanded font that it represents by running the extension algorithm <https://garretrieger.github.io/IFT/Overview.html#extending-font-subset> with an input subset definition whose sets match everything. The iteration will eventually terminate, ending with a font that is fully expanded and no longer incremental.

I suggest that we use this fully expanded font as the basis from which to require equivalence. Specifically, we could require that the font produced after any application of the extension algorithm <https://garretrieger.github.io/IFT/Overview.html#extending-font-subset> is a subset (as defined here <https://garretrieger.github.io/IFT/Overview.html#font-subset-info>) of the fully expanded font.
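To make the "fully expand" idea concrete, here is a minimal sketch of iterating the extension algorithm with an all-matching subset definition until no patches remain. All names here (`IFTFont`, `SubsetDefinition`, `extend`, and the dict-based table model) are illustrative stand-ins, not types from the IFT spec or any real implementation:

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class SubsetDefinition:
    # A subset definition whose sets "match everything" would cover all
    # codepoints, all layout features, and the full design space.
    codepoints: frozenset
    features: frozenset
    design_space: str

# Stand-in for the all-matching subset definition.
EVERYTHING = SubsetDefinition(frozenset(range(0x110000)), frozenset({"*"}), "full")

@dataclass
class IFTFont:
    tables: dict                                  # toy model of font data
    pending_patches: list = field(default_factory=list)

def extend(font: IFTFont, subset: SubsetDefinition) -> IFTFont:
    """One round of the extension algorithm (sketch): apply every patch
    whose coverage intersects the subset definition. With EVERYTHING as
    the subset, all pending patches are applicable."""
    for patch in font.pending_patches:
        font.tables.update(patch)                 # stand-in for patch application
    font.pending_patches = []
    return font

def fully_expand(font: IFTFont) -> IFTFont:
    """Iterate extension until the font is no longer incremental."""
    while font.pending_patches:
        font = extend(font, EVERYTHING)
    return font
```

The resulting non-incremental font is then the reference object against which the "is a subset of" requirement can be stated.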
In addition, we want the IFT augmentation process to be consistent: when utilizing dependent patches, the extension algorithm <https://garretrieger.github.io/IFT/Overview.html#extending-font-subset> allows the order of application of dependent patches to be selected by the implementation. We should require that all possible orderings of patch application produce the same end result. The current definition of the extension algorithm <https://garretrieger.github.io/IFT/Overview.html#extending-font-subset> already implicitly makes this assumption.

Concretely, here’s what I’m considering adding to the current draft:

1. Define how to fully expand an IFT font (this definition can also be re-used in the offline section).
2. Add a section which provides guidance and requirements for encoder implementations. Require that, for any valid subset definition, after the application of the extension algorithm <https://garretrieger.github.io/IFT/Overview.html#extending-font-subset> on an IFT font:
   a. The encoding MUST be consistent: the result font must be equal (binary equivalence) regardless of the ordering of patch application chosen during the execution of the extension algorithm.
   b. The result SHOULD (strongly encouraged) render equivalently to the fully expanded font when using content fully covered by the subset definition.

I’m proposing we use SHOULD for requirement 2b because it will be problematic to define rendering equivalence in a way that we could reasonably conformance test. The main issue I see is that many existing font renderers can in some cases have different behaviours (e.g. shaping implementations are definitely not consistent). Testing against all renderers is obviously infeasible, and specifying one specific renderer to test against is not good either. You could also end up in a case where it isn’t possible to satisfy the requirement on two different renderers at the same time (probably unlikely, but not something that can be ruled out).
Lastly, as far as I’m aware there’s no canonical specification for how a font is rendered, so we can’t reference that either. Another option is to use a weaker equivalence requirement based on something like codepoint presence in the cmap, but since that would provide no useful guarantees about the end result of rendering, I’m not sure it’s worthwhile.
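For illustration, the weaker cmap-based check could look like the sketch below: it only asks whether each requested codepoint that the fully expanded font maps is also mapped by the extended font, deliberately ignoring glyph ids (which the new IFT proposal allows to change) and everything downstream of the cmap. The function name and the plain-dict cmap model are hypothetical:

```python
def weak_cmap_equivalent(extended_cmap: dict,
                         full_cmap: dict,
                         subset_codepoints: set) -> bool:
    """Weak equivalence: every subset codepoint mapped by the fully
    expanded font's cmap must also be mapped by the extended font's cmap.
    Glyph ids/names may differ; rendering behaviour is not checked at all,
    which is exactly why this guarantee is of limited value."""
    return all(
        cp in extended_cmap
        for cp in subset_codepoints
        if cp in full_cmap
    )
```

This is cheap to conformance test, but as noted above it says nothing about shaping or rasterization, so a font could pass it and still render visibly differently.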
Received on Tuesday, 6 February 2024 23:07:10 UTC