- From: Garret Rieger <grieger@google.com>
- Date: Wed, 7 Feb 2024 18:31:20 -0700
- To: Skef Iterum <siterum@adobe.com>
- Cc: "w3c-webfonts-wg (public-webfonts-wg@w3.org)" <public-webfonts-wg@w3.org>
- Message-ID: <CAM=OCWZhd8LkiAMJ9NdVKm_7uvcCN6Mies3W8=cdpfwe+7etMw@mail.gmail.com>
I should probably clarify my position. I don't mind talking about the encoder in terms of an original file in non-normative ways in the specification, and I plan to do that in the explanatory sections I'll be adding (including the one that provides guidance on encoding). What I specifically want to avoid is a normative requirement that an encoder only transform existing fonts, because that would restrict the ability to create a specification-compliant encoder which does not transform an existing font.

The Brotli specification follows the same approach: the normative requirements (section 1.4 <https://datatracker.ietf.org/doc/html/rfc7932#section-1.4>) only require that a compressor produce a valid stream, and section 11, which talks about the original file, is specifically called out as non-normative at the start.

In the upcoming sections, among other things, I plan to strongly encourage that an encoder preserve all of the original font's functionality. It will probably be helpful for me to get the new sections written; then I can check back in and see whether that addresses the concerns you have with the current draft.

On Wed, Feb 7, 2024 at 5:30 PM Skef Iterum <siterum@adobe.com> wrote:

> I think my view on the suggested approach is best illustrated by my disagreement with how you describe the Brotli specification when it comes to the sort of editorial directions we are discussing. The initial summary paragraph starts with the phrase "This specification defines a lossless compressed data format ...", and section 11 is all in terms of an original file. If that specification were written according to how we've been discussing things, "lossless" would be taken out of that summary sentence and section 11 would probably be removed or reorganized, because both imply that there's an original file to be reproduced, which is a restrictive way of looking at the Brotli format.
> Why think there needs to be an original file at all?
>
> So, that specification is written how I would expect it to be. Obviously, the bulk of the content is about the format and decoding the format, as those are the fixed points of the system while the details of the encoder are open. But editorially the document is still organized around the expected case: compressing an "original" file into the format and then decompressing that format to get back the original data. I haven't read the whole thing in detail this afternoon, but it seems like any more abstract uses of the format are more or less left to the imagination.
>
> And I think that's entirely appropriate. Specifications like these should be organized around the central case, leaving other cases either to the imagination or to brief asides. So yes, I do think we should be discussing the closure requirement, for example, in terms of an original file, specifically because putting it in more abstract terms risks people not understanding what the requirement actually is, what it is actually intended to preserve.
>
> You say:
>
>> I don't agree with the assertion that we've made it so complicated that we can't explain how to maintain functional equivalence during augmentation:
>
> Let me emphasize: I don't think this either. What I think is that the editorial direction you prefer is likely to make it so. We have designed a system with the primary goal of being able to transfer parts of a font to a client while preserving the overall behavior of the font on the client side, relative to the content it renders, and you say that talking in terms of an "original font" is "somewhat problematic" because it *could* be used for other things.
> I reiterate what I said in the meeting: if we make the documentation too abstract, too removed from the central case, we increase the risk that the writer of an encoder will fail to understand or appreciate aspects of that central case and fail to reproduce the original behavior. And because we have taken an "everything is permitted" editorial approach, we won't be able to point to requirements or even "strong guidance" indicating that the encoder is not written as it should be. And if things go that way, people might rightly conclude that the system is not reliable, because it isn't reliable in practice, and it won't be widely adopted.
>
> Skef
>
> ------------------------------
> From: Garret Rieger <grieger@google.com>
> Sent: Wednesday, February 7, 2024 3:08 PM
> To: Skef Iterum <siterum@adobe.com>
> Cc: w3c-webfonts-wg (public-webfonts-wg@w3.org) <public-webfonts-wg@w3.org>
> Subject: Re: Some More Thoughts on Adding Encoder Requirements
>
> EXTERNAL: Use caution when clicking on links or opening attachments.
>
> On Tue, Feb 6, 2024 at 4:51 PM Skef Iterum <siterum@adobe.com> wrote:
>
>> A few quick thoughts on this:
>>
>> 1. Is there some reason we couldn't treat any potential "pre-subsetting" of the original font as a separate step *conceptually*? That is, couldn't we describe the behavior of the IFT-encoded font in relation to the "original font", while acknowledging that one might have some reason to subset a given actual font in a given situation before encoding it? Even if some encoder has that functionality built in, I don't think we have to account for it in the spec.
>
> This is effectively equivalent to framing the equivalence requirement around the fully expanded IFT font, which could be a subset of some original font. However, using an original font is more restrictive because it requires there to be an original font.
> That creates the assumption that IFT is an encoding process that is done only to existing fonts. I'm advocating that we treat an IFT font plus its patches as a self-contained font that could have been produced by any number of different means, including directly from authoring sources or by transforming an existing font. Essentially, we start from the point that an IFT font exists and don't care about how it was produced. Once we have an existing IFT font, we impose certain requirements on it (and in turn on the process that produced it) to ensure it will behave consistently under the application of the client side algorithms.
>
>> 2. The closure requirement section in the current IFTB docs is part of the explanation of how to build an IFTB encoder: "you need to solve this problem when putting glyphs in bins". Just saying that the general behavior of the font needs to be preserved in relation to something *somehow* is not, in my opinion, close to enough guidance. So one way or another we're going to have to figure out how to add a section with very close to that sort of information, including the "collapse bins, duplicate, or put into bin 0" stuff and what those decisions need to preserve.
>
> I'm planning on including non-normative guidance on how an encoder should function in the new encoder section. This email was meant primarily to discuss the normative requirements we want to include in that section in addition to the non-normative guidance.
>
>> 3. If I were on the W3C review committee and saw the document we seem to be heading towards, I think I would say something like "If you've made the spec so complicated and/or flexible that you aren't able to concisely explain what the encoding process needs to do to preserve a font's behavior on the client side, maybe you've made the spec too complicated and/or flexible."
>
> It's not a specification's job to explain how something should be done.
> The spec exists primarily to enable interoperability of different implementations. As such, the main focus will be on the details critical to interoperability; that's why there's such a heavy focus on the client side in the current draft. If the client side behaviour is well defined, then encoders can predict exactly how the client will operate given a particular encoding. This allows an encoder to generate an encoding that produces whatever outcome the encoder is targeting. A good example of this is the Brotli specification (https://datatracker.ietf.org/doc/html/rfc7932): it describes how the encoding format works and how to decompress it, but provides no guidance on how to actually build a high quality compressor implementation, which is a significantly complicated undertaking (see: https://datatracker.ietf.org/doc/html/rfc7932#section-11).
>
> I don't agree with the assertion that we've made it so complicated that we can't explain how to maintain functional equivalence during augmentation:
>
> 1. We already have two prototype encoder implementations (the IFTB one, and the one I wrote) that demonstrate how to maintain the functional equivalence property in an encoder implementation. These two prototype encoders are also good quality (with room for improvement, of course), as they've already demonstrated smaller transfer costs than existing font loading methods. We will also want to write some documentation outside of the specification that discusses in much more detail how to approach building an encoder, based on what we've learned developing the prototypes. This could be referenced from the spec as a non-normative reference.
>
> 2. A simple but functional encoder can be demonstrated using only a font subsetter and some basic recursive logic (plus serialization logic for the IFT table, but that's straightforward).
> Several high quality open source font subsetters exist, and font subsetting at this point is a well understood problem.
>
> I do plan to add some guidance on how to approach an encoder to the specification, as well as more explanation of how the pieces all fit together and operate, as I think that will help readers understand how the technology is meant to function.
>
> Skef
>
> ------------------------------
> From: Garret Rieger <grieger@google.com>
> Sent: Tuesday, February 6, 2024 3:06 PM
> To: w3c-webfonts-wg (public-webfonts-wg@w3.org) <public-webfonts-wg@w3.org>
> Subject: Some More Thoughts on Adding Encoder Requirements
>
> Following discussions at the last wg call I've been thinking about what requirements we may want to place on the encoder/encoded fonts. Here's what I've come up with.
>
> First, it helps to look at the requirements that previous IFT proposals have imposed on the server side/encoder side of the technology.
>
> - Binned Incremental Font Transfer: this approach uses what's called the "closure" requirement:
>
> "The set of glyphs contained in the chunks loaded through the GID and feature maps must be a superset of those in the GID closure of the font subset description."
>
> - Patch Subset: this approach uses a rendering equivalence requirement:
>
> "When a subsetted font is used to render text using any combination of the subset codepoints, layout features <https://docs.microsoft.com/en-us/typography/opentype/spec/featuretags#>, or design-variation space <https://docs.microsoft.com/en-us/typography/opentype/spec/otvaroverview#terminology>, it must render identically to the original font.
> This includes rendering with the use of any optional typographic features that a renderer may choose to use from the original font, such as hinting instructions."
>
> In thinking it through, I don't think we'll be able to use the glyph closure requirement from IFTB in the new IFT specification. The closure requirement in the context of the new IFT approach is too narrow: to be correct, it assumes glyph ids, outlines, and non-outline data are stable. However, under the new IFT proposal glyph ids, outlines, and all other data in the font are allowed to change while still maintaining functional equivalence to the original font.
>
> Ultimately, what we want is a requirement that ensures the intermediate subsets are functionally equivalent to some font, which naturally leads to something like the rendering equivalence requirement used in Patch Subset. However, this also has some issues, which I'll discuss later.
>
> The next thing we need to define is: equivalence to what? The two previous approaches define equivalence to the "original font". However, I think this is somewhat problematic:
>
> - It's quite likely that most encoder implementations will want to optionally perform some amount of subsetting on the input font as part of generating the encoding. The current prototype encoder does exactly that. The IFTB draft spec states that any subsetting operations must be done prior to invoking the encoder, but I think forcing those to be discrete operations artificially limits encoder implementations.
>
> - In some cases there may not actually be an "original font". Consider the case where IFT encoding is built directly into a font editor or font compilation tool chain. You could be going directly from font authoring sources to an IFT encoded font.
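The IFTB closure requirement quoted earlier reduces to a set-containment check. A minimal sketch, assuming a toy model in which chunks, the GID map, and the glyph closure are plain Python sets and dicts of glyph ids (all names here are illustrative stand-ins, not drawn from either spec, and the feature map is omitted for brevity):

```python
def loaded_glyphs(chunks, gid_map, subset_gids):
    # Union of the glyphs in every chunk that the GID map selects
    # for the requested subset of glyph ids.
    needed = {gid_map[g] for g in subset_gids if g in gid_map}
    out = set()
    for c in needed:
        out |= chunks[c]
    return out

def satisfies_closure(chunks, gid_map, subset_gids, closure_gids):
    # IFTB closure requirement: the loaded glyphs must be a superset
    # of the GID closure of the font subset description.
    return closure_gids <= loaded_glyphs(chunks, gid_map, subset_gids)

# 'fi' ligature example: glyphs 10 ('f') and 11 ('i') pull in chunk 1,
# but the closure of {10, 11} also contains glyph 50 (the ligature),
# which lives in chunk 2.
chunks = {1: {10, 11}, 2: {50}}
gid_map = {10: 1, 11: 1, 50: 2}
print(satisfies_closure(chunks, gid_map, {10, 11}, {10, 11, 50}))      # False
print(satisfies_closure(chunks, gid_map, {10, 11, 50}, {10, 11, 50}))  # True
```

The toy example also shows the narrowness described above: the check is only meaningful when glyph ids are stable between the original font and the encoded chunks.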
> Instead, we should look at the IFT font and collection of patches as logically being a font, which may be derived from, and equivalent to, some original non-incremental font or a subset of that original font. From the IFT base font and collection of patches we can produce the fully expanded font that it represents, using the extension algorithm <https://garretrieger.github.io/IFT/Overview.html#extending-font-subset> with an input subset definition whose sets match all things. The iteration will eventually terminate, ending with a font that is fully expanded and no longer incremental.
>
> I suggest that we use this fully expanded font as the basis from which to require equivalence. Specifically, we could require that the font produced after any application of the extension algorithm <https://garretrieger.github.io/IFT/Overview.html#extending-font-subset> is a subset (as defined here <https://garretrieger.github.io/IFT/Overview.html#font-subset-info>) of the fully expanded font.
>
> In addition, we want the IFT augmentation process to be consistent: when utilizing dependent patches, the extension algorithm <https://garretrieger.github.io/IFT/Overview.html#extending-font-subset> allows the order of application of dependent patches to be selected by the implementation. We should require that all possible orderings of patch application produce the same end result. The current definition of the extension algorithm <https://garretrieger.github.io/IFT/Overview.html#extending-font-subset> already implicitly makes this assumption.
>
> Concretely, here's what I'm considering adding to the current draft:
>
> 1. Define how to fully expand an IFT font (this definition can also be re-used in the offline section).
>
> 2. Add a section which provides guidance and requirements for encoder implementations.
> Add that for any valid subset definition, after the application of the extension algorithm <https://garretrieger.github.io/IFT/Overview.html#extending-font-subset> on an IFT font:
>
> 2a. The encoding MUST be consistent: the resulting font must be equal (binary equivalence) regardless of the ordering of patch application chosen during the execution of the extension algorithm.
>
> 2b. The result SHOULD (strongly encouraged) render equivalently to the fully expanded font when using content fully covered by the subset definition.
>
> I'm proposing we use SHOULD for requirement 2b because it will be problematic to define rendering equivalence in a way that we could reasonably conformance test. The main issue I see is that many existing font renderers can in some cases have different behaviours (eg. shaping implementations are definitely not consistent). Testing against all renderers is obviously infeasible, and specifying one specific renderer to test against is not good either. Also, you could end up in a case where it isn't possible to satisfy the requirement on two different renderers at the same time (probably unlikely, but not something that can be ruled out). Lastly, as far as I'm aware there's no canonical specification for how a font is rendered, so we can't reference that either.
>
> Another option is to use a weaker equivalence requirement based around something like codepoint presence in the cmap, but as that would provide no useful guarantees as to the end result of rendering, I'm not sure it's worthwhile.
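The consistency requirement described above (binary equivalence regardless of patch ordering) can be sketched with a toy model. Here a "font" is just a dict mapping glyph ids to glyph data, and `apply_patch`, `expand`, and `serialize` are hypothetical stand-ins, not functions from the IFT draft; a real check would compare serialized font binaries:

```python
from itertools import permutations

def apply_patch(font, patch):
    # Toy patch application: a patch supplies glyph data for some glyph ids.
    merged = dict(font)
    merged.update(patch)
    return merged

def expand(font, patches, order):
    # One possible execution of the extension algorithm: apply every
    # patch once, in the given order.
    for i in order:
        font = apply_patch(font, patches[i])
    return font

def serialize(font):
    # Canonical byte form so results can be compared for binary equality.
    return repr(sorted(font.items())).encode()

def is_consistent(font, patches):
    # The consistency requirement: every ordering of patch application
    # must yield a binary-equal fully expanded font.
    results = {serialize(expand(font, patches, order))
               for order in permutations(range(len(patches)))}
    return len(results) == 1

base = {1: "outline-a"}
disjoint = [{2: "outline-b"}, {3: "outline-c"}]         # patches touch different glyphs
conflicting = [{2: "outline-b"}, {2: "outline-b-alt"}]  # both write glyph 2
print(is_consistent(base, disjoint))     # True
print(is_consistent(base, conflicting))  # False
```

Enumerating every permutation is only feasible for tiny patch sets; a practical conformance test would sample orderings, but the shape of the check is the same: require a single resulting byte sequence.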
Received on Thursday, 8 February 2024 01:31:45 UTC