- From: Garret Rieger <grieger@google.com>
- Date: Wed, 14 Feb 2024 18:11:59 -0700
- To: Skef Iterum <siterum@adobe.com>
- Cc: "w3c-webfonts-wg (public-webfonts-wg@w3.org)" <public-webfonts-wg@w3.org>
- Message-ID: <CAM=OCWbd201L0CzD=xo+dHYJmXxdb3Xu35NB6gF-DHTCTq0hOw@mail.gmail.com>
Alright, I've updated the draft and incorporated parts of the text you
proposed: https://garretrieger.github.io/IFT/Overview.html#encoder

On Thu, Feb 8, 2024 at 6:25 PM Garret Rieger <grieger@google.com> wrote:

> I've updated the draft with a rough first pass of what I was thinking here:
> https://garretrieger.github.io/IFT/Overview.html#encoder. I think it conveys
> a similar idea to your text, and I like the approach in your text as well.
> I'll iterate on this some more and try to work in what you've provided.
>
> Also, unrelated to the encoder requirements, I've begun to fill in an
> encoder considerations section, but it's still incomplete and in progress.
>
> On Wed, Feb 7, 2024 at 7:43 PM Skef Iterum <siterum@adobe.com> wrote:
>
>> I suspect I'll advocate for the encoder section to be "contingently
>> normative", starting with something along the lines of the following (off
>> the top of my head):
>>
>> This section describes the requirements on a *conforming* encoding of an
>> existing font file. When the encoding of a font file is conforming, and a
>> client is implemented according to the other sections of this document,
>> the intent of the IFT specification is that the appearance and behavior of
>> the font in the client will be the same as if the entire file were
>> transferred to the client. Any discrepancy in the appearance and behavior
>> of a conforming encoding of a font on a correctly implemented client can
>> therefore be considered a defect of this specification.
>>
>> Nothing about these requirements on encoding conformance is meant to rule
>> out or deprecate the possibility and practical use of *non-conforming*
>> encodings. Any encoding meeting (the requirements of sections X, Y, Z) is
>> valid and may have an appropriate use. Under some circumstances it might
>> be desirable for an encoded font to omit support for some codepoints from
>> all of its patch files even if those were included in the original font
>> file. In other cases a font might be directly encoded in the IFT format
>> from source files. Encoding is described in terms of conformance
>> requirements for two reasons:
>>
>> 1. A primary goal of the IFT specification is that the IFT format and
>> protocol can serve as a neutral medium for font transfer, comparable to
>> WOFF2. A foundry or other rights-owner of a font should be confident that
>> the conformant encoding and transfer of that font using IFT will not
>> change its behavior and therefore the intent of the font's creators.
>> Licenses or contracts might then include requirements about IFT
>> conformance, and situations in which re-encoding a font in WOFF2 format is
>> de facto permissible due to its content-neutrality might also permit
>> conformant IFT encoding of that font.
>>
>> 2. Describing encoding in terms of conformance and an original file also
>> helps clarify other encoding cases. Even when one does not wish to
>> preserve the full functionality of a font, or one does not have an
>> original font file to preserve, it is often easier to think in terms of a
>> theoretical font file to be encoded, and how to preserve the "behavior" of
>> that "file".
>>
>> If we don't have something comparable to this, and then a set of actual
>> requirements that live up to it, I worry that the neutrality point will
>> get lost in a sea of "shoulds" of varying force.
>>
>> If this sketch helps somehow in the next phase of the drafting, so much
>> the better.
>>
>> Skef
>> ------------------------------
>> *From:* Garret Rieger <grieger@google.com>
>> *Sent:* Wednesday, February 7, 2024 5:31 PM
>> *To:* Skef Iterum <siterum@adobe.com>
>> *Cc:* w3c-webfonts-wg (public-webfonts-wg@w3.org) <public-webfonts-wg@w3.org>
>> *Subject:* Re: Some More Thoughts on Adding Encoder Requirements
>>
>> I should probably clarify my position. I don't mind talking about the
>> encoder in terms of an original file in non-normative ways in the
>> specification, and I plan to do that in the explanatory sections I'll be
>> adding (including the one that provides guidance on encoding). I
>> specifically want to avoid adding a normative requirement that an encoder
>> may only transform existing fonts, because that would rule out a
>> specification-compliant encoder that does not transform an existing font.
>> The brotli specification follows the same approach: the normative
>> requirements (section 1.4
>> <https://datatracker.ietf.org/doc/html/rfc7932#section-1.4>) only require
>> that a compressor produce a valid stream, and section 11, which talks
>> about the original file, is specifically called out as non-normative at
>> the start.
>>
>> In the upcoming sections, among other things, I plan to strongly encourage
>> that an encoder preserve all of the original font functionality. It will
>> probably be helpful for me to get the new sections written; then I can
>> check back in and see if that helps address the concerns you have with the
>> current draft.
>>
>> On Wed, Feb 7, 2024 at 5:30 PM Skef Iterum <siterum@adobe.com> wrote:
>>
>> I think my view on the suggested approach is best illustrated by my
>> disagreement with how you describe the Brotli specification when it comes
>> to the sort of editorial directions we are discussing. The initial summary
>> paragraph starts with the phrase "This specification defines a lossless
>> compressed data format ...", and section 11 is all in terms of an original
>> file. If that specification were written the way we've been discussing
>> things, "lossless" would be taken out of that summary sentence and section
>> 11 would probably be removed or reorganized, because both imply that
>> there's an original file to be reproduced, which is a restrictive way of
>> looking at the Brotli format. Why think there needs to be an original file
>> at all?
>>
>> So, that specification is written how I would expect it to be. Obviously,
>> the bulk of the content is about the format and decoding the format, as
>> those are the fixed points of the system while the details of the encoder
>> are open. But editorially the document is still organized around the
>> expected case: compressing an "original" file into the format and then
>> decompressing that format to get back the original data. I haven't read
>> the whole thing in detail this afternoon, but it seems like any more
>> abstract uses of the format are more or less left to the imagination.
>>
>> And I think that's entirely appropriate. Specifications like these should
>> be organized around the central case, leaving other cases either to the
>> imagination or to brief asides. So yes, I do think we should be discussing
>> the closure requirement, for example, in terms of an original file.
>> Specifically because putting it in more abstract terms risks people not
>> understanding what the requirement actually is and what it is actually
>> intended to preserve.
>>
>> You say:
>>
>> I don't agree with the assertion that we've made it so complicated that we
>> can't explain how to maintain functional equivalence during augmentation:
>>
>> Let me emphasize: I don't think this either. What I think is that the
>> editorial direction you prefer is likely to make it so. We have designed a
>> system with the primary goal of being able to transfer parts of a font to
>> a client while preserving the overall behavior of the font on the client
>> side, relative to the content it renders, and you say that talking in
>> terms of an "original font" is "somewhat problematic" because it *could*
>> be used for other things.
>>
>> I reiterate what I said in the meeting: If we make the documentation too
>> abstract, too removed from the central case, we increase the risk that the
>> writer of an encoder will fail to understand or appreciate aspects of that
>> central case and fail to reproduce the original behavior. And because we
>> have taken an "everything is permitted" editorial approach, we won't be
>> able to point to requirements or even "strong guidance" indicating that
>> the encoder is not written as it should be. And if things go that way,
>> people might rightly conclude that the system is not reliable because it
>> isn't reliable in practice, and it won't be widely adopted.
>>
>> Skef
>> ------------------------------
>> *From:* Garret Rieger <grieger@google.com>
>> *Sent:* Wednesday, February 7, 2024 3:08 PM
>> *To:* Skef Iterum <siterum@adobe.com>
>> *Cc:* w3c-webfonts-wg (public-webfonts-wg@w3.org) <public-webfonts-wg@w3.org>
>> *Subject:* Re: Some More Thoughts on Adding Encoder Requirements
>>
>> On Tue, Feb 6, 2024 at 4:51 PM Skef Iterum <siterum@adobe.com> wrote:
>>
>> A few quick thoughts on this:
>>
>> 1. Is there some reason we couldn't treat any potential "pre-subsetting"
>> of the original font as a separate step *conceptually*? That is, couldn't
>> we describe the behavior of the IFT-encoded font in relation to the
>> "original font", while acknowledging that one might have some reason to
>> subset a given actual font in a given situation before encoding it? Even
>> if some encoder has that functionality built in, I don't think we have to
>> account for it in the spec.
>>
>> This is effectively equivalent to framing the equivalence requirement
>> around the fully expanded IFT font, which could be a subset of some
>> original font. However, using an original font is more restrictive because
>> it requires there to be an original font. That creates the assumption that
>> IFT is an encoding process that is applied only to existing fonts. I'm
>> advocating that we treat an IFT font + patches as a self-contained font
>> that could have been produced by any number of different means, including
>> directly from authoring sources or by transforming an existing font.
>> Essentially we start from the point that an IFT font exists and don't care
>> about how it was produced. Once we have an existing IFT font we impose
>> certain requirements on it (and in turn the process that produced it) to
>> ensure it will behave consistently under the application of the client
>> side algorithms.
>>
>> 2. The closure requirement section in the current IFTB docs is part of the
>> explanation of how to build an IFTB encoder: "you need to solve this
>> problem when putting glyphs in bins". Just saying that the general
>> behavior of the font needs to be preserved in relation to something
>> *somehow* is not, in my opinion, close to enough guidance. So one way or
>> another we're going to have to figure out how to add a section with very
>> close to that sort of information, including the "collapse bins,
>> duplicate, or put into bin 0" stuff and what those decisions need to
>> preserve.
>>
>> I'm planning on including non-normative guidance on how an encoder should
>> function in the new encoder section; this email was meant primarily to
>> discuss the normative requirements we want to include in that section in
>> addition to the non-normative guidance.
>>
>> 3. If I were on the W3C review committee and saw the document we seem to
>> be heading towards, I think I would say something like "If you've made the
>> spec so complicated and/or flexible that you aren't able to concisely
>> explain what the encoding process needs to do to preserve a font's
>> behavior on the client side, maybe you've made the spec too complicated
>> and/or flexible."
>>
>> It's not a specification's job to explain how something should be done.
>> The spec exists primarily to enable interoperability of different
>> implementations. As such, the main focus will be on the details critical
>> to interoperability; that's why there's such a heavy focus on the client
>> side in the current draft. If the client side behaviour is well defined,
>> then encoders can predict exactly how the client will operate given a
>> particular encoding. This allows an encoder to generate an encoding that
>> produces whatever outcome the encoder is targeting. A good example of this
>> is the brotli specification
>> (https://datatracker.ietf.org/doc/html/rfc7932): it describes how the
>> brotli encoding format works and how to decompress it, but provides no
>> guidance on how to actually build a high quality compressor
>> implementation, which is a significantly complicated undertaking (see:
>> https://datatracker.ietf.org/doc/html/rfc7932#section-11).
>>
>> I don't agree with the assertion that we've made it so complicated that we
>> can't explain how to maintain functional equivalence during augmentation:
>>
>> 1. We already have two prototype encoder implementations (the IFTB one,
>> and the one I wrote) that demonstrate how to maintain the functional
>> equivalence property in an encoder implementation. These two prototype
>> encoders are also good quality (with room for improvement, of course), as
>> they've already demonstrated smaller transfer costs than existing font
>> loading methods. We will also want to write some documentation outside of
>> the specification that discusses in much more detail how to approach
>> building an encoder based on what we've learned developing the prototypes.
>> This could be referenced from the spec as a non-normative reference.
>>
>> 2. A simple but functional encoder can be demonstrated using only a font
>> subsetter and some basic recursive logic (plus serialization logic for the
>> IFT table, but that's straightforward). Several high quality open source
>> font subsetters exist, and font subsetting at this point is a well
>> understood problem.
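>>
>> To make point 2 concrete, here's a rough sketch of that
>> subsetter-plus-recursion idea in Python using fontTools. The codepoint
>> segments and output names below are placeholders, and the IFT table and
>> patch serialization are omitted entirely; a real encoder would diff
>> successive levels into patches rather than writing whole fonts.
>>
>>   # Rough sketch only: an encoder skeleton built from an off-the-shelf
>>   # subsetter. The subsetter does the glyph-closure work for us.
>>   from fontTools.ttLib import TTFont
>>   from fontTools.subset import Subsetter, Options
>>
>>   def subset_for(codepoints, source_path):
>>       """Return a subset whose glyph closure covers `codepoints`."""
>>       font = TTFont(source_path)
>>       options = Options()
>>       options.layout_features = ["*"]   # keep all layout features
>>       options.notdef_outline = True
>>       subsetter = Subsetter(options)
>>       subsetter.populate(unicodes=codepoints)
>>       subsetter.subset(font)            # glyph closure is computed here
>>       return font
>>
>>   # Hypothetical segmentation of the codepoint space.
>>   segments = [set(range(0x0020, 0x007F)),  # Basic Latin -> initial font
>>               set(range(0x00A0, 0x0100)),  # Latin-1 Supplement -> patch 1
>>               set(range(0x0100, 0x0180))]  # Latin Extended-A -> patch 2
>>
>>   def encode(source_path, segments):
>>       """Produce one subset per cumulative segment union; each level
>>       supports all codepoints of the previous level plus one segment."""
>>       covered = set()
>>       for i, segment in enumerate(segments):
>>           covered |= segment
>>           level = subset_for(covered, source_path)
>>           level.save(f"level_{i}.ttf")  # placeholder output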
>>
>> I do plan to add some guidance around how to approach an encoder to the
>> specification, as well as more explanation of how the pieces all fit
>> together and operate, as I think that will help readers understand how the
>> technology is meant to function.
>>
>> Skef
>> ------------------------------
>> *From:* Garret Rieger <grieger@google.com>
>> *Sent:* Tuesday, February 6, 2024 3:06 PM
>> *To:* w3c-webfonts-wg (public-webfonts-wg@w3.org) <public-webfonts-wg@w3.org>
>> *Subject:* Some More Thoughts on Adding Encoder Requirements
>>
>> Following discussions at the last wg call I’ve been thinking about what
>> requirements we may want to place on the encoder/encoded fonts. Here’s
>> what I’ve come up with.
>>
>> First it helps to look at the requirements that previous IFT proposals
>> have imposed on the server side/encoder side of the technology.
>>
>> - Binned Incremental Font Transfer: this approach uses what's called the
>>   “closure” requirement:
>>
>>   “The set of glyphs contained in the chunks loaded through the GID and
>>   feature maps must be a superset of those in the GID closure of the font
>>   subset description.”
>>
>> - Patch Subset: this approach uses a rendering equivalence requirement:
>>
>>   “When a subsetted font is used to render text using any combination of
>>   the subset codepoints, layout features
>>   <https://docs.microsoft.com/en-us/typography/opentype/spec/featuretags#>,
>>   or design-variation space
>>   <https://docs.microsoft.com/en-us/typography/opentype/spec/otvaroverview#terminology>
>>   it must render identically to the original font. This includes rendering
>>   with the use of any optional typographic features that a renderer may
>>   choose to use from the original font, such as hinting instructions.”
>>
>> In thinking it through, I don’t think we’ll be able to use the glyph
>> closure requirement from IFTB in the new IFT specification. The closure
>> requirement in the context of the new IFT approach is too narrow: to be
>> correct it assumes glyph ids, outlines, and non-outline data are stable.
>> However, under the new IFT proposal glyph ids, outlines, and all other
>> data in the font are allowed to change while still maintaining functional
>> equivalence to the original font.
>>
>> Ultimately what we want is a requirement that ensures the intermediate
>> subsets are functionally equivalent to some font, which naturally leads to
>> something like the rendering equivalence requirement used in patch subset.
>> However, this also has some issues which I’ll discuss later.
>>
>> The next thing we need to define is equivalence to what? The two previous
>> approaches define equivalence to the “original font”. However, I think
>> this is somewhat problematic:
>>
>> - It’s quite likely that most encoder implementations will want to
>>   optionally perform some amount of subsetting on the input font as part
>>   of generating the encoding. The current prototype encoder does exactly
>>   that. The IFTB draft spec states that any subsetting operations must be
>>   done prior to invoking the encoder, but I think forcing those to be
>>   discrete operations artificially limits encoder implementations.
>>
>> - In some cases there may not actually be an “original font”. Consider the
>>   case where IFT encoding is built directly into a font editor or font
>>   compilation tool chain.
>>   You could be going directly from font authoring sources to an IFT
>>   encoded font.
>>
>> Instead we should look at the IFT font and collection of patches as
>> logically being a font, which may be derived from and equivalent to some
>> original non-incremental font, or a subset of that original font. From the
>> IFT base font and collection of patches we can produce the fully expanded
>> font that it represents using the extension algorithm
>> <https://garretrieger.github.io/IFT/Overview.html#extending-font-subset>
>> with an input subset definition whose sets match all things. The iteration
>> will eventually terminate and end with a font that is fully expanded and
>> no longer incremental.
>>
>> I suggest that we use this fully expanded font as the basis from which to
>> require equivalence. Specifically, we could require that the font produced
>> after any application of the extension algorithm
>> <https://garretrieger.github.io/IFT/Overview.html#extending-font-subset>
>> is a subset (as defined here
>> <https://garretrieger.github.io/IFT/Overview.html#font-subset-info>) of
>> the fully expanded font.
>>
>> In addition, we want the IFT augmentation process to be consistent: when
>> utilizing dependent patches, the extension algorithm
>> <https://garretrieger.github.io/IFT/Overview.html#extending-font-subset>
>> allows the order of application of dependent patches to be selected by the
>> implementation. We should require that all possible orderings of patch
>> application produce the same end result. The current definition of the
>> extension algorithm
>> <https://garretrieger.github.io/IFT/Overview.html#extending-font-subset>
>> already implicitly makes this assumption.
>>
>> Concretely, here’s what I’m considering adding to the current draft:
>>
>> 1. Define how to fully expand an IFT font (this definition can also be
>>    re-used in the offline section).
>>
>> 2. Add a section which provides guidance and requirements for encoder
>>    implementations. Add that, for any valid subset definition, after the
>>    application of the extension algorithm
>>    <https://garretrieger.github.io/IFT/Overview.html#extending-font-subset>
>>    on an IFT font:
>>
>>    a. The encoding MUST be consistent: the resulting font must be equal
>>       (binary equivalence) regardless of the ordering of patch application
>>       chosen during the execution of the extension algorithm.
>>
>>    b. The result SHOULD (strongly encouraged) render equivalently to the
>>       fully expanded font when using content fully covered by the subset
>>       definition.
>>
>> I’m proposing we use SHOULD for requirement 2b because it will be
>> problematic to define rendering equivalence in a way that we could
>> reasonably conformance test. The main issue I see is that there are many
>> existing font renderers that can in some cases have different behaviours
>> (e.g. shaping implementations are definitely not consistent). Testing
>> against all renderers is obviously infeasible, and specifying one specific
>> renderer to test against is not good either. Also, you could end up in a
>> case where it isn’t possible to satisfy the requirement on two different
>> renderers at the same time (probably unlikely, but not something that can
>> be ruled out). Lastly, as far as I’m aware there’s no canonical
>> specification for how a font is rendered, so we can’t reference that
>> either.
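>>
>> Requirement 2a, by contrast, is straightforward to check mechanically. As
>> a rough sketch (in Python; apply_patch below is a stand-in for the
>> patch-application step of the extension algorithm, not a real API): apply
>> the dependent patches in every possible order and confirm the results are
>> byte-identical.
>>
>>   from itertools import permutations
>>
>>   def apply_patch(font_bytes: bytes, patch_bytes: bytes) -> bytes:
>>       """Placeholder for applying one IFT patch to a font."""
>>       raise NotImplementedError
>>
>>   def is_order_invariant(base: bytes, patches: list[bytes]) -> bool:
>>       """Check requirement 2a on a toy scale: every ordering of the
>>       dependent patches must yield the same bytes."""
>>       results = set()
>>       for ordering in permutations(patches):
>>           font = base
>>           for patch in ordering:
>>               font = apply_patch(font, patch)
>>           results.add(font)        # bytes are hashable, so set() dedupes
>>       return len(results) == 1     # exactly one distinct result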
>>
>> Another option is to use a weaker equivalence requirement based around
>> something like codepoint presence in the cmap, but as that would provide
>> no useful guarantees as to the end result of rendering, I'm not sure it's
>> worthwhile.
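>>
>> For what it's worth, that weaker check is trivial to express, which also
>> shows how little it guarantees. A sketch with fontTools (file paths are
>> placeholders):
>>
>>   from fontTools.ttLib import TTFont
>>
>>   def cmap_codepoints(path: str) -> set[int]:
>>       """Codepoints mapped by the font's best available cmap subtable."""
>>       return set(TTFont(path).getBestCmap().keys())
>>
>>   def weak_equivalence(extended_path, expanded_path, requested: set[int]):
>>       """True if every requested codepoint mapped by the fully expanded
>>       font is also mapped by the extended font. Says nothing about glyph
>>       data, layout, or rendering."""
>>       needed = cmap_codepoints(expanded_path) & requested
>>       return needed <= cmap_codepoints(extended_path)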
Received on Thursday, 15 February 2024 01:12:25 UTC