- From: Garret Rieger <grieger@google.com>
- Date: Fri, 27 Oct 2023 19:22:44 -0600
- To: Skef Iterum <siterum@adobe.com>
- Cc: "public-webfonts-wg@w3.org" <public-webfonts-wg@w3.org>
- Message-ID: <CAM=OCWbyDzywwXLL9rGrXwOY+G+BgkfLiZQnU3uPN7im2YELFQ@mail.gmail.com>
The approach that I had in mind would be something along the lines of encoding the path along the graph in the id string. Except you don't need to encode the full path; you only need to identify the current node and the destination node to the server. This is because it doesn't matter how you reach a node: the subset at that node will always be the same. For example, loading subset a, then b, then c is no different from loading subset a, then c, then b. Either path lands you on the same font, which is the union of subsets a, b, and c.

Given that, this is how I envisioned implementing a dynamic version (working off the assumption that the configuration is fixed):

1. You start with a list of subset definitions that the input font will be partitioned across.

2. Each of these is assigned a numeric id (assume this mapping is fixed).

3. The id string for a given patch is then formed by encoding two sets into a binary representation: first the set of ids for partitions that the current file has, second the set of ids to be added. This could be done using SparseBitSets (or some other binary encoding of a set of integers). The binary encoding is then run through base64 to produce a URL-safe string token that identifies that particular patch.

4. Now your dynamic backend, upon receiving a request with a particular id string, can reverse the base64 and decode the binary encoding to reconstruct the two sets. This gives it all the information it needs to produce two subsets: one that matches what the client currently has and one that is an extended version. From there the shared brotli patch can be created.

In this model the patch would also update the IFT table in the font, and in particular would replace all of the id strings to reflect the change to the current subset. An important property of this setup is that at no point do we have to calculate the full graph ahead of time. The graph emerges dynamically as you start walking it.
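As a rough illustration of the id string encoding and decoding described above, here is a minimal Python sketch. It is only one possible reading: it substitutes a simple length-prefixed bitmask for SparseBitSet, and the function names (encode_patch_id, decode_patch_id, mappings_after) are hypothetical, not taken from any IFT implementation or specification.

```python
import base64


def _set_to_bytes(ids):
    """Pack a set of small non-negative ints into a length-prefixed bitmask.

    Assumption: a plain bitmask stands in for SparseBitSet here.
    """
    mask = 0
    for i in ids:
        mask |= 1 << i
    body = mask.to_bytes((mask.bit_length() + 7) // 8, "little")
    if len(body) > 255:
        raise ValueError("set too large for a one-byte length prefix")
    return bytes([len(body)]) + body


def _bytes_to_set(data, offset):
    """Read one length-prefixed bitmask; return (set, offset of next field)."""
    length = data[offset]
    body = data[offset + 1 : offset + 1 + length]
    mask = int.from_bytes(body, "little")
    ids = frozenset(i for i in range(mask.bit_length()) if (mask >> i) & 1)
    return ids, offset + 1 + length


def encode_patch_id(have, add):
    """Build the URL-safe token for 'client has `have`, wants to add `add`'."""
    raw = _set_to_bytes(have) + _set_to_bytes(add)
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def decode_patch_id(token):
    """Server side: recover the (have, add) sets from a token."""
    padded = token + "=" * (-len(token) % 4)  # restore stripped padding
    raw = base64.urlsafe_b64decode(padded)
    have, offset = _bytes_to_set(raw, 0)
    add, _ = _bytes_to_set(raw, offset)
    return have, add


def mappings_after(have, add, all_ids):
    """Id strings the extended font's IFT table would list, keyed by
    the partition id each patch adds (single-partition patches only)."""
    now = frozenset(have) | frozenset(add)
    return {i: encode_patch_id(now, {i}) for i in sorted(set(all_ids) - now)}
```

With four partitions numbered 0 to 3, a server could call decode_patch_id on the requested token to get the two sets, cut the two subsets from them, and use mappings_after({0}, {2}, range(4)) to obtain the id strings for the remaining partitions 1 and 3. Because the token is built from sets, the order in which earlier patches were applied has no effect on it.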
This all probably sounds pretty familiar because it essentially acts like a simplified version of the fully dynamic patch subset approach.

To give a concrete example, let's say we have a font and want to partition it into 4 subsets: latin, greek, cyrillic, vietnamese. The root contains latin, and we assign the subsets numeric ids:

  latin -> 0
  greek -> 1
  cyrillic -> 2
  vietnamese -> 3

The IFT mapping table of the base font will then list three patches. The mapping from subset def to ids would be:

  greek -> [{0}, {1}]
  cyrillic -> [{0}, {2}]
  vietnamese -> [{0}, {3}]

The client wanting to add cyrillic to its font sends a request to a URL containing the cyrillic id string [{0}, {2}]. The server can decode that id and from it cut two subsets: one that contains latin and one that contains latin and cyrillic. To the second subset an IFT table is added with updated mappings:

  greek -> [{0,2}, {1}]
  vietnamese -> [{0,2}, {3}]

Finally the server computes the binary diff between these two subsets and returns that to the client.

In this example it would also be possible for the IFT patch mapping to contain combinations of subsets. For example:

  greek -> [{0}, {1}]
  cyrillic -> [{0}, {2}]
  vietnamese -> [{0}, {3}]
  greek + cyrillic -> [{0}, {1, 2}]
  greek + vietnamese -> [{0}, {1, 3}]
  cyrillic + vietnamese -> [{0}, {2, 3}]
  greek + cyrillic + vietnamese -> [{0}, {1, 2, 3}]

This would allow the client to jump to any combination of subsets as the next step.

On Tue, Oct 24, 2023 at 12:46 PM Skef Iterum <siterum@adobe.com> wrote:

> As today's discussion is sinking in there's one thing I'm curious about:
>
> With static IFT under the new proposal the encoder will take the font file
> and some configuration and arrive at a patch graph, perhaps all at once and
> perhaps step by step but either way (I presume) starting from the root. In
> that model the URL for each patch file is just a token embedded in two
> types of place (the map in the source file, the name of the target file).
> So they could be picked at random for all it matters.
>
> With the new proposal for dynamic IFT the "target file" won't exist. That
> means that the URL in that case needs to map, somehow, to a pair of
> parameter sets (codepoints, features, axes): the parameter set of the
> source file, and the parameter set of the target file.
>
> Assuming the configuration is fixed, and therefore the graph for a given
> file will be deterministic, one way to do this is to drive the URL has a
> hash of the two parameter sets and then walk the whole graph on each
> request to find the right node. With a lot of nodes this might be costly.
>
> Alternatively, the URL could encode a path along the graph, and then you
> would just need to generate and walk those particular nodes, assuming
> that's possible.
>
> A third option is to require that the encoder generate and output the
> entire graph even in any partially or fully dynamic use case, and then the
> server side could consult the file to get the mapping (with the storage
> format presumably optimized for this). The map might be large but if it's
> only on the server side that's probably not of much importance.
>
> So what I'm wondering is which of these strategies is the current
> thinking, or is there some better option?
>
> Skef
>
Received on Saturday, 28 October 2023 01:23:09 UTC