- From: Frédéric WANG <fred.wang@free.fr>
- Date: Sun, 18 Oct 2015 22:19:36 +0200
- To: www-font@w3.org
- Message-ID: <5623FED8.5050202@free.fr>
Dear WebFonts Working Group,

Please find below my personal comments after a first reading of the WOFF 2.0 specification (without any attempt to implement it). I hope they will be helpful.

Cheers,

Frédéric Wang

> The input font file may contain a number of various font data tables
> described in the clause 5 of the [OFF] specification.

Could you please update the reference link to use edition 3 of the Open Font Format specification, now that it is released? Especially since the new MATH table is mentioned in the "Known Table Tags".

> File header with basic font type and version, along with offsets to
> metadata and private data blocks.

> An optional block of extended metadata, represented in XML format and
> compressed for storage in the WOFF2 file.

There are line breaks after "private" and after "compressed". I'm not sure whether they are intentional.

> The pseudo-code describing how to read the 255UInt16 format is
> presented below:

I think it should be mentioned that this is "C-like" pseudo-code, to make clear that the bitwise operators << and & are used. More generally, it should probably be said somewhere in the introduction that the "0x..." notation used everywhere in this specification corresponds to hexadecimal values.

> An encoder should choose shorter encodings, and must be consistent in
> choice of encoding for the same value, as this will tend to compress
> better.

I have the impression that the specification should explicitly provide a way to obtain a shortest encoding, for example some pseudo-code describing the following transformation (a concrete sketch of such an encoder is given further below):

    0 ≤ value < lowestUCode                      ---> [value on 1 byte]
    lowestUCode ≤ value < 2*lowestUCode          ---> [oneMoreByteCode1][value - lowestUCode on 1 byte]
    2*lowestUCode ≤ value < 2*lowestUCode + 256  ---> [oneMoreByteCode2][value - 2*lowestUCode on 1 byte]
    value ≥ 2*lowestUCode + 256                  ---> [wordCode][value on 2 bytes]

Actually, this is probably insignificant, but I wonder why [oneMoreByteCode2][code] is not interpreted as lowestUCode + 256 + code, so that the code1 and code2 encodings do not overlap. Does the encoding proposed in the spec tend to compress better?

> Thus, a decoding procedure for a UIntBase128 is: start with value = 0.
> Consume a byte, setting value = old value times 128 + (byte
> bitwise-and 127). Repeat last step until the most significant bit of
> byte is false.

I personally feel that this paragraph would be better presented with pseudo-code.

> An encoder must not allow this to happen and must produce shortest
> possible encoding.

Here too I believe there should be some simple pseudo-code to explain the canonical and optimal way to write the 32 bits of the integer on at most five 7-bit blocks (even if that is obvious once you think about it).
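For illustration, here is a possible C sketch of both directions; the function names and the output-pointer convention are mine, not the specification's. The decoder transcribes the quoted paragraph, together with the validity checks the spec requires (no leading zeros, at most five bytes, no 32-bit overflow); the encoder produces the shortest form by skipping leading all-zero 7-bit blocks:

    #include <stdbool.h>
    #include <stdint.h>

    /* Decoder: value = value * 128 + (byte & 0x7F), repeated while the
       most significant bit of the byte is set, for at most 5 bytes. */
    static bool read_uintbase128(const uint8_t **in, uint32_t *result)
    {
        uint32_t value = 0;
        for (int i = 0; i < 5; i++) {
            uint8_t byte = *(*in)++;
            if (i == 0 && byte == 0x80)   /* leading zeros are forbidden */
                return false;
            if (value & 0xFE000000)       /* value * 128 would overflow 32 bits */
                return false;
            value = (value << 7) | (byte & 0x7F);
            if ((byte & 0x80) == 0) {     /* most significant bit false: done */
                *result = value;
                return true;
            }
        }
        return false;                     /* sequence longer than 5 bytes */
    }

    /* Shortest encoder: write the value as at most five 7-bit blocks,
       most significant block first, skipping leading zero blocks. */
    static void write_uintbase128(uint8_t **out, uint32_t value)
    {
        int shift = 28;                   /* highest possible 7-bit block */
        while (shift > 0 && ((value >> shift) & 0x7F) == 0)
            shift -= 7;
        for (; shift > 0; shift -= 7)
            *(*out)++ = 0x80 | ((value >> shift) & 0x7F);
        *(*out)++ = value & 0x7F;         /* last block, high bit clear */
    }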
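Similarly, going back to the 255UInt16 point above, here is a rough C sketch of the shortest encoding I have in mind, using the constants of the spec's Read255UShort pseudo-code (lowestUCode = 253, wordCode = 253, oneMoreByteCode1 = 255, oneMoreByteCode2 = 254); put_byte is a hypothetical helper, not part of the spec:

    #include <stdint.h>

    /* Hypothetical helper appending one byte to an output buffer. */
    static void put_byte(uint8_t **out, uint8_t b) { *(*out)++ = b; }

    /* Shortest 255UInt16 encoding, following the case analysis above. */
    static void write_255uint16(uint8_t **out, uint16_t value)
    {
        const uint16_t lowestUCode = 253;
        const uint8_t wordCode = 253, oneMoreByteCode2 = 254, oneMoreByteCode1 = 255;

        if (value < lowestUCode) {
            put_byte(out, (uint8_t)value);                  /* 1 byte */
        } else if (value < 2 * lowestUCode) {
            put_byte(out, oneMoreByteCode1);                /* 2 bytes */
            put_byte(out, (uint8_t)(value - lowestUCode));
        } else if (value < 2 * lowestUCode + 256) {
            put_byte(out, oneMoreByteCode2);                /* 2 bytes */
            put_byte(out, (uint8_t)(value - 2 * lowestUCode));
        } else {
            put_byte(out, wordCode);                        /* 3 bytes */
            put_byte(out, (uint8_t)(value >> 8));
            put_byte(out, (uint8_t)(value & 0xFF));
        }
    }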
> The interpretation of the WOFF2 Header is the same as the WOFF Header
> in [WOFF1].

Maybe it should be highlighted that the only new field with respect to WOFF version 1 is totalCompressedSize?

> The font directory section consist of

"consist" → "consists"

> Whether a table tag is encoded with a known table tag or explicitly
> including the four-byte tag has no semantic significance; it is simply
> a choice of encoding intended to improve compression efficiency.

> The known table flag values should not be relied upon in determining
> the presence of the transformed tables, it is feasible that e.g. the
> glyf table can be represented in the table directory with either flag
> = 10 and no tag, or with flag = 63 and 'glyf' tag that follows.

I wonder why the encoder is not forced to use the known flag when it is available for a table? Of course the size gain is negligible and the table directory is not compressed anyway, but that would be consistent with other places in the spec where it tries to get the optimal encoding when possible.

> if the table is transformed, then the version number of the applied
> transform is defined by the two most significatn flag bits

"significatn" → "significant"

> The decompressed and reconstructed table data MUST be stored in the
> format specified by the [OFF] specification. Each reconstructed table
> directory entry MUST contain a valid 'checkSum' value, the decoder
> MUST recalculate the checkSum value for each decoded table. Also, due
> to modifying transforms applied to glyf and loca tables, the decoder
> MUST recalculate the checkSumAdjustmentglyf value of the entire font
> and MUST store the updated value in the head table.

"checkSumAdjustmentglyf" should be "checkSumAdjustment", I guess.

I find this paragraph a bit confusing. It seems to suggest that only the checkSumAdjustment may change, and only "due to modifying transforms applied to glyf and loca tables". At the same time, it seems to ask decoders to recalculate all the checksums, without any verification of the checksums that may be unchanged. My understanding is that only transformed tables that cannot be reconstructed identically to the original will have their checksums invalidated (i.e. 'glyf' and 'loca', but not 'hmtx' or other tables), and that in general the overall checkSumAdjustment will be invalidated. Is that correct?

The possible binary changes are better explained in the "The WOFF 2.0 transformations applied to certain tables..." paragraph below that one; maybe this section should be reordered a bit, or this decoder-and-checksum paragraph should point to the paragraph below? Why isn't the decoder asked to verify the checksums that are guaranteed to be preserved, rather than to just recalculate everything? Is it a requirement of the OFF specification when bit 11 of the 'flags' field of the head table is set?

> The total number of bytes in bboxBitmap is equal to 4 * ((numGlyphs +
> 31) / 32).

Again, C integer division is implicit here. I would suggest writing "4 * ⌊ (numGlyphs + 31) / 32 ⌋", or even better, if you want to enhance rendering in some web engines and assistive technologies:

<math><semantics><mrow><mn>4</mn><mrow><mo>⌊</mo><mfrac><mrow><mtext>numGlyphs</mtext><mo>+</mo><mn>31</mn></mrow><mn>32</mn></mfrac><mo>⌋</mo></mrow></mrow><annotation encoding="text/plain">4 * ⌊ (numGlyphs + 31) / 32 ⌋</annotation></semantics></math>
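For instance, a small C sketch making the floor explicit, together with one plausible way to test a glyph's bit (the MSB-first bit order within each byte is my assumption here and should be double-checked against the spec):

    #include <stddef.h>
    #include <stdint.h>

    /* Size in bytes of bboxBitmap: 4 * floor((numGlyphs + 31) / 32),
       i.e. one bit per glyph, padded to a multiple of 4 bytes. */
    static size_t bbox_bitmap_size(uint16_t numGlyphs)
    {
        return 4 * (((size_t)numGlyphs + 31) / 32);  /* integer division == floor */
    }

    /* Whether glyph i has an explicitly encoded bounding box, assuming
       bit 0 is the most significant bit of the first byte. */
    static int bbox_bit_is_set(const uint8_t *bboxBitmap, uint16_t i)
    {
        return (bboxBitmap[i / 8] >> (7 - (i % 8))) & 1;
    }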
> Upon reading the Transformed glyf Table , the decoding process
> iterates one glyph at a time. For each glyph, it reads zero or more
> bytes from each of the streams referenced in the Transformed glyf Table .

There are extra spaces before the comma and the period.

> FLAG_MORE_COMPONENTS bit (1 << 5), FLAG_WE_HAVE_INSTRUCTIONS bit (1 << 8)

Note that the Open Font Format (and TTF and OpenType) specifications use MORE_COMPONENTS and WE_HAVE_INSTRUCTIONS (without the FLAG_ prefix). Again, the C bit-shifting operator is used without prior mention. As a comparison, the font specifications use "bit 0", "bit 1", etc.

> If the hmtx table transform is both applicable and desired, the
> encoder MUST check that leftSideBearing values match the xMin values
> of the glyph bounding box for every glyph in a font file and, if the
> conditions are met for each of the proportional or monospaced glyph
> runs the encoder MUST set hmtx transform version number to "1", MUST
> eliminate the corresponding array from the hmtx table and MUST set the
> appropriate Flags bits.

I guess "leftSideBearing" must be understood as lsb[] and/or leftSideBearing[], and that "desired" means "desired by the encoder".
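To illustrate my reading of that requirement, here is a minimal C sketch of the applicability check, assuming the per-glyph lsb and xMin values have already been extracted from 'hmtx' and 'glyf' (it glosses over glyphs without contours, which have no bounding box):

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical per-glyph data gathered from 'hmtx' and 'glyf'. */
    struct glyph_metrics {
        int16_t lsb;   /* lsb[] / leftSideBearing[] entry from 'hmtx' */
        int16_t xMin;  /* xMin of the glyph's bounding box from 'glyf' */
    };

    /* The hmtx transform may only be applied if every glyph's left
       side bearing equals the xMin of its bounding box. */
    static bool hmtx_transform_applicable(const struct glyph_metrics *g,
                                          uint16_t numGlyphs)
    {
        for (uint16_t i = 0; i < numGlyphs; i++)
            if (g[i].lsb != g[i].xMin)
                return false;
        return true;
    }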