- From: Roderick Sheeter <rsheeter@google.com>
- Date: Tue, 3 Nov 2015 08:32:23 -0800
- To: "Levantovsky, Vladimir" <Vladimir.Levantovsky@monotype.com>
- Cc: Frédéric WANG <fred.wang@free.fr>, "www-font@w3.org" <www-font@w3.org>, "w3c-webfonts-wg (public-webfonts-wg@w3.org)" <public-webfonts-wg@w3.org>
- Message-ID: <CABscrrH4ZX-a18YJ6jrTWjUNFaYss-uokGhnPJHkOsE4e=j7jA@mail.gmail.com>
In the line 844/852 diff it looks like something unfortunate happened to the character encoding. The right hand side - 4 * ⌊(numGlyphs + 31) / 32⌋. - looks off.

I find (1 << 8) clearer than (bit 8). If we are going to say "bit 8" did we define somewhere how to interpret that or is there a standard interpretation?

On Mon, Nov 2, 2015 at 2:18 PM, Levantovsky, Vladimir <Vladimir.Levantovsky@monotype.com> wrote:

> Dear Frédéric,
>
> Thank you very much for taking the time to review the WOFF2 spec and providing your comments, please see the disposition notes inline. The most recent version of the Editor’s Draft WOFF2 specification is available at http://dev.w3.org/webfonts/WOFF2/spec/ and the implemented changes are at http://dev.w3.org/cvsweb/webfonts/WOFF2/spec/Overview.html.diff?r1=1.69;r2=1.70;f=h
>
> With kind regards,
>
> Vladimir
>
> *From:* Frédéric WANG [mailto:fred.wang@free.fr]
> *Sent:* Sunday, October 18, 2015 4:20 PM
> *To:* www-font@w3.org
> *Subject:* Comments on the WOFF 2.0 Working Draft 9 October 2015
>
> Dear WebFonts Working Group,
>
> Please find below my personal comments after a first reading of the WOFF 2.0 specification (without any attempts to implement it). I hope they will be helpful.
>
> Cheers,
>
> Frédéric Wang
>
> The input font file may contain a number of various font data tables described in the clause 5 of the [OFF] specification.
>
> Could you please update the reference link to use edition 3 of the Open Font Format specification now that it is released? Especially since the new MATH table is mentioned in the "Known Table Tags".
>
> [VL] Accepted, the link was updated to point to the 2015 (3rd edition) of the ISO document.
>
> File header with basic font type and version, along with offsets to metadata and private data blocks.
> An optional block of extended metadata, represented in XML format and compressed for storage in the WOFF2 file.
>
> There are new lines after "private" and after "compressed". I'm not sure whether it's on purpose.
>
> [VL] Edited to remove them.
>
> The pseudo-code describing how to read the 255UInt16 format is presented below:
>
> I think it should be mentioned that this is "C-like" pseudo-code to make clear that bitwise operators <<, & are used.
>
> [VL] Accepted
>
> In general maybe it should probably be said somewhere in the introduction that the "0x..." notations used everywhere in this specification correspond to hexadecimal values.
>
> [VL] Accepted
>
> An encoder should choose shorter encodings, and must be consistent in choice of encoding for the same value, as this will tend to compress better.
>
> I have the impression that the specification should explicitly provide a way to get a shortest encoding. For example some pseudo code describing the following transformation:
>
> 0 ≤ value < lowestUCode ---> [value on 1 byte]
> lowestUCode ≤ value < 2*lowestUCode ---> [oneMoreByteCode1][value - lowestUCode on 1 byte]
> 2*lowestUCode ≤ value < 2*lowestUCode + 256 ---> [oneMoreByteCode2][value - 2*lowestUCode on one byte]
> value ≥ 2*lowestUCode + 256 ---> [wordCode][value on 2 bytes]
>
> Actually, this is probably insignificant but I wonder why [oneMoreByteCode2][code] is not interpreted as lowestUCode + 256 + code so that code1 & code2 encodings won't overlap. Does the encoding proposed in the spec tend to compress better?
>
> [VL] Deferred for WG discussion
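
For what it's worth, a minimal C sketch of the shortest-encoding rule proposed above; the code values (wordCode = 253, oneMoreByteCode2 = 254, oneMoreByteCode1 = 255, lowestUCode = 253) are the ones used by the spec's Read255UShort pseudo-code, and write_byte() is a hypothetical output helper, not anything defined by the spec:

    #include <stdint.h>

    extern void write_byte(uint8_t b);    /* hypothetical output helper */

    /* Sketch only: emit the shortest 255UInt16 form of 'value' following the
     * four ranges listed above. */
    void Write255UInt16(uint16_t value) {
        if (value < 253) {                /* 0 <= value < lowestUCode */
            write_byte((uint8_t)value);
        } else if (value < 506) {         /* lowestUCode <= value < 2*lowestUCode */
            write_byte(255);              /* oneMoreByteCode1 */
            write_byte((uint8_t)(value - 253));
        } else if (value < 762) {         /* 2*lowestUCode <= value < 2*lowestUCode + 256 */
            write_byte(254);              /* oneMoreByteCode2 */
            write_byte((uint8_t)(value - 506));
        } else {                          /* value >= 2*lowestUCode + 256 */
            write_byte(253);              /* wordCode */
            write_byte((uint8_t)(value >> 8));
            write_byte((uint8_t)(value & 0xFF));
        }
    }
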
> Thus, a decoding procedure for a UIntBase128 is: start with value = 0. Consume a byte, setting value = old value times 128 + (byte bitwise-and 127). Repeat last step until the most significant bit of byte is false.
>
> I personally feel that this paragraph would be better presented with pseudo-code.
>
> An encoder must not allow this to happen and must produce shortest possible encoding.
>
> Here too I believe there should be some simple pseudo-code to explain the canonical & optimal way to write the 32bits of the integer on at most five 7-bit blocks (even if that's obvious once you think about it).
>
> [VL] Deferred for WG discussion
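
A rough C sketch of both sides, in the spirit of the pseudo-code requested above; read_byte() and write_byte() are hypothetical I/O helpers, and the "leading 0x80 byte is invalid, at most five bytes" checks follow the UIntBase128 constraints in the spec:

    #include <stdbool.h>
    #include <stdint.h>

    extern uint8_t read_byte(void);       /* hypothetical input helper  */
    extern void write_byte(uint8_t b);    /* hypothetical output helper */

    /* Decode: value = value * 128 + (byte & 127) until the high bit is clear.
     * Returns false on a leading zero byte, 32-bit overflow, or > 5 bytes. */
    bool ReadUIntBase128(uint32_t *result) {
        uint32_t value = 0;
        for (int i = 0; i < 5; i++) {
            uint8_t byte = read_byte();
            if (i == 0 && byte == 0x80) return false;  /* leading zeros forbidden */
            if (value & 0xFE000000) return false;      /* value * 128 would overflow */
            value = (value << 7) | (byte & 0x7F);
            if ((byte & 0x80) == 0) {                  /* high bit clear: last byte */
                *result = value;
                return true;
            }
        }
        return false;                                  /* more than 5 bytes */
    }

    /* Encode: write the 7-bit groups most-significant first, dropping leading
     * all-zero groups, which yields the unique shortest (1- to 5-byte) form. */
    void WriteUIntBase128(uint32_t value) {
        int groups = 1;
        for (uint32_t v = value; v >= 128; v >>= 7) groups++;
        for (int i = groups - 1; i >= 0; i--) {
            uint8_t byte = (uint8_t)((value >> (7 * i)) & 0x7F);
            if (i > 0) byte |= 0x80;                   /* continuation bit */
            write_byte(byte);
        }
    }
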
> The interpretation of the WOFF2 Header is the same as the WOFF Header in [WOFF1].
>
> Maybe it should be highlighted that the only new field with respect to WOFF version 1 is totalCompressedSize?
>
> [VL] Accepted
>
> The font directory section consist of
>
> consists
>
> [VL] Fixed, thank you
>
> Whether a table tag is encoded with a known table tag or explicitly including the four-byte tag has no semantic significance; it is simply a choice of encoding intended to improve compression efficiency.
>
> The known table flag values should not be relied upon in determining the presence of the transformed tables, it is feasible that e.g. the glyf table can be represented in the table directory with either flag = 10 and no tag, or with flag = 63 and 'glyf' tag that follows.
>
> I wonder why the encoder is not forced to use the known flag when it is available for a table? Of course the size gain is negligible and the table directory is not compressed anyway, but that would be consistent with other places of the spec where it tries to get the optimal encoding when possible.
>
> [VL] Deferred for WG discussion
>
> if the table is transformed, then the version number of the applied transform is defined by the two most significatn flag bits
>
> significant
>
> [VL] Fixed, thank you
>
> The decompressed and reconstructed table data MUST be stored in the format specified by the [OFF] specification. Each reconstructed table directory entry MUST contain a valid 'checkSum' value, the decoder MUST recalculate the checkSum value for each decoded table. Also, due to modifying transforms applied to glyf and loca tables, the decoder MUST recalculate the checkSumAdjustmentglyf value of the entire font and MUST store the updated value in the head table.
>
> "checkSumAdjustmentglyf" should be "checkSumAdjustment", I guess.
>
> [VL] Fixed, thank you
>
> I find that this paragraph is a bit confusing. It seems to suggest that only the checkSumAdjustment may change and that it is only 'due to modifying transforms applied to glyf and loca tables'. At the same time it seems to ask the decoders to recalculate all the checksums without any verification for possible unchanged checksums.
>
> My understanding is that only transformed tables that can not be reconstructed identically to the original will have the checksum invalidated (i.e. 'glyf' and 'loca' but not 'hmtx' or other tables). And that in general the overall checkSumAdjustment will be invalidated. Is that correct?
>
> [VL] Yes, this is correct.
>
> The possible binary changes are better explained in the "The WOFF 2.0 transformations applied to certain tables...." paragraph below that one, maybe this section should be reordered a bit or this decoder+checksum paragraph should point to the paragraph below?
>
> Why isn't the decoder asked to verify the checksum that are guaranteed to be preserved and to just recalculate everything? Is it a requirement of the OFF specification when bit 11 of the 'flags' field of the head is set?
>
> [VL] Deferred for WG discussion
>
> The total number of bytes in bboxBitmap is equal to 4 * ((numGlyphs + 31) / 32).
>
> Again, C integer-division is implicit here. I would suggest writing "4 * ⌊ (numGlyphs + 31) / 32 ⌋" or even better if you want to enhance rendering in some web engines and assistive technologies:
>
> <math><semantics><mrow><mn>4</mn><mrow><mo>⌊</mo><mfrac><mrow><mtext>numGlyphs</mtext><mo>+</mo><mn>31</mn></mrow><mn>32</mn></mfrac><mo>⌋</mo></mrow></mrow><annotation encoding="text/plain">4 * ⌊ (numGlyphs + 31) / 32 ⌋</annotation></semantics></math>
>
> [VL] I modified the notation to reflect the nature of implicit integer division.
>
> Upon reading the Transformed glyf Table , the decoding process iterates one glyph at a time. For each glyph, it reads zero or more bytes from each of the streams referenced in the Transformed glyf Table .
>
> There are extra spaces before the comma and the period.
>
> [VL] Fixed, thank you
>
> FLAG_MORE_COMPONENTS bit (1 << 5), FLAG_WE_HAVE_INSTRUCTIONS bit (1 << 8)
>
> Note that the Open Font Format (and TTF and OpenType) specifications use MORE_COMPONENTS and WE_HAVE_INSTRUCTIONS (without the FLAG_ prefix). Again, C bit shifting operator is used without prior mention. As a comparison, the font specifications use "bit 0", "bit 1" etc
>
> [VL] Fixed, thank you
>
> If the hmtx table transform is both applicable and desired, the encoder MUST check that leftSideBearing values match the xMin values of the glyph bounding box for every glyph in a font file and, if the conditions are met for each of the proportional or monospaced glyph runs the encoder MUST set hmtx transform version number to "1", MUST eliminate the corresponding array from the hmtx table and MUST set the appropriate Flags bits.
>
> I guess "leftSideBearing" must be understood as lsb[] and/or leftsideBearing[] ; and that "desired" means "desired by the encoder".
>
> [VL] lsb[] and leftSideBearing[] here are the references to identically named arrays of the hmtx table (http://www.microsoft.com/typography/otspec/hmtx.htm). Desirability of transform will most likely be determined by the outside context, e.g. one may want to apply all applicable transforms when encoding a full font file but skip the transforms and improve performance if a small font subset is encoded as WOFF2 and optimizing already reduced tables doesn’t make much difference in terms of compression efficiency.
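
A minimal C sketch of the applicability check that paragraph describes: the hmtx transform is only usable when every glyph's left side bearing equals the xMin of its glyf bounding box. The glyph_count, lsb[] and xmin[] inputs are hypothetical, and how empty (zero-contour) glyphs, which have no real bounding box, should be treated is not shown here:

    #include <stdbool.h>
    #include <stdint.h>

    /* Sketch only: returns true when lsb[i] == xmin[i] for every glyph,
     * i.e. when the lsb array is redundant and may be eliminated by the
     * version 1 hmtx transform. */
    static bool hmtx_transform_applicable(int glyph_count,
                                          const int16_t *lsb,
                                          const int16_t *xmin) {
        for (int i = 0; i < glyph_count; i++) {
            if (lsb[i] != xmin[i]) {
                return false;   /* dropping lsb would lose information */
            }
        }
        return true;
    }
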
Received on Tuesday, 3 November 2015 16:32:52 UTC