- From: Roderick Sheeter <rsheeter@google.com>
- Date: Tue, 3 Nov 2015 08:32:23 -0800
- To: "Levantovsky, Vladimir" <Vladimir.Levantovsky@monotype.com>
- Cc: Frédéric WANG <fred.wang@free.fr>, "www-font@w3.org" <www-font@w3.org>, "w3c-webfonts-wg (public-webfonts-wg@w3.org)" <public-webfonts-wg@w3.org>
- Message-ID: <CABscrrH4ZX-a18YJ6jrTWjUNFaYss-uokGhnPJHkOsE4e=j7jA@mail.gmail.com>
In the line 844/852 diff it looks like something unfortunate happened to the character encoding. The right hand side - 4 * ⌊(numGlyphs + 31) / 32⌋. - looks off.

I find (1 << 8) clearer than (bit 8). If we are going to say "bit 8" did we define somewhere how to interpret that or is there a standard interpretation?

On Mon, Nov 2, 2015 at 2:18 PM, Levantovsky, Vladimir <Vladimir.Levantovsky@monotype.com> wrote:

> Dear Frédéric,
>
> Thank you very much for taking the time to review the WOFF2 spec and providing your comments, please see the disposition notes inline. The most recent version of the Editor’s Draft WOFF2 specification is available at http://dev.w3.org/webfonts/WOFF2/spec/ and the implemented changes are at http://dev.w3.org/cvsweb/webfonts/WOFF2/spec/Overview.html.diff?r1=1.69;r2=1.70;f=h
>
> With kind regards,
>
> Vladimir
>
> *From:* Frédéric WANG [mailto:fred.wang@free.fr]
> *Sent:* Sunday, October 18, 2015 4:20 PM
> *To:* www-font@w3.org
> *Subject:* Comments on the WOFF 2.0 Working Draft 9 October 2015
>
> Dear WebFonts Working Group,
>
> Please find below my personal comments after a first reading of the WOFF 2.0 specification (without any attempts to implement it). I hope they will be helpful.
>
> Cheers,
>
> Frédéric Wang
>
> The input font file may contain a number of various font data tables described in the clause 5 of the [OFF] specification.
>
> Could you please update the reference link to use edition 3 of the Open Font Format specification now that it is released? Especially since the new MATH table is mentioned in the "Known Table Tags".
>
> [VL] Accepted, the link was updated to point to the 2015 (3rd edition) of the ISO document.
>
> File header with basic font type and version, along with offsets to metadata and private data blocks.
> An optional block of extended metadata, represented in XML format and compressed for storage in the WOFF2 file.
>
> There are new lines after "private" and after "compressed". I'm not sure whether it's on purpose.
>
> [VL] Edited to remove them.
>
> The pseudo-code describing how to read the 255UInt16 format is presented below:
>
> I think it should be mentioned that this is "C-like" pseudo-code to make clear that bitwise operators <<, & are used.
>
> [VL] Accepted
>
> In general maybe it should probably be said somewhere in the introduction that the "0x..." notations used everywhere in this specification correspond to hexadecimal values.
>
> [VL] Accepted
>
> An encoder should choose shorter encodings, and must be consistent in choice of encoding for the same value, as this will tend to compress better.
>
> I have the impression that the specification should explicitly provide a way to get a shortest encoding. For example some pseudo code describing the following transformation:
>
> 0 ≤ value < lowestUCode ---> [value on 1 byte]
> lowestUCode ≤ value < 2*lowestUCode ---> [oneMoreByteCode1][value - lowestUCode on 1 byte]
> 2*lowestUCode ≤ value < 2*lowestUCode + 256 ---> [oneMoreByteCode2][value - 2*lowestUCode on one byte]
> value ≥ 2*lowestUCode + 256 ---> [wordCode][value on 2 bytes]
>
> Actually, this is probably insignificant but I wonder why [oneMoreByteCode2][code] is not interpreted as lowestUCode + 256 + code so that code1 & code2 encodings won't overlap. Does the encoding proposed in the spec tend to compress better?
>
> [VL] Deferred for WG discussion
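
For what it's worth, a minimal C sketch of the shortest-encoding rule proposed above; the code values (wordCode = 253, oneMoreByteCode2 = 254, oneMoreByteCode1 = 255, lowestUCode = 253) are the ones used by the spec's Read255UShort pseudo-code, and write_byte() is a hypothetical output helper, not anything defined by the spec:

    #include <stdint.h>

    extern void write_byte(uint8_t b);    /* hypothetical output helper */

    /* Sketch only: emit the shortest 255UInt16 form of 'value' following the
     * four ranges listed above. */
    void Write255UInt16(uint16_t value) {
        if (value < 253) {                /* 0 <= value < lowestUCode */
            write_byte((uint8_t)value);
        } else if (value < 506) {         /* lowestUCode <= value < 2*lowestUCode */
            write_byte(255);              /* oneMoreByteCode1 */
            write_byte((uint8_t)(value - 253));
        } else if (value < 762) {         /* 2*lowestUCode <= value < 2*lowestUCode + 256 */
            write_byte(254);              /* oneMoreByteCode2 */
            write_byte((uint8_t)(value - 506));
        } else {                          /* value >= 2*lowestUCode + 256 */
            write_byte(253);              /* wordCode */
            write_byte((uint8_t)(value >> 8));
            write_byte((uint8_t)(value & 0xFF));
        }
    }
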
> Thus, a decoding procedure for a UIntBase128 is: start with value = 0. Consume a byte, setting value = old value times 128 + (byte bitwise-and 127). Repeat last step until the most significant bit of byte is false.
>
> I personally feel that this paragraph would be better presented with pseudo-code.
>
> An encoder must not allow this to happen and must produce shortest possible encoding.
>
> Here too I believe there should be some simple pseudo-code to explain the canonical & optimal way to write the 32bits of the integer on at most five 7-bit blocks (even if that's obvious once you think about it).
>
> [VL] Deferred for WG discussion
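
A rough C sketch of both sides, in the spirit of the pseudo-code requested above; read_byte() and write_byte() are hypothetical I/O helpers, and the "leading 0x80 byte is invalid, at most five bytes" checks follow the UIntBase128 constraints in the spec:

    #include <stdbool.h>
    #include <stdint.h>

    extern uint8_t read_byte(void);       /* hypothetical input helper  */
    extern void write_byte(uint8_t b);    /* hypothetical output helper */

    /* Decode: value = value * 128 + (byte & 127) until the high bit is clear.
     * Returns false on a leading zero byte, 32-bit overflow, or > 5 bytes. */
    bool ReadUIntBase128(uint32_t *result) {
        uint32_t value = 0;
        for (int i = 0; i < 5; i++) {
            uint8_t byte = read_byte();
            if (i == 0 && byte == 0x80) return false;  /* leading zeros forbidden */
            if (value & 0xFE000000) return false;      /* value * 128 would overflow */
            value = (value << 7) | (byte & 0x7F);
            if ((byte & 0x80) == 0) {                  /* high bit clear: last byte */
                *result = value;
                return true;
            }
        }
        return false;                                  /* more than 5 bytes */
    }

    /* Encode: write the 7-bit groups most-significant first, dropping leading
     * all-zero groups, which yields the unique shortest (1- to 5-byte) form. */
    void WriteUIntBase128(uint32_t value) {
        int groups = 1;
        for (uint32_t v = value; v >= 128; v >>= 7) groups++;
        for (int i = groups - 1; i >= 0; i--) {
            uint8_t byte = (uint8_t)((value >> (7 * i)) & 0x7F);
            if (i > 0) byte |= 0x80;                   /* continuation bit */
            write_byte(byte);
        }
    }
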
> The interpretation of the WOFF2 Header is the same as the WOFF Header in [WOFF1].
>
> Maybe it should be highlighted that the only new field with respect to WOFF version 1 is totalCompressedSize?
>
> [VL] Accepted
>
> The font directory section consist of
>
> consists
>
> [VL] Fixed, thank you
>
> Whether a table tag is encoded with a known table tag or explicitly including the four-byte tag has no semantic significance; it is simply a choice of encoding intended to improve compression efficiency.
>
> The known table flag values should not be relied upon in determining the presence of the transformed tables, it is feasible that e.g. the glyf table can be represented in the table directory with either flag = 10 and no tag, or with flag = 63 and 'glyf' tag that follows.
>
> I wonder why the encoder is not forced to use the known flag when it is available for a table? Of course the size gain is negligible and the table directory is not compressed anyway, but that would be consistent with other places of the spec where it tries to get the optimal encoding when possible.
>
> [VL] Deferred for WG discussion
>
> if the table is transformed, then the version number of the applied transform is defined by the two most significatn flag bits
>
> significant
>
> [VL] Fixed, thank you
>
> The decompressed and reconstructed table data MUST be stored in the format specified by the [OFF] specification. Each reconstructed table directory entry MUST contain a valid 'checkSum' value, the decoder MUST recalculate the checkSum value for each decoded table. Also, due to modifying transforms applied to glyf and loca tables, the decoder MUST recalculate the checkSumAdjustmentglyf value of the entire font and MUST store the updated value in the head table.
>
> "checkSumAdjustmentglyf" should be "checkSumAdjustment", I guess.
>
> [VL] Fixed, thank you
>
> I find that this paragraph is a bit confusing. It seems to suggest that only the checkSumAdjustment may change and that it is only 'due to modifying transforms applied to glyf and loca tables'. At the same time it seems to ask the decoders to recalculate all the checksums without any verification for possible unchanged checksums.
>
> My understanding is that only transformed tables that can not be reconstructed identically to the original will have the checksum invalidated (i.e. 'glyf' and 'loca' but not 'hmtx' or other tables). And that in general the overall checkSumAdjustment will be invalidated. Is that correct?
>
> [VL] Yes, this is correct.
>
> The possible binary changes are better explained in the "The WOFF 2.0 transformations applied to certain tables...." paragraph below that one, maybe this section should be reordered a bit or this decoder+checksum paragraph should point to the paragraph below?
>
> Why isn't the decoder asked to verify the checksum that are guaranteed to be preserved and to just recalculate everything? Is it a requirement of the OFF specification when bit 11 of the 'flags' field of the head is set?
>
> [VL] Deferred for WG discussion
>
> The total number of bytes in bboxBitmap is equal to 4 * ((numGlyphs + 31) / 32).
>
> Again, C integer-division is implicit here. I would suggest writing "4 * ⌊ (numGlyphs + 31) / 32 ⌋" or even better if you want to enhance rendering in some web engines and assistive technologies:
>
> <math><semantics><mrow><mn>4</mn><mrow><mo>⌊</mo><mfrac><mrow><mtext>numGlyphs</mtext><mo>+</mo><mn>31</mn></mrow><mn>32</mn></mfrac><mo>⌋</mo></mrow></mrow><annotation encoding="text/plain">4 * ⌊ (numGlyphs + 31) / 32 ⌋</annotation></semantics></math>
>
> [VL] I modified the notation to reflect the nature of implicit integer division.
>
> Upon reading the Transformed glyf Table , the decoding process iterates one glyph at a time. For each glyph, it reads zero or more bytes from each of the streams referenced in the Transformed glyf Table .
>
> There are extra spaces before the comma and the period.
>
> [VL] Fixed, thank you
>
> FLAG_MORE_COMPONENTS bit (1 << 5), FLAG_WE_HAVE_INSTRUCTIONS bit (1 << 8)
>
> Note that the Open Font Format (and TTF and OpenType) specifications use MORE_COMPONENTS and WE_HAVE_INSTRUCTIONS (without the FLAG_ prefix). Again, C bit shifting operator is used without prior mention. As a comparison, the font specifications use "bit 0", "bit 1" etc
>
> [VL] Fixed, thank you
>
> If the hmtx table transform is both applicable and desired, the encoder MUST check that leftSideBearing values match the xMin values of the glyph bounding box for every glyph in a font file and, if the conditions are met for each of the proportional or monospaced glyph runs the encoder MUST set hmtx transform version number to "1", MUST eliminate the corresponding array from the hmtx table and MUST set the appropriate Flags bits.
>
> I guess "leftSideBearing" must be understood as lsb[] and/or leftsideBearing[] ; and that "desired" means "desired by the encoder".
>
> [VL] lsb[] and leftSideBearing[] here are the references to identically named arrays of the hmtx table (http://www.microsoft.com/typography/otspec/hmtx.htm). Desirability of transform will most likely be determined by the outside context, e.g. one may want to apply all applicable transforms when encoding a full font file but skip the transforms and improve performance if a small font subset is encoded as WOFF2 and optimizing already reduced tables doesn’t make much difference in terms of compression efficiency.
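
A minimal C sketch of the applicability check that paragraph describes: the hmtx transform is only usable when every glyph's left side bearing equals the xMin of its glyf bounding box. The glyph_count, lsb[] and xmin[] inputs are hypothetical, and how empty (zero-contour) glyphs, which have no real bounding box, should be treated is not shown here:

    #include <stdbool.h>
    #include <stdint.h>

    /* Sketch only: returns true when lsb[i] == xmin[i] for every glyph,
     * i.e. when the lsb array is redundant and may be eliminated by the
     * version 1 hmtx transform. */
    static bool hmtx_transform_applicable(int glyph_count,
                                          const int16_t *lsb,
                                          const int16_t *xmin) {
        for (int i = 0; i < glyph_count; i++) {
            if (lsb[i] != xmin[i]) {
                return false;   /* dropping lsb would lose information */
            }
        }
        return true;
    }
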
Received on Tuesday, 3 November 2015 16:32:52 UTC