Re: Integer Representation in header-compression-draft-03 from Fred Akalin on 2013-10-18 (ietf-http-wg@w3.org from October to December 2013)

From: Fred Akalin <akalin@google.com>
Date: Thu, 17 Oct 2013 17:17:55 -0700
To: "Kulkarni, Saurabh" <sakulkar@akamai.com>
Cc: Mike Bishop <Michael.Bishop@microsoft.com>, Roberto Peon <grmocg@gmail.com>, Patrick McManus <pmcmanus@mozilla.com>, HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <CANUYc_QOnk59ctYYekqS=K91_m=wZv9Dawvpip37y9jQmuMDTg@mail.gmail.com>
N only applies to the first (possibly partial) byte, which is treated
specially (since it has a different continuation pattern). It would be
incorrect to have an "N" for subsequent bytes.
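
As an illustration of the point above, here is a minimal Python sketch of
the scheme as it is described in this thread. It is only a sketch under
assumptions: the function names (encode_tail, encode_integer,
decode_integer) are hypothetical and do not come from the draft, and the
first byte is assumed to carry nothing but the prefix bits. N is used once,
for the first byte; every continuation byte carries 7 value bits plus a
"more follows" flag in its MSB.

```python
def encode_tail(value):
    """Continuation bytes: 7 value bits each, MSB set while more bytes follow."""
    out = []
    while value >= 128:
        out.append((value % 128) | 0x80)
        value //= 128
    out.append(value)
    return out


def encode_integer(value, n):
    """Encode a non-negative integer with an n-bit prefix (1 <= n <= 8)."""
    if not 1 <= n <= 8:
        raise ValueError("prefix length N must be between 1 and 8")
    limit = (1 << n) - 1  # 2^N - 1, i.e. all prefix bits set
    if value < limit:
        return bytes([value])  # fits entirely within the prefix
    # Prefix is saturated; the remainder goes into the continuation bytes.
    return bytes([limit] + encode_tail(value - limit))


def decode_integer(data, n):
    """Return (value, bytes_consumed). Assumes data is not truncated."""
    limit = (1 << n) - 1
    value = data[0] & limit
    pos = 1
    if value < limit:
        return value, pos
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value += (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:  # MSB clear: this was the last byte
            return value, pos


print(encode_integer(159, 8).hex())       # 9f
print(bytes(encode_tail(159)).hex())      # 9f01 (no prefix byte at all)
print(decode_integer(b"\x9f\x01", 8))     # (159, 1)
```

With N=8, the length 159 from this thread fits in the single byte 0x9f. A
receiver that assumes N=8 and is handed the two bytes 0x9f 0x01 stops after
the first byte and treats 0x01 as the start of the next field, which is the
kind of one-byte misalignment reported below.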


On Thu, Oct 17, 2013 at 5:14 PM, Kulkarni, Saurabh <sakulkar@akamai.com> wrote:

> Note that even though N=8 is good, do we need to explicitly mention that
> the subsequent bytes are N=7? In essence, we are using the MSB to signal
> whether to continue into the next byte. Not sure whether that will make it
> clearer or more confusing, though.
>
> - Saurabh
>
> From: Mike Bishop <Michael.Bishop@microsoft.com>
> Date: Thursday, October 17, 2013 5:03 PM
> To: Roberto Peon <grmocg@gmail.com>, Patrick McManus <pmcmanus@mozilla.com>
> Cc: Saurabh Kulkarni <sakulkar@akamai.com>, HTTP Working Group <ietf-http-wg@w3.org>
> Subject: RE: Integer Representation in header-compression-draft-03
>
> That works, though I think “8-bit” should be hyphenated.  Thanks for the
> quick turnaround!
>
> *From:* Roberto Peon [mailto:grmocg@gmail.com]
> *Sent:* Thursday, October 17, 2013 4:59 PM
> *To:* Mike Bishop; Patrick McManus
> *Cc:* Kulkarni, Saurabh; HTTP Working Group
> *Subject:* Re: Integer Representation in header-compression-draft-03
>
> +Patrick so hopefully he notices this.
>
> I tried your suggestion, but found it jarring :/
>
> I stuck more explanation in the integer encoding section, which now reads
> (I've italicized and made bold the additions):
>
> Integers are used to represent name indexes, pair indexes or string
> lengths. To allow for optimized processing, an integer representation
> always finishes at the end of a byte.
>
> An integer is represented in two parts: a prefix that fills the current
> byte and an optional list of bytes that are used if the integer value does
> not fit within the prefix. The number of bits of the prefix (called N) is a
> parameter of the integer representation.
>
> The N-bit prefix allows filling the current byte. If the value is small
> enough (strictly less than 2^N-1), it is encoded within the N-bit prefix.
> Otherwise all the bits of the prefix are set to 1 and the value is encoded
> using an unsigned variable length integer
> <http://en.wikipedia.org/wiki/Variable-length_quantity> representation.
> *N is always between 1 and 8 bits. An integer starting at a byte-boundary
> will have an 8 bit prefix.*
>
> The algorithm to represent an integer I is as follows:
>
> ...
>
> How does that look?
>
> -=R
>
> On Thu, Oct 17, 2013 at 4:49 PM, Mike Bishop <Michael.Bishop@microsoft.com>
> wrote:
>
> I agree – an 8-bit prefix allows for more values to be in a single byte,
> so I’m not at all opposed to writing it in; we just need to be explicit.
>
> Looking back at -03, 4.3.3 explicitly calls out a byte-aligned integer as
> being a “0-bit” prefix.  No other byte-aligned integer specifies a prefix
> length, hence my assumption (and presumably Patrick’s).  That section has
> been removed in the current draft, since it’s the definition of
> substitution, so we don’t have to worry about reconciling it.  It would be
> good to explicitly state 8-bit prefix anywhere we reference a byte-aligned
> integer; 4.1.2 #1 is the only one I see off-hand.
>
> *From:* Roberto Peon [mailto:grmocg@gmail.com]
> *Sent:* Thursday, October 17, 2013 4:45 PM
> *To:* Mike Bishop
> *Cc:* Kulkarni, Saurabh; HTTP Working Group
> *Subject:* Re: Integer Representation in header-compression-draft-03
>
> I've integrated Fred's suggestion into the GitHub spec version (i.e. N is
> always between 1 and 8).
>
> Mike-- any suggestions on further clarification?
>
> (imho, it is suboptimal to assume N=0, as you lose 127 points of codespace
> instead of only one.)
>
> -=R
>
> On Thu, Oct 17, 2013 at 4:41 PM, Mike Bishop <Michael.Bishop@microsoft.com>
> wrote:
>
> Looks like an interpretational difference that needs to be clarified,
> because Firefox looks exactly correct to me.
>
> I had interpreted a field being “8+” bits long as a zero-bit prefix
> integer (i.e. N=0, so the partial byte is absent, and you always have at
> least one byte, which can represent numbers 0-127).  Certain instances
> explicitly call out zero-bit prefixes on byte boundaries, so I assumed they
> all were.  The spec needs to be consistent about whether integers starting
> on a byte boundary have an eight-bit or a zero-bit prefix, and an example
> would be good for this.
>
> With a zero-bit prefix, that’s the correct encoding for 159.  159 is
> 0b10011111.  You only get seven bits of value in the first byte because one
> is reserved for the continuation – which just happens to be the same bit
> that would be set if representing 159 on eight bits.  So the first byte is
> 0b10011111, followed by a second byte with the extra bit, 0b00000001.
>
> *From:* Roberto Peon [mailto:grmocg@gmail.com]
> *Sent:* Thursday, October 17, 2013 4:37 PM
> *To:* Kulkarni, Saurabh
> *Cc:* HTTP Working Group
> *Subject:* Re: Integer Representation in header-compression-draft-03
>
> Saurabh--
>
> Thanks for this.
>
> It looks like Firefox is getting this wrong, per my interpretation of what
> is supposed to happen here.
>
> Indeed, though poorly specified, the intent is that, for the name-length
> and value-list-length fields, N is 8, since there are 8 bits available for
> the length up to the next byte boundary, and so any value under 0xFF is (or
> should be) encodable in that byte.
>
> -=R
>
> On Thu, Oct 17, 2013 at 4:23 PM, Kulkarni, Saurabh <sakulkar@akamai.com>
> wrote:
>
> I was debugging my server (Akamai Ghost) with Firefox nightly for draft-06
> and noticed a discrepancy with the way integer values are being represented
> in header compression. I shot an individual mail to Patrick just in case
> this is a false alarm, or people talked about this offline.
>
> So header-compression-draft-03 says:
>
> "The N-bit prefix allows filling the current byte. If the value is
>  small enough (strictly less than 2^N-1), it is encoded within the
>  N-bit prefix. Otherwise all the bits of the prefix are set to 1 and
>  the value is encoded using an unsigned variable length integer [1]
>  representation."
>
> For representing lengths of header values, draft-03 says it's 8+, meaning
> N=8, which means that values < 255 can be encoded in 1 byte. But since
> the algorithm uses the MSB to signal whether to consume the next byte,
> from then on N needs to be 7. This is potentially confusing. I encountered
> this issue when I received a cookie value of length 159, which can
> potentially be encoded as either 1 or 2 bytes (which is true of all values
> >= 128 and < 255).
>
> Firefox encoded this as: 159 = \159\001, but it can also be encoded as
> just \159.
>
> Please clarify the text in the draft, because +/- 1 byte can throw off the
> compressor completely for the subsequent values.
>
> Thanks,
>
> Saurabh
>
Received on Friday, 18 October 2013 00:18:23 UTC