Re: Integer Representation in header-compression-draft-03 from Roberto Peon on 2013-10-17 (ietf-http-wg@w3.org from October to December 2013)

From: Roberto Peon <grmocg@gmail.com>
Date: Thu, 17 Oct 2013 16:58:50 -0700
To: Mike Bishop <Michael.Bishop@microsoft.com>, Patrick McManus <pmcmanus@mozilla.com>
Cc: "Kulkarni, Saurabh" <sakulkar@akamai.com>, HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <CAP+FsNexcMQ1oipVqD9_4EUJuJsavWfJDk2vq_xERac09pdWDw@mail.gmail.com>
+Patrick so hopefully he notices this.

I tried your suggestion, but found it jarring :/
I stuck more explanation in the integer encoding section, which now reads
(I've italicized and made bold the additions):

Integers are used to represent name indexes, pair indexes or string
lengths. To allow for optimized processing, an integer representation
always finishes at the end of a byte.

An integer is represented in two parts: a prefix that fills the current
byte and an optional list of bytes that are used if the integer value does
not fit within the prefix. The number of bits of the prefix (called N) is a
parameter of the integer representation.

The N-bit prefix allows filling the current byte. If the value is small
enough (strictly less than 2^N-1), it is encoded within the N-bit prefix.
Otherwise all the bits of the prefix are set to 1 and the value is encoded
using an unsigned variable length integer
<http://en.wikipedia.org/wiki/Variable-length_quantity>
representation. *N is always between 1 and 8 bits. An integer starting at a
byte boundary will have an 8-bit prefix.*

The algorithm to represent an integer I is as follows:

...
How does that look?
-=R
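
A rough Python sketch of the encoding described above, for anyone following
along. It is an illustration, not spec text. The name encode_int and its
signature are made up here, and the flag bits that may share the first byte are
assumed to be zero. Two further assumptions: the continuation bytes carry
I - (2^N - 1), as the later HPACK specification (RFC 7541) does, and they hold
7 value bits each, least significant group first, with the high bit set on
every byte except the last (this matches the 159 example quoted below). N=0 is
accepted only to show the alternative reading from the thread.

```python
def encode_int(value: int, n: int) -> bytes:
    """Encode a non-negative integer with an n-bit prefix (hypothetical helper).

    n = 1..8 follows the clarified draft text; n = 0 models the
    "zero-bit prefix" reading, where no prefix byte is emitted at all.
    """
    if value < 0 or not 0 <= n <= 8:
        raise ValueError("value must be >= 0 and n must be in 0..8")

    limit = (1 << n) - 1          # 2^N - 1, i.e. all prefix bits set
    out = bytearray()

    if n > 0:
        if value < limit:
            return bytes([value])  # small value: fits in the prefix
        out.append(limit)          # prefix saturated: all N bits set to 1

    # Assumption: the remainder after subtracting 2^N - 1 is what gets
    # written as a little-endian base-128 variable-length integer.
    value -= limit
    while value >= 128:
        out.append(value % 128 + 128)  # 7 value bits + continuation bit
        value //= 128
    out.append(value)                  # last byte, continuation bit clear
    return bytes(out)


if __name__ == "__main__":
    print(list(encode_int(159, 8)))  # [159]      one byte with N=8
    print(list(encode_int(159, 0)))  # [159, 1]   two bytes with N=0
    print(list(encode_int(255, 8)))  # [255, 0]   first value that spills with N=8
```

With N=8 every value up to 254 fits in one byte; with N=0 only 0-127 do, which
is the codespace point made in the quoted message below.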


On Thu, Oct 17, 2013 at 4:49 PM, Mike Bishop
<Michael.Bishop@microsoft.com> wrote:

> I agree – an 8-bit prefix allows for more values to be in a single byte,
> so I’m not at all opposed to writing it in; we just need to be explicit.
>
> Looking back at -03, 4.3.3 explicitly calls out a byte-aligned integer as
> being a “0-bit” prefix.  No other byte-aligned integer specifies a prefix
> length, hence my assumption (and presumably Patrick’s).  That section has
> been removed in the current draft, since it’s the definition of
> substitution, so we don’t have to worry about reconciling it.  It would be
> good to explicitly state 8-bit prefix anywhere we reference a byte-aligned
> integer; 4.1.2 #1 is the only one I see off-hand.
>
> *From:* Roberto Peon [mailto:grmocg@gmail.com]
> *Sent:* Thursday, October 17, 2013 4:45 PM
> *To:* Mike Bishop
> *Cc:* Kulkarni, Saurabh; HTTP Working Group
> *Subject:* Re: Integer Representation in header-compression-draft-03
>
> I've integrated Fred's suggestion into the github spec version (i.e. N is
> always between 1 and 8)
>
> Mike-- any suggestions on further clarification?
>
> (imho, it is suboptimal to assume N=0, as you lose 127 points of codespace
> instead of only one.)
>
> -=R
>
> On Thu, Oct 17, 2013 at 4:41 PM, Mike Bishop <Michael.Bishop@microsoft.com>
> wrote:
>
> Looks like a difference in interpretation that needs to be clarified,
> because Firefox looks exactly correct to me.
>
> I had interpreted a field being “8+” bits long as a zero-bit prefix
> integer (i.e. N=0, so the partial byte is absent, and you always have at
> least one byte, which can represent numbers 0-127). Certain instances
> explicitly call out zero-bit prefixes on byte boundaries, so I assumed they
> all were. The spec needs to be consistent about whether integers starting
> on a byte boundary have an eight-bit or a zero-bit prefix, and an example
> would be good for this.
>
> With a zero-bit prefix, that’s the correct encoding for 159.  159 is
> 0b10011111.  You only get seven bits of value in the first byte because one
> is reserved for the continuation – which just happens to be the same bit
> that would be set if representing 159 on eight bits.  So the first byte is
> 0b10011111, followed by a second byte with the extra bit, 0b00000001.
>
> *From:* Roberto Peon [mailto:grmocg@gmail.com]
> *Sent:* Thursday, October 17, 2013 4:37 PM
> *To:* Kulkarni, Saurabh
> *Cc:* HTTP Working Group
> *Subject:* Re: Integer Representation in header-compression-draft-03
>
> Saurabh--
>
> Thanks for this.
>
> It looks like Firefox is getting this wrong, per my interpretation of what
> is supposed to happen here.
>
> Indeed, though poorly specified, the intent is that for the name-length and
> value-list-length fields, N is 8, since there are 8 bits available for the
> length up to the next byte boundary, and so any value under 0xFF is (or
> should be) encodable on that byte.
>
> -=R
>
> On Thu, Oct 17, 2013 at 4:23 PM, Kulkarni, Saurabh <sakulkar@akamai.com>
> wrote:
>
> I was debugging my server (Akamai Ghost) with Firefox nightly for
> draft-06 and noticed a discrepancy in the way integer values are being
> represented in header compression. I shot an individual mail to Patrick
> just in case this is a false alarm, or people talked about this offline.
>
> So header-compression-draft-03 says:
>
> "The N-bit prefix allows filling the current byte. If the value is
> small enough (strictly less than 2^N-1), it is encoded within the
> N-bit prefix. Otherwise all the bits of the prefix are set to 1 and
> the value is encoded using an unsigned variable length integer [1]
> representation."
>
> For representing lengths of header values, draft-03 says it's 8+, meaning
> N=8, which means values < 255 can be encoded in 1 byte. But since the
> algorithm uses the MSB to signal whether to consume the next byte, N
> would need to be 7. This is potentially confusing. I encountered this
> issue when I received a cookie value of length 159, which can potentially
> be encoded in 1 or 2 bytes (which is true of all values > 128 and < 255).
>
> Firefox encoded this as: 159 = \159\001, but it can also be encoded as
> just \159.
>
> Please clarify the text in the draft, because +/- 1 byte can throw off the
> compressor completely for the subsequent values.
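
To make the one-byte concern concrete, here is a rough decoder sketch under
the two readings in this thread. The name decode_int is made up for
illustration. It assumes the continuation bytes carry I - (2^N - 1), as the
later HPACK specification (RFC 7541) does, with 7 value bits per byte, least
significant group first, and the high bit set on every byte except the last.
N=0 models the "zero-bit prefix" reading, where there is no prefix byte.

```python
def decode_int(data: bytes, n: int, pos: int = 0):
    """Decode an integer with an n-bit prefix; return (value, next_pos).

    Hypothetical helper: n = 1..8 is the prefix reading, n = 0 the
    "zero-bit prefix" reading. Raises IndexError on truncated input.
    """
    if not 0 <= n <= 8:
        raise ValueError("n must be in 0..8")

    limit = (1 << n) - 1           # 2^N - 1, all prefix bits set
    value = 0

    if n > 0:
        value = data[pos] & limit  # ignore any flag bits above the prefix
        pos += 1
        if value < limit:
            return value, pos      # value fit entirely within the prefix

    # Prefix saturated (or absent): add up 7-bit groups, low group first.
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value += (byte & 0x7F) << shift
        shift += 7
        if (byte & 0x80) == 0:     # continuation bit clear: last byte
            return value, pos


if __name__ == "__main__":
    wire = bytes([159, 1])         # the two bytes Firefox sent for 159
    print(decode_int(wire, 8))     # (159, 1): the 0x01 byte is left unread
    print(decode_int(wire, 0))     # (159, 2): both bytes consumed
```

Both readings recover 159 here, but they disagree on how many bytes the length
occupies, so a receiver using N=8 would treat the leftover 0x01 as the first
byte of the cookie value and every later field would be shifted by one.
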
>
> Thanks,
>
> Saurabh
Received on Thursday, 17 October 2013 23:59:18 UTC