Re: Integer Representation in header-compression-draft-03 from Roberto Peon on 2013-10-18 (ietf-http-wg@w3.org from October to December 2013)

From: Roberto Peon <grmocg@gmail.com>
Date: Thu, 17 Oct 2013 17:26:30 -0700
To: Fred Akalin <akalin@google.com>
Cc: "Kulkarni, Saurabh" <sakulkar@akamai.com>, Mike Bishop <Michael.Bishop@microsoft.com>, Patrick McManus <pmcmanus@mozilla.com>, HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <CAP+FsNcXnhCtCg86+5NS_sPj+7DDEpOqwhBG7PJFZH2WDnRrdw@mail.gmail.com>
I think the spec is clear about the subsequent bytes, both by specifying
that they're var-int encoded, and via the pseudocode.

-=R
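
For reference, a minimal Python sketch of the encoding under discussion, assuming the draft's pseudocode is the usual "fill the N-bit prefix, then base-128 continuation bytes" scheme. The function names are hypothetical (not from the spec), and the bits above the N-bit prefix in the first byte are assumed to be zero.

```python
from typing import Tuple


def encode_varint(value: int) -> bytes:
    """Base-128 encoding: low 7 bits first, MSB set on every byte but the last."""
    out = bytearray()
    while value >= 128:
        out.append((value % 128) + 128)  # 7 value bits plus continuation bit
        value //= 128
    out.append(value)  # final byte, continuation bit clear
    return bytes(out)


def encode_prefixed(value: int, n: int) -> bytes:
    """Encode value with an N-bit prefix (1 <= n <= 8), upper bits assumed zero."""
    if not 1 <= n <= 8:
        raise ValueError("N must be between 1 and 8")
    limit = (1 << n) - 1  # 2^N - 1, i.e. all prefix bits set
    if value < limit:
        return bytes([value])  # fits in the prefix alone
    # Prefix saturated: the remainder follows as a varint.
    return bytes([limit]) + encode_varint(value - limit)


def decode_prefixed(data: bytes, n: int) -> Tuple[int, int]:
    """Return (value, bytes_consumed) for an N-bit-prefix integer."""
    limit = (1 << n) - 1
    value = data[0] & limit
    if value < limit:
        return value, 1
    shift = 0
    pos = 1
    while True:
        byte = data[pos]
        value += (byte & 0x7F) << shift
        pos += 1
        if byte < 128:  # continuation bit clear: last byte
            return value, pos
        shift += 7


# 159 with an 8-bit prefix fits in one byte (the N=8 reading).
print(encode_prefixed(159, 8).hex())  # 9f
# 159 as a bare varint (the "zero-bit prefix" reading) takes two bytes.
print(encode_varint(159).hex())       # 9f01
```

Both readings produce the same first byte (0x9f, decimal 159) for this value, so a decoder using the other interpretation ends up one byte out of step rather than failing immediately.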


On Thu, Oct 17, 2013 at 5:17 PM, Fred Akalin <akalin@google.com> wrote:

> N only applies to the first (possibly partial) byte, which is treated
> specially (since it has a different continuation pattern). It would be
> incorrect to have an "N" for subsequent bytes.
>
>
> On Thu, Oct 17, 2013 at 5:14 PM, Kulkarni, Saurabh <sakulkar@akamai.com> wrote:
>
>> Note that even though N=8 is good, do we need to explicitly mention that
>> the subsequent bytes are N=7? In essence we are using the MSB to signal
>> whether to continue to the next byte. Not sure whether that will make it
>> clearer or more confusing, though.
>>
>> - Saurabh
>>
>> From: Mike Bishop <Michael.Bishop@microsoft.com>
>> Date: Thursday, October 17, 2013 5:03 PM
>> To: Roberto Peon <grmocg@gmail.com>, Patrick McManus <
>> pmcmanus@mozilla.com>
>> Cc: Saurabh Kulkarni <sakulkar@akamai.com>, HTTP Working Group <
>> ietf-http-wg@w3.org>
>> Subject: RE: Integer Representation in header-compression-draft-03
>>
>> That works, though I think “8-bit” should be hyphenated.  Thanks for the
>> quick turnaround!
>>
>> *From:* Roberto Peon [mailto:grmocg@gmail.com]
>> *Sent:* Thursday, October 17, 2013 4:59 PM
>> *To:* Mike Bishop; Patrick McManus
>> *Cc:* Kulkarni, Saurabh; HTTP Working Group
>> *Subject:* Re: Integer Representation in header-compression-draft-03
>>
>> +Patrick so hopefully he notices this.
>>
>> I tried your suggestion, but found it jarring :/
>>
>> I stuck more explanation in the integer encoding section, which now reads
>> (I've italicized and made bold the additions):
>>
>> Integers are used to represent name indexes, pair indexes or string
>> lengths. To allow for optimized processing, an integer representation
>> always finishes at the end of a byte.
>>
>> An integer is represented in two parts: a prefix that fills the current
>> byte and an optional list of bytes that are used if the integer value does
>> not fit within the prefix. The number of bits of the prefix (called N) is a
>> parameter of the integer representation.
>>
>> The N-bit prefix allows filling the current byte. If the value is small
>> enough (strictly less than 2^N-1), it is encoded within the N-bit prefix.
>> Otherwise all the bits of the prefix are set to 1 and the value is encoded
>> using an *unsigned variable length integer* <http://en.wikipedia.org/wiki/Variable-length_quantity>
>> representation. *N is always between 1 and 8 bits. An integer starting
>> at a byte-boundary will have an 8 bit prefix.*
>>
>> The algorithm to represent an integer I is as follows:
>>
>> ...
>>
>> How does that look?
>>
>> -=R
>>
>> On Thu, Oct 17, 2013 at 4:49 PM, Mike Bishop <
>> Michael.Bishop@microsoft.com> wrote:
>>
>> I agree – an 8-bit prefix allows for more values to be in a single byte,
>> so I’m not at all opposed to writing it in; we just need to be explicit.
>>
>> Looking back at -03, 4.3.3 explicitly calls out a byte-aligned integer as
>> being a “0-bit” prefix.  No other byte-aligned integer specifies a prefix
>> length, hence my assumption (and presumably Patrick’s).  That section has
>> been removed in the current draft, since it’s the definition of
>> substitution, so we don’t have to worry about reconciling it.  It would be
>> good to explicitly state 8-bit prefix anywhere we reference a byte-aligned
>> integer; 4.1.2 #1 is the only one I see off-hand.
>>
>> *From:* Roberto Peon [mailto:grmocg@gmail.com]
>> *Sent:* Thursday, October 17, 2013 4:45 PM
>> *To:* Mike Bishop
>> *Cc:* Kulkarni, Saurabh; HTTP Working Group
>> *Subject:* Re: Integer Representation in header-compression-draft-03
>>
>> I've integrated Fred's suggestion into the github spec version (i.e. N is
>> always between 1 and 8)
>>
>> Mike-- any suggestions on further clarification?
>>
>> (imho, it is suboptimal to assume N=0, as you lose 127 points of
>> codespace instead of only one.)
>>
>> -=R
>>
>> On Thu, Oct 17, 2013 at 4:41 PM, Mike Bishop <
>> Michael.Bishop@microsoft.com> wrote:
>>
>> Looks like an interpretational difference that needs to be clarified,
>> because Firefox looks exactly correct to me.
>>
>> I had interpreted a field being “8+” bits long as a zero-bit prefix
>> integer.  (i.e. N=0, so the partial byte is absent, and you always have at
>> least one byte which can represent numbers 0-127)  Certain instances
>> explicitly call out zero-bit prefixes on byte boundaries, so I assumed they
>> all were.  The spec needs to be consistent about whether integers starting
>> on a byte boundary have an eight-bit or a zero-bit prefix, and an example
>> would be good for this.
>>
>> With a zero-bit prefix, that’s the correct encoding for 159.  159 is
>> 0b10011111.  You only get seven bits of value in the first byte because one
>> is reserved for the continuation – which just happens to be the same bit
>> that would be set if representing 159 on eight bits.  So the first byte is
>> 0b10011111, followed by a second byte with the extra bit, 0b00000001.
>>
>> *From:* Roberto Peon [mailto:grmocg@gmail.com]
>> *Sent:* Thursday, October 17, 2013 4:37 PM
>> *To:* Kulkarni, Saurabh
>> *Cc:* HTTP Working Group
>> *Subject:* Re: Integer Representation in header-compression-draft-03
>>
>> Saurabh--
>>
>> Thanks for this.
>>
>> It looks like Firefox is getting this wrong, per my interpretation of
>> what is supposed to happen here.
>>
>> Indeed, though poorly specified, the intent is that for the name-length and
>> value-list-length fields, N is 8, since there are 8 bits available for the
>> length up to the next byte boundary, and so any value under 0xFF is (or
>> should be) encodable on that byte.
>>
>> -=R
>>
>> On Thu, Oct 17, 2013 at 4:23 PM, Kulkarni, Saurabh <sakulkar@akamai.com>
>> wrote:
>>
>> I was debugging my server (Akamai Ghost) with Firefox nightly for
>> draft-06 and noticed a discrepancy with the way integer values are being
>> represented in header compression. I shot an individual mail to Patrick
>> just in case this is a false alarm, or people talked about this offline.
>>
>> So header-compression-draft-03 says:
>>
>> "The N-bit prefix allows filling the current byte. If the value is
>>  small enough (strictly less than 2^N-1), it is encoded within the
>>  N-bit prefix. Otherwise all the bits of the prefix are set to 1 and
>>  the value is encoded using an unsigned variable length integer [1]
>>  representation."
>>
>> For representing lengths of header values, draft-03 says it's 8+,
>> meaning N=8, which means values < 255 can be encoded in 1 byte. But
>> since the algorithm uses the MSB to signal whether to consume the next
>> byte, N would need to be 7. This is potentially confusing. I
>> encountered this issue when I received a cookie value of length 159, which
>> can potentially be encoded as either 1 or 2 bytes (which is true of all
>> values >= 128 and < 255).
>>
>> Firefox encoded this as: 159 = \159\001, but it can also be encoded as
>> just \159.
>>
>> Please clarify the text in the draft, because +/- 1 byte can throw off
>> the compressor completely for the subsequent values.
>>
>> Thanks,
>> Saurabh
>>
>
>
Received on Friday, 18 October 2013 00:26:58 UTC