RE: Integer Representation in header-compression-draft-03

From: Mike Bishop <Michael.Bishop@microsoft.com>
Date: Thu, 17 Oct 2013 23:49:17 +0000
To: Roberto Peon <grmocg@gmail.com>
CC: "Kulkarni, Saurabh" <sakulkar@akamai.com>, HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <be20bf3adc3949f98bb9fffcf2e4c0ee@BY2PR03MB025.namprd03.prod.outlook.com>
I agree - an 8-bit prefix allows for more values to be in a single byte, so I'm not at all opposed to writing it in; we just need to be explicit.

Looking back at -03, 4.3.3 explicitly calls out a byte-aligned integer as being a "0-bit" prefix.  No other byte-aligned integer specifies a prefix length, hence my assumption (and presumably Patrick's).  That section has been removed in the current draft, since it's the definition of substitution, so we don't have to worry about reconciling it.  It would be good to explicitly state 8-bit prefix anywhere we reference a byte-aligned integer; 4.1.2 #1 is the only one I see off-hand.

I've integrated Fred's suggestion into the GitHub spec version (i.e. N is always between 1 and 8).

Mike-- any suggestions on further clarification?

(imho, it is suboptimal to assume N=0, as you lose 127 points of codespace instead of only one.)

Looks like an interpretational difference that needs to be clarified, because Firefox looks exactly correct to me.

I had interpreted a field described as "8+" bits long to be a zero-bit prefix integer.  (i.e. N=0, so the partial byte is absent, and you always have at least one byte, which can represent numbers 0-127.)  Certain instances explicitly call out zero-bit prefixes on byte boundaries, so I assumed they all were.  The spec needs to be consistent about whether integers starting on a byte boundary have an eight-bit or a zero-bit prefix, and an example would be good for this.

With a zero-bit prefix, that's the correct encoding for 159.  159 is 0b10011111.  You only get seven bits of value in the first byte because one is reserved for the continuation - which just happens to be the same bit that would be set if representing 159 on eight bits.  So the first byte is 0b10011111, followed by a second byte with the extra bit, 0b00000001.
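The zero-bit-prefix reading above can be checked with a short sketch. This is illustrative only: the function name encode_varint is hypothetical, and it assumes the variable-length integer puts the least significant 7-bit group first, with the top bit of each byte as the continuation flag, as the paragraph above describes.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer with a zero-bit prefix (assumed layout):
    7 value bits per byte, least significant group first,
    top bit set when another byte follows."""
    if value < 0:
        raise ValueError("value must be non-negative")
    out = bytearray()
    while value >= 0x80:
        # Emit the low 7 bits with the continuation bit set.
        out.append(0x80 | (value & 0x7F))
        value >>= 7
    # Final byte: continuation bit clear.
    out.append(value)
    return bytes(out)


encoded_159 = encode_varint(159)
print([format(b, "08b") for b in encoded_159])  # ['10011111', '00000001']
```

Under this reading, only 0-127 fit in a single byte; 128 and above always take at least two.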

Thanks for this.
It looks like Firefox is getting this wrong, per my interpretation of what is supposed to happen here.
Indeed, though it is poorly specified, the intent is that for the name-length and value-list-length fields, N is 8, since there are 8 bits available for the length up to the next byte boundary, and so any value under 0xFF is (or should be) encodable on that byte.
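A minimal sketch of the N-bit-prefix reading, for comparison. The name encode_prefixed is hypothetical. Two things here are assumptions not stated in this thread: that the prefix occupies the low N bits of the first byte with any remaining high bits left as zero, and that when the prefix overflows, the amount encoded in the continuation bytes is the value minus 2^N-1, which is my understanding of the draft's algorithm.

```python
def encode_prefixed(value: int, n: int) -> bytes:
    """Encode a non-negative integer with an n-bit prefix (1 <= n <= 8).
    Assumes the prefix sits in the low n bits of the first byte and
    any remaining high bits of that byte are zero."""
    if not 1 <= n <= 8:
        raise ValueError("n must be between 1 and 8")
    if value < 0:
        raise ValueError("value must be non-negative")
    limit = (1 << n) - 1          # all prefix bits set: the escape value
    if value < limit:
        return bytes([value])     # fits entirely in the prefix
    out = bytearray([limit])
    value -= limit                # assumption: the remainder is what gets encoded
    while value >= 0x80:
        out.append(0x80 | (value & 0x7F))
        value >>= 7
    out.append(value)
    return bytes(out)


print(encode_prefixed(159, 8).hex())  # 9f
print(encode_prefixed(254, 8).hex())  # fe
print(encode_prefixed(255, 8).hex())  # ff00
```

With N=8, every value below 0xFF takes one byte, and 0xFF itself is the only single-byte code point given up as the escape.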


I was debugging my server (Akamai Ghost) with Firefox nightly for draft-06 and noticed a discrepancy with the way integer values are being represented in header compression. I shot an individual mail to Patrick just in case this is a false alarm, or people talked about this offline.

So header-compression-draft-03 says:
"The N-bit prefix allows filling the current byte. If the value is
 small enough (strictly less than 2^N-1), it is encoded within the
 N-bit prefix. Otherwise all the bits of the prefix are set to 1 and
 the value is encoded using an unsigned variable length integer [1]

For representing lengths of header values, draft-03 says the field is "8+", meaning N=8, which means values < 255 can be encoded in 1 byte. But since the algorithm uses the MSB to signal whether to consume the next byte, N would need to be 7. This is potentially confusing. I encountered this issue when I received a cookie value of length 159, which can potentially be encoded in either 1 or 2 bytes (which is true of all values >= 128 and < 255).

Firefox encoded this as: 159 = \159\001, but it can also be encoded as just \159.

Please clarify the text in the draft, because +/- 1 byte can throw off the compressor completely for the subsequent values.
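To illustrate the off-by-one concern, here is a small sketch that decodes the same two bytes (\159\001) under both interpretations. The function names are hypothetical, and the 8-bit-prefix decoder assumes the escape value 0xFF is followed by a variable-length integer whose value is added to 0xFF. Both decoders report 159, but they disagree on how many bytes belong to the length field, so everything after it is read from the wrong offset.

```python
def decode_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Zero-bit-prefix reading: 7 value bits per byte, low group first,
    top bit means another byte follows. Returns (value, next position)."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return value, pos


def decode_prefix8(buf: bytes, pos: int) -> tuple[int, int]:
    """8-bit-prefix reading: any first byte below 0xFF is the value itself."""
    first = buf[pos]
    pos += 1
    if first < 0xFF:
        return first, pos
    # Assumption: the continuation bytes carry (value - 0xFF).
    extra, pos = decode_varint(buf, pos)
    return 0xFF + extra, pos


wire = bytes([0x9F, 0x01])        # \159\001 as seen on the wire
print(decode_varint(wire, 0))     # (159, 2): both bytes are the length
print(decode_prefix8(wire, 0))    # (159, 1): \001 is left over as payload
```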

