Re: delta compression I-D 01 from Willy Tarreau on 2013-03-17 (ietf-http-wg@w3.org from January to March 2013)

From: Willy Tarreau <w@1wt.eu>
Date: Sun, 17 Mar 2013 08:17:44 +0100
To: Roberto Peon <grmocg@gmail.com>
Cc: HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <20130317071744.GG9060@1wt.eu>

Hi Roberto,

On Wed, Mar 13, 2013 at 12:28:53PM -0700, Roberto Peon wrote:
> I posted an update to the delta header compression I-D yesterday (but
> failed to put the tags on it that automatically marked it for HTTPbis.
> *sigh*).
> 
> http://datatracker.ietf.org/doc/draft-rpeon-httpbis-header-compression/

Great work, really !

I noticed this :
                  ('authorizations', ''),

I expect that you wanted to match the Authorization header instead, so
the trailing "s" should be removed.

I also have a few comments :

1) it's unclear to me how we can preserve the ordering of values when
   several key-value pairs share the same key. I mean, let's say we first
   have this :

   X-List: value1, value2, value3

Which is equivalent to :

   X-List: value1
   X-List: value2
   X-List: value3

It is then encoded as :

   [ ("X-List", "value1"),
     ("X-List", "value2"),
     ("X-List", "value3") ]

and stored as is on the recipient.

Then the list is changed to become :

   X-List: value4, value2, value3

So I guess the sender emits stoggl("X-List", "value1") then
skvsto("X-List", "value4"). But then I fear that X-List once
decoded will look like this :

   X-List: value2, value3, value4

which is different. Think about X-Forwarded-For across multiple
reverse-proxies where only the first value changes since it holds
the client's IP address. I think I'm missing something. Or perhaps
we should add opcodes to replace a specific key-value pair. 
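
To make the concern concrete, here is a minimal sketch in Python. It is
not the draft's decoder: it assumes the stored header group is an ordered
list of pairs, that stoggl removes a pair which is already present, and
that skvsto appends the new pair at the end. The helpers apply_ops() and
join_header() are hypothetical names used only for this illustration.

```python
def apply_ops(stored, ops):
    """Apply (opcode, key, value) operations to an ordered list of pairs."""
    pairs = list(stored)
    for opcode, key, value in ops:
        if opcode == "stoggl":
            # Assumption: toggling a pair that is present turns it off,
            # toggling an absent pair turns it on at the end of the list.
            if (key, value) in pairs:
                pairs.remove((key, value))
            else:
                pairs.append((key, value))
        elif opcode == "skvsto":
            # Assumption: a newly stored pair is appended at the end.
            pairs.append((key, value))
        else:
            raise ValueError("unknown opcode: %r" % (opcode,))
    return pairs


def join_header(pairs, name):
    """Fold all values of one header name into a comma-separated list."""
    return ", ".join(v for k, v in pairs if k == name)


stored = [("X-List", "value1"), ("X-List", "value2"), ("X-List", "value3")]
ops = [("stoggl", "X-List", "value1"), ("skvsto", "X-List", "value4")]

decoded = apply_ops(stored, ops)
print("X-List: " + join_header(decoded, "X-List"))
```

Under these assumptions value4 lands after value3 instead of taking the
place of value1, which is exactly the reordering described above.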

2) I'm still seeing lots of zeroes in the output stream so I think
   we could compress it a bit better. From what I'm seeing in the
   pseudo-code, these two bytes are responsible for these "holes" :

       opcode = header_block.read_uint8()
       num_fields = header_block.read_uint8()

Thus I suggest that we could send a single byte with (opcode << 4) +
num_fields when num_fields < 15 and that we only use the additional
byte for values 15 and above. It could then be done like this :

       byte       = header_block.read_uint8();
       opcode     = byte >> 4;
       num_fields = byte & 0xF;
       if (num_fields == 0xF)
           num_fields += header_block.read_uint8();

This way we would drop from two bytes to one per opcode in most cases.
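
As a rough sketch of both directions of this packing, assuming that
opcodes fit in 4 bits and that num_fields never exceeds 15 + 255 = 270,
something like the following Python could be used. The names pack_op()
and unpack_op() are hypothetical and do not come from the draft.

```python
import io


def pack_op(opcode, num_fields):
    """Encode opcode and num_fields in one byte, or two when num_fields >= 15."""
    if not 0 <= opcode <= 0xF:
        raise ValueError("opcode must fit in 4 bits")
    if not 0 <= num_fields <= 0xF + 0xFF:
        raise ValueError("num_fields out of range for this encoding")
    if num_fields < 0xF:
        return bytes([(opcode << 4) | num_fields])
    # 0xF in the low nibble means "an extension byte follows".
    return bytes([(opcode << 4) | 0xF, num_fields - 0xF])


def unpack_op(stream):
    """Decode one (opcode, num_fields) pair from a binary stream."""
    byte = stream.read(1)[0]
    opcode = byte >> 4
    num_fields = byte & 0xF
    if num_fields == 0xF:
        num_fields += stream.read(1)[0]
    return opcode, num_fields


short = pack_op(3, 5)      # fits in a single byte
long_ = pack_op(3, 20)     # needs the extension byte
print(len(short), unpack_op(io.BytesIO(short)))
print(len(long_), unpack_op(io.BytesIO(long_)))
```

The price of the shorter common case is that the count is capped at 270
unless the extension byte is itself made extensible.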

Last point, I like the static huffman table. It should allow most
decoders to automatically generate code to decode it without any
loop and produce an optimal branch prediction pattern for the CPU.

Cheers,
Willy

Received on Sunday, 17 March 2013 07:18:17 UTC