Re: First cut of Huffman encoding in compression document.

I've closed the issue, though we can reopen it. Please check the literal
encoding section in the header compression document (on GitHub). I hope
it clarifies things quite a bit.


On Thu, Oct 17, 2013 at 2:15 PM, Fred Akalin <akalin@google.com> wrote:

> Initial comments:
>
> - I may be missing something, but I'm not sure why we need a string
> literal to be both length-delimited and have an end marker. I'd prefer just
> having the length and assigning the short encoding for EOS to something
> else.
>

The length is in bytes, but Huffman-encoded data is bit-based. So you either
need to represent the length in bits (which makes the lengths bigger), or you
need to ensure that the padding used to reach the next byte boundary at the
end of the string cannot be interpreted as a valid Huffman code. What is in
the document now is essentially what you've requested. EOS is a code that is
guaranteed to be 7 bits long or greater and has no semantic meaning.
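
To make the padding point concrete, here is a minimal sketch. The three-symbol
code table and the 8-bit EOS code are made up for illustration (they are not
the tables from the document), and padding with the leading bits of EOS is an
assumption about how the padding is chosen:

```python
# Toy prefix-free code (hypothetical; not the table from the draft).
CODES = {"a": "0", "b": "10", "c": "110"}
DECODE = {bits: sym for sym, bits in CODES.items()}
# Assumed EOS code: longer than any possible padding (0..7 bits), and no
# symbol code is a prefix of it.
EOS = "11111111"


def encode(text):
    """Huffman-encode text and pad to a byte boundary with EOS's leading bits."""
    bits = "".join(CODES[ch] for ch in text)
    pad = (-len(bits)) % 8          # 0..7 bits needed to reach a byte boundary
    bits += EOS[:pad]               # the padding is a strict prefix of EOS
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def decode(data):
    """Decode a byte-length-delimited string; trailing bits must be EOS padding."""
    bits = "".join(f"{byte:08b}" for byte in data)
    out, cur = [], ""
    for bit in bits:
        cur += bit
        if cur in DECODE:
            out.append(DECODE[cur])
            cur = ""
    # Whatever is left over must be a prefix of EOS, otherwise the input is bad.
    if not EOS.startswith(cur):
        raise ValueError("trailing bits are not valid EOS padding")
    return "".join(out)


padded_with_eos = encode("abc")            # bits 010110 + 11
padded_with_zeros = bytes([0b01011000])    # same bits, zero-padded instead
print(decode(padded_with_eos))             # abc
print(decode(padded_with_zeros))           # abcaa
```

With zero padding the decoder cannot tell the two pad bits from two extra
"a" symbols, because "0" is a valid code. With EOS-prefix padding the
leftover bits never complete a symbol, so a byte length is enough.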



> - Do we gain that much by having separate tables for request and response?
> I was looking forward to not having to make a distinction between
> request/response contexts since we now have a single static table, but this
> separation blocks that again.
>

The tables are fairly different. You may want to experiment to see whether it
is worth removing one. I still wonder whether it is worthwhile to have one
table for cookies and another for everything else (I never did get around to
testing that).
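
One rough way to run that experiment: build Huffman code lengths from byte
frequencies and compare the encoded size with per-direction tables against a
single shared table. The two sample strings below are placeholders, not real
traffic, and the helper names are made up for this sketch. You would want to
feed in a real header corpus before drawing any conclusion:

```python
import heapq
from collections import Counter
from itertools import count


def code_lengths(freq):
    """Return {symbol: Huffman code length in bits} for a frequency map."""
    if len(freq) == 1:
        return {sym: 1 for sym in freq}
    tie = count()  # tie-breaker so the heap never compares dicts
    heap = [(n, next(tie), {sym: 0}) for sym, n in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        n1, _, d1 = heapq.heappop(heap)
        n2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {sym: depth + 1 for sym, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (n1 + n2, next(tie), merged))
    return heap[0][2]


def encoded_bits(corpus, lengths):
    """Total bits needed to encode corpus with the given code lengths."""
    return sum(lengths[byte] for byte in corpus)


# Placeholder corpora; substitute real request and response header values.
requests = b"GET /index.html accept-encoding: gzip, deflate user-agent: x"
responses = b"200 content-type: text/html content-length: 1234 server: y"

separate = (encoded_bits(requests, code_lengths(Counter(requests)))
            + encoded_bits(responses, code_lengths(Counter(responses))))
shared_table = code_lengths(Counter(requests + responses))
shared = (encoded_bits(requests, shared_table)
          + encoded_bits(responses, shared_table))

print("separate tables:", separate, "bits")
print("shared table:   ", shared, "bits")
```

The shared table can never do better than the separate ones on the data they
were built from, so the interesting number is how large the gap is on real
traffic. The same comparison works for a cookies versus everything-else split.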


> - I can see it being useful to encode both the Huffman-encoded length and
> the original length of the string (or the delta between them), so that
> buffers can be sized just once.
>

That would bloat the on-the-wire format a fair bit. I'm hoping that, in many
cases, we end up storing the Huffman-encoded values instead of the actual
bytes in the future.

-=R

Received on Thursday, 17 October 2013 21:32:46 UTC