JavaScript header compressor/decompressor updated to HPACK-03 from Fred Akalin on 2013-09-13 (ietf-http-wg@w3.org from July to September 2013)

From: Fred Akalin <akalin@google.com>
Date: Fri, 13 Sep 2013 14:51:10 -0700
To: HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <CANUYc_R4Sx6D6GHGEz26TjRGUHptcbzdwqCYozZCPFOc24WwpQ@mail.gmail.com>

Hey all,

I updated
http://akalin-chromium.github.io/httpbis-header-compression/compressor_test.html
to implement the HPACK-03 draft. In particular, I tried to make it as
complete an implementation as possible, and I added copious comments and
references to the spec to make it easy to validate and understand.

The only thing I didn't implement is UTF-8 validation for header values.
Hopefully, the need for that will go away.

Some thoughts:

- There aren't any tests. I wanted to see how correct I could make the
implementation without them (which will be measured when the compliance
suite comes out). I'm sure there are bugs.

- I didn't try very hard to make the encoder smart, but I did try to make
it exercise all the opcodes.

- I found it quite helpful that the encoding context was precisely defined
(as a header table plus the reference set). However, I ultimately found it
better to encode the reference set as part of the header table (by having a
bit per entry) instead of having a separate data structure, since it
eliminates a bunch of logic to keep the indices in the two in sync. This
may have been obvious to some people, but not to me. I wonder if it's in
the scope of the spec to suggest this.

- I also found it helpful to have a 'touch' flag per entry since
encoding/decoding requires processing of the untouched subset of the
reference set.

- For encoding I also needed to keep track of the number of touches
(representing the number of times the entry would be explicitly emitted),
and I needed to make a distinction between an untouched entry (no touch
count at all) and an entry with 0 touches (representing an implicit
emission). This is to support duplicate headers, which was tricky to get
right.

- It would be nice to have explicit bounds for encoded integers, string
lengths, header lengths, etc. I didn't try to make the encoder/decoder
streaming, since that would complicate the implementation, but it seems
difficult to guarantee memory bounds without the above explicit bounds.

- It would be nice to clarify the behavior when the max header table size
is reduced. I just implemented popping from the front until the new bound
is satisfied.

- I didn't find the need to encode index vs. index + 1 too confusing this
time around. I feel like making the header table start at 1 would simply
move the off-by-one bugs someplace else. I don't feel too strongly about
this, though.
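
To make the points above about the reference set, the touch counts, and
shrinking the table more concrete, here is a minimal sketch. It is not the
code from the linked page. The names (HeaderTable, referenced, touchCount,
markImplicit, markExplicit, setMaxSize) are hypothetical, the 32-byte
per-entry overhead is an assumption to check against the draft, and using
null for "untouched" is just one possible way to separate it from 0 touches.

```javascript
// Hypothetical sketch; names are illustrative, not from the linked implementation.
const ENTRY_OVERHEAD = 32; // assumed per-entry overhead in bytes

class HeaderTable {
  constructor(maxSize) {
    this.maxSize = maxSize;
    this.size = 0;
    this.entries = []; // index 0 is the oldest entry (the "front")
  }

  // Assumes ASCII names and values, so string length equals byte length.
  static entrySize(name, value) {
    return name.length + value.length + ENTRY_OVERHEAD;
  }

  // Pops entries from the front until the table size is at most `limit`.
  evictUntil(limit) {
    while (this.entries.length > 0 && this.size > limit) {
      const e = this.entries.shift();
      this.size -= HeaderTable.entrySize(e.name, e.value);
    }
  }

  // Appends an entry, evicting older ones first. Returns the new index,
  // or -1 if the entry cannot fit even in an empty table.
  add(name, value) {
    const needed = HeaderTable.entrySize(name, value);
    this.evictUntil(this.maxSize - needed);
    if (needed > this.maxSize) return -1;
    // The reference-set bit and the touch count live on the entry itself,
    // so eviction never leaves a separate index structure out of sync.
    this.entries.push({ name, value, referenced: false, touchCount: null });
    this.size += needed;
    return this.entries.length - 1;
  }

  // Lowering the maximum size evicts from the front until the bound holds.
  setMaxSize(newMax) {
    this.maxSize = newMax;
    this.evictUntil(newMax);
  }

  setReferenced(index, flag) {
    this.entries[index].referenced = flag;
  }

  // Implicit emission: touched, but with zero explicit emissions.
  markImplicit(index) {
    const e = this.entries[index];
    if (e.touchCount === null) e.touchCount = 0;
  }

  // Explicit emission: counts one more time the entry is emitted.
  markExplicit(index) {
    const e = this.entries[index];
    e.touchCount = (e.touchCount === null ? 0 : e.touchCount) + 1;
  }

  // Indices of reference-set entries not touched in the current header set.
  untouchedReferenced() {
    const result = [];
    this.entries.forEach((e, i) => {
      if (e.referenced && e.touchCount === null) result.push(i);
    });
    return result;
  }

  // Resets touch state once a header set has been fully processed.
  clearTouches() {
    for (const e of this.entries) e.touchCount = null;
  }
}
```

Because the flag and the count travel with the entry, popping from the
front renumbers the indices without any extra bookkeeping.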
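
On the point about explicit bounds for encoded integers, here is a rough
sketch of what a bounded decoder could look like. It assumes the
prefix-coded integer layout (an N-bit prefix, then 7-bit groups, least
significant first, with a continuation bit), which should be checked
against the draft text. The function name, the maxValue parameter, and the
cap on continuation bytes are hypothetical and do not come from the spec or
from the linked page.

```javascript
// Hypothetical sketch of bounded prefix-integer decoding.
// bytes: array of byte values; offset: index of the byte holding the prefix;
// prefixBits: number of low-order bits used by the prefix (1..8);
// maxValue: caller-chosen upper bound (assumed to fit in 32 bits).
function decodeInteger(bytes, offset, prefixBits, maxValue) {
  if (offset >= bytes.length) throw new Error("truncated integer");
  const prefixMax = (1 << prefixBits) - 1;
  let value = bytes[offset] & prefixMax;
  let next = offset + 1;
  // Values below the prefix maximum fit entirely in the prefix.
  if (value < prefixMax) {
    if (value > maxValue) throw new Error("integer exceeds bound");
    return { value, next };
  }
  let shift = 0;
  for (;;) {
    if (next >= bytes.length) throw new Error("truncated integer");
    // Cap the number of continuation bytes so padding with 0x80 bytes
    // cannot make the decoder loop or overflow.
    if (shift > 28) throw new Error("integer encoding too long");
    const b = bytes[next++];
    value += (b & 0x7f) * Math.pow(2, shift);
    if (value > maxValue) throw new Error("integer exceeds bound");
    shift += 7;
    // A clear high bit marks the last byte of the integer.
    if ((b & 0x80) === 0) return { value, next };
  }
}
```

With a check like this, a decoder can reject an oversized string length
before allocating anything for it.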

Comments, pull requests, etc. welcome!

-- Fred

Received on Friday, 13 September 2013 21:51:37 UTC