Reference set in HPACK

Hi,

As you may remember, I implemented several HPACK *encoding* algorithms
and calculated their compression ratios. I have done this again based on
HPACK 08. I have 8 algorithms:

- Naive    -- No compression
- Naive-H  -- Using Huffman only
- Static   -- Using static table only
- Static-H -- Using static table and Huffman
- Linear   -- Using header table
- Linear-H -- Using header table and Huffman
- Diff     -- Using header table and reference set
- Diff-H   -- Using header table, reference set and Huffman
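To make the difference between the Linear and Diff families concrete, here is a toy sketch in Python. It is not my implementation and it is not the HPACK 08 wire format: the function names and the "emit"/"remove"/"add" operations are made up for illustration, and it assumes every header is already in the header table and that no header is repeated within a block. It only counts how many operations each strategy needs for one header block.

```python
def linear_ops(prev, curr):
    """Without a reference set: one operation per header, in the original order."""
    return [("emit", h) for h in curr]


def diff_ops(prev, curr):
    """With a reference set: describe only what changed since the previous block.

    Headers present in both blocks need no operation at all; in this toy
    model they are implied by the reference set.
    """
    prev_set, curr_set = set(prev), set(curr)
    removals = [("remove", h) for h in prev if h not in curr_set]
    additions = [("add", h) for h in curr if h not in prev_set]
    return removals + additions


# Two consecutive (hypothetical) header blocks that differ only in :path.
prev = [":method GET", ":path /", "accept */*"]
curr = [":method GET", ":path /style.css", "accept */*"]

print(len(linear_ops(prev, curr)))  # 3
print(len(diff_ops(prev, curr)))    # 2
```

In this model the Diff output carries no position for the unchanged headers, so a decoder cannot tell where they sat relative to the added ones.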

The implementations above pass all test cases in
https://github.com/http2jp/hpack-test-case/.  Using these test cases as
input, I calculated the compression ratios again. The ratio is calculated
by dividing the number of bytes after compression by the number of bytes
before compression.
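The calculation can be sketched in Python as follows. This is only an illustration: the function name and the sample sizes are hypothetical, and it assumes the byte counts are summed over all header blocks before dividing (rather than averaging per-block ratios).

```python
def compression_ratio(sizes):
    """Return total compressed bytes divided by total original bytes.

    sizes: iterable of (bytes_before, bytes_after) pairs, one per header block.
    Assumption: totals are summed first, then divided once.
    """
    sizes = list(sizes)
    total_before = sum(before for before, _ in sizes)
    total_after = sum(after for _, after in sizes)
    if total_before == 0:
        raise ValueError("no input bytes")
    return total_after / total_before


# Hypothetical sizes for two header blocks: (bytes before, bytes after).
sample = [(200, 62), (100, 31)]
print(round(compression_ratio(sample), 2))  # 0.31
```

With this definition a smaller value is better, and a value above 1.0 means the encoded form is larger than the input.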

Here are the results:

Naive     1.10
Naive-H   0.86
Static    0.84
Static-H  0.66
Linear    0.39 
Linear-H  0.31
Diff      0.39
Diff-H    0.31

Linear-H and Diff-H give almost the same results. By my calculation,
Diff-H is only 1.6 bytes shorter than Linear-H on average. This means
that the reference set does NOT contribute much to header compression,
although it is very difficult to implement.

I have NOT seen any header examples for which the reference set works
effectively so far.

So, if the authors of HPACK want to retain the reference set, I would like
to see evidence that there are some cases in which the reference set
improves the compression ratio. HPACK 08 says "Updated Huffman
table, using data set provided by Google". So, I guess that the
authors can calculate the compression ratio based on this data.

If there is no such evidence, I would like to strongly recommend
removing the reference set from HPACK. This would make HPACK much
simpler, so implementations would have fewer bugs and interoperability
would improve. Plus, the order of headers would always be preserved.

Regards,

--Kazu

Received on Wednesday, 2 July 2014 05:30:46 UTC