Re: Reference set in HPACK

Another advantage of removing the reference set is that the order of header
fields is preserved.
This removes the need for the following rule:
- section-8.1.2.3
      To preserve the order of multiple occurrences of a header field with
      the same name, its ordered values are concatenated into a single
      value using a zero-valued octet (0x0) to delimit them.

      After decompression, header fields that have values containing zero
      octets (0x0) MUST be split into multiple header fields before being
      processed.

The Cookie header would then preserve the order of its key-value pairs even
when they are split for "better compression".  This might matter for request
signing.
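
As a rough illustration of what the quoted rule asks of implementations, here
is a minimal Python sketch of the join and split steps. The function names and
the "x-example" header are made up for illustration and are not taken from any
HPACK library or from the draft itself. If header order is preserved on the
wire, neither step would be needed.

```python
# Sketch of the zero-octet rule quoted above (section 8.1.2.3).
# Function names and the header name are hypothetical.

def join_values(values):
    """Concatenate the ordered values of one header name, delimited by 0x0."""
    return b"\x00".join(values)


def split_values(name, value):
    """After decompression, split a 0x0-delimited value into separate fields."""
    return [(name, part) for part in value.split(b"\x00")]


joined = join_values([b"first", b"second", b"third"])
fields = split_values(b"x-example", joined)
```

A value that contains no zero octet comes back from split_values as a single
field, so the split can be applied to every header uniformly.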

--
Kaoru Maeda


2014-07-02 19:52 GMT+09:00 Michael Sweet <msweet@apple.com>:

> Roberto,
>
> On Jul 2, 2014, at 1:39 AM, Roberto Peon <grmocg@gmail.com> wrote:
>
> You're basing conclusions on today's data, instead of looking forward as
> to what might happen when the set of headers sent adapts to the compression
> method, making it significantly more likely for items in the reference set
> to be emitted.
>
>
> Isn't that basically confirming what Kazu found: the reference set doesn't
> help with today's headers?
>
> Here is running code that demonstrates that the reference set does not
> contribute significantly to the performance of HPACK. Unless you can
> demonstrate a significant improvement from (simple) server/client changes,
> your assertion that things will improve doesn't have any evidence to
> support it.
>
> My observation is that the headers emitted by most web sites are not
> controlled by the web site developer; developers rely on the underlying web
> server and scripting engine (PHP, Perl, Python, Ruby, etc.) to do that.
> The only header they generally do control is Set-Cookie, and then only for
> their own site (i.e. not for the advertising networks that are used).  What
> changes on the server side would be useful here to get the full benefit of
> the reference set?
>
> (And IMHO if we do have this information then it should be in the HPACK
> spec...)
>
>
>
>
> You may want to look at how many of those entries would be regularized if
> HPACK was in use and servers/clients were intent on sending headers that
> were similar.
> -=R
>
>
> On Tue, Jul 1, 2014 at 10:30 PM, Kazu Yamamoto <kazu@iij.ad.jp> wrote:
>
>> Hi,
>>
>> As you may remember, I implemented several HPACK *encoding* algorithms
>> and calculated their compression ratios. I tried this again based on
>> HPACK 08. I have 8 algorithms:
>>
>> - Naive    -- No compression
>> - Naive-H  -- Using Huffman only
>> - Static   -- Using static table only
>> - Static-H -- Using static table and Huffman
>> - Linear   -- Using header table
>> - Linear-H -- Using header table and Huffman
>> - Diff     -- Using header table and reference set
>> - Diff-H   -- Using header table, reference set and Huffman
>>
>> The implementations above pass all test cases in
>> https://github.com/http2jp/hpack-test-case/.  Using these test cases as
>> input, I calculated the compression ratio again. The ratio is calculated
>> by dividing the number of bytes after compression by the number before
>> compression.
>>
>> Here are the results:
>>
>> Naive     1.10
>> Naive-H   0.86
>> Static    0.84
>> Static-H  0.66
>> Linear    0.39
>> Linear-H  0.31
>> Diff      0.39
>> Diff-H    0.31
>>
>> Linear-H and Diff-H give almost the same result. By my calculation,
>> Diff-H is only 1.6 bytes shorter than Linear-H on average. This means
>> that the reference set does NOT contribute much to header compression,
>> although it is very difficult to implement.
>>
>> So far I have NOT seen any header examples for which the reference set
>> works effectively.
>>
>> So, if the authors of HPACK want to retain the reference set, I would
>> like to see evidence that there are some cases in which the reference set
>> improves the compression ratio. HPACK 08 says "Updated Huffman
>> table, using data set provided by Google". So, I guess that the
>> authors can calculate the compression ratio based on this data.
>>
>> If there is no such evidence, I would strongly recommend removing the
>> reference set from HPACK. This makes HPACK much simpler, so
>> implementations have fewer bugs and interoperability improves. Plus,
>> the order of headers is always preserved.
>>
>> Regards,
>>
>> --Kazu
>>
>>
>>
>>
>>
>>
>
> _________________________________________________________
> Michael Sweet, Senior Printing System Engineer, PWG Chair
>
>
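
For reference, the ratio defined in Kazu's quoted message (bytes after
compression divided by bytes before) can be sketched in Python as follows.
The function name and the byte counts are illustrative assumptions and are
not taken from his implementation or his data.

```python
def compression_ratio(bytes_after, bytes_before):
    """Compressed size divided by original size; smaller is better."""
    if bytes_before <= 0:
        raise ValueError("bytes_before must be positive")
    return bytes_after / bytes_before


# Hypothetical byte counts, chosen only to exercise the function.
ratio = compression_ratio(310, 1000)
expanded = compression_ratio(1100, 1000)
```

With this definition, a value above 1.0 (such as the 1.10 reported for Naive)
means the encoded output is larger than the input.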

Received on Wednesday, 2 July 2014 13:53:18 UTC