- From: Johnny Graettinger <jgraettinger@chromium.org>
- Date: Tue, 10 Jun 2014 15:46:40 -0400
- To: Martin Thomson <martin.thomson@gmail.com>
- Cc: Martin Nilsson <nilsson@opera.com>, HTTP Working Group <ietf-http-wg@w3.org>
- Message-ID: <CAEn92TqOSEJTjytRknS-=ZBE0_S-xNiVokG_=6ik3EXxAU+P=w@mail.gmail.com>
All,

As discussed during the interim, Chrome has been running a trial to estimate an optimal HPACK Huffman code. Briefly, the methodology is to maintain an LRU of HPACK delta encoders for visited origins, to process all request and response headers (including those delivered via HTTP/1) through an origin-specific encoder, and to aggregate character counts of the literals emitted by those encoders. This roughly approximates the behavior of an HPACK encoder if the origin had switched to long-lived HTTP/2 connections. The implementation is available for inspection [1], and I'm happy to answer questions.

Reviewing the last month of data, a Huffman code constructed for the count PDF compresses that same PDF about 2.1% better than the current code table, after adjusting the code to remove unique lengths (75.75% vs 77.87% of uncompressed size). The observed PDF, the constructed code, and a comparison with the current code are available at [2].

A caveat is that this trial is running only on our Canary & Dev channels, so it's possible the distribution is somewhat biased by those populations. Ideally it would include Stable channel samples as well, but that wasn't feasible in the time available. That said, this is still a large population, and having also looked at a couple of smaller windows, the distribution appears to have converged quickly and to be internally consistent over time.

I'm personally on the fence as to whether the degree of improvement warrants updating the current code, given the limitations of the trial. A reasonable position is that this simply provides confirmation that the current code is near-optimal.

[1] https://code.google.com/p/chromium/codesearch#chromium/src/net/spdy/hpack_huffman_aggregator.h&sq=package:chromium
[2] https://docs.google.com/a/chromium.org/spreadsheet/ccc?key=0Ao3snhnDWuTvdGJNckVoWGpMN0tobTRERmVLdV8zRWc#gid=0

cheers,
-johnny

On Tue, Jun 10, 2014 at 11:08 AM, Martin Thomson <martin.thomson@gmail.com> wrote:

> It costs nothing to disable Huffman coding, so the table really isn't the issue. The actual problem here is the grammar of the various header fields, and the potential need to translate into HTTP/1.1.
>
> On Jun 10, 2014 7:49 AM, "Martin Nilsson" <nilsson@opera.com> wrote:
>
>> Regarding the process of validating the current HPACK Huffman code against a large set of real headers, I think there is a risk that we'll paint ourselves into a corner dictated by what HTTP/1 looks like. As pointed out, a lot of base64- or hex-encoded headers benefit greatly from Huffman encoding. However, if we can carry binary data there is no point in having the data encoded in the first place, so it is not something to train the code table for. These headers will change over time, because even if they are taken into consideration for the Huffman table, it is still more space efficient not to encode them. The code lengths for the characters of the other headers might suffer, though.
>>
>> /Martin Nilsson
>>
>> --
>> Using Opera's revolutionary email client: http://www.opera.com/mail/
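As a rough illustration of the comparison described above (not the Chromium implementation), the sketch below builds a Huffman code from a character-count PDF and reports the compressed size of that PDF under the candidate code versus a set of current-table code lengths. The `observed_counts` and `current_hpack_lengths` values are placeholders standing in for the trial's aggregated literal counts and the draft's static-table lengths, and the canonicalization/unique-length adjustment mentioned in the message is omitted.

```python
import heapq
from collections import Counter
from itertools import count

def huffman_code_lengths(counts):
    """Build Huffman code lengths (bits per symbol) for symbols with non-zero counts.

    Plain Huffman construction; the real HPACK exercise also canonicalizes the
    code and removes unique lengths, which this sketch omits.
    """
    tiebreak = count()
    # Heap entries: (weight, tiebreak, {symbol: depth-so-far}).
    heap = [(c, next(tiebreak), {sym: 0}) for sym, c in counts.items() if c > 0]
    heapq.heapify(heap)
    if len(heap) == 1:
        return {sym: 1 for sym in heap[0][2]}
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)
        w2, _, b = heapq.heappop(heap)
        merged = {sym: depth + 1 for sym, depth in {**a, **b}.items()}
        heapq.heappush(heap, (w1 + w2, next(tiebreak), merged))
    return heap[0][2]

def compressed_fraction(counts, lengths):
    """Expected compressed size as a fraction of the uncompressed (8 bits/char) size."""
    total_bits = sum(counts[sym] * lengths[sym] for sym in counts)
    return total_bits / (8.0 * sum(counts.values()))

# Placeholder inputs: observed_counts stands in for the trial's aggregated literal
# character counts, and current_hpack_lengths for the bit lengths of the current
# static Huffman table (not reproduced here).
observed_counts = Counter(b"accept-encoding: gzip, deflate" * 100 + b"etag: \"abc123\"" * 20)
current_hpack_lengths = {sym: 8 for sym in observed_counts}  # flat 8-bit placeholder

candidate_lengths = huffman_code_lengths(observed_counts)
print("candidate code: %.2f%% of uncompressed"
      % (100 * compressed_fraction(observed_counts, candidate_lengths)))
print("current code:   %.2f%% of uncompressed"
      % (100 * compressed_fraction(observed_counts, current_hpack_lengths)))
```

Run against the real count PDF and the actual static-table lengths, the two printed fractions correspond to the 75.75% vs 77.87% comparison quoted in the message.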
Received on Tuesday, 10 June 2014 19:47:08 UTC