- From: Roberto Peon <grmocg@gmail.com>
- Date: Tue, 10 Jun 2014 14:02:29 -0700
- To: Mike Bishop <Michael.Bishop@microsoft.com>
- Cc: Martin Thomson <martin.thomson@gmail.com>, Johnny Graettinger <jgraettinger@chromium.org>, Martin Nilsson <nilsson@opera.com>, HTTP Working Group <ietf-http-wg@w3.org>
- Message-ID: <CAP+FsNfVTOySnPhVh98_18GLJghCnMEb1GAmq3Eq8obc4W8moA@mail.gmail.com>
IIRC, when I created the code to generate the current table, it was done
using the same methodology as Johnny's -- a character is counted if it
would have been emitted in a literal, and not otherwise.

-=R

On Tue, Jun 10, 2014 at 1:40 PM, Mike Bishop <Michael.Bishop@microsoft.com> wrote:
> Yes, thanks for this, Johnny!
>
> Given that the difference is slight (less than 3% better), we would
> support accepting this as confirmation that the current code is good and
> leave things as they are, unless broader deployment shows that the Huffman
> table is further off-base for common headers than this indicates.
>
> Martin, I'm going to throw out a hypothesis that probably can't be
> confirmed without changing the deployed code.... The original table was
> based on a sampling of HTTP headers, while this is based on a sample of
> *post-HPACK* literals. That means headers which get optimized out by HPACK
> (header table, reference set) don't get counted every time in Johnny's
> model, where they probably did in the original analysis. The less-steep
> highest-frequency characters (particularly 0/1) may be things that show up
> frequently but don't change from request to request over the lifetime of a
> connection.
>
> -----Original Message-----
> From: Martin Thomson [mailto:martin.thomson@gmail.com]
> Sent: Tuesday, June 10, 2014 1:28 PM
> To: Johnny Graettinger
> Cc: Martin Nilsson; HTTP Working Group
> Subject: Re: hpack huffman codes
>
> On 10 June 2014 12:46, Johnny Graettinger <jgraettinger@chromium.org> wrote:
> > I'm personally on the fence as to whether the degree of improvement
> > warrants updating the current code, given the limitations of the
> > trial. A reasonable position is that this simply provides confirmation
> > the current code is near-optimal.
>
> Thanks for doing this, Johnny. This would seem to validate the original
> table, though there are some interesting differences in lengths. It seems
> like the new numbers show less of a steep drop from the high-frequency
> 0/1/2 characters and a slightly more gradual roll-off in the mid range.
>
> My little Monte Carlo estimation rig seems to produce better results with
> common character set restrictions (ignore percentages):
>
> ~/code/bhpack master
> $ node compare.js 1000000 40
> 1000000 random sequences of lengths: 40
> Average sizes: min/ave/max (size compared against Base64+Huffman)
> Raw Huffman coding (invalid):  62/95.944842/123 (203.71%)
> Base 64 (no Huffman):          54/54/54 (114.65%)
> Base 64 with Huffman:          41/47.099388/54 (100%)
> bhpack cookie with Huffman:    40/46.334863/56 (98.38%)
> bhpack URI safe with Huffman:  40/45.126209/52 (95.81%)
> bhpack URI query with Huffman: 40/45.200022/53 (95.97%)
>
> ~/code/bhpack chromium-test
> $ node compare.js 1000000 40
> 1000000 random sequences of lengths: 40
> Average sizes: min/ave/max (size compared against Base64+Huffman)
> Raw Huffman coding (invalid):  60/91.417749/123 (209.74%)
> Base 64 (no Huffman):          54/54/54 (123.89%)
> Base 64 with Huffman:          40/43.587016/47 (100%)
> bhpack cookie with Huffman:    39/44.91345/56 (103.04%)
> bhpack URI safe with Huffman:  40/43.424513/48 (99.63%)
> bhpack URI query with Huffman: 39/43.268/49 (99.27%)
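[Editor's note: the discussion above hinges on counting character frequencies only for characters actually emitted as literals, then deriving Huffman code lengths from those counts. The following is a minimal, hypothetical sketch of that idea in Python -- it is not the HPACK table-generation code, and the sample string is invented purely for illustration.]

```python
import heapq
from collections import Counter

def huffman_code_lengths(freqs):
    """Derive Huffman code lengths (bits per symbol) from a frequency map."""
    # Heap entries: (weight, tiebreaker, {symbol: depth}); the integer
    # tiebreaker keeps heapq from ever comparing the dicts.
    heap = [(w, i, {sym: 0}) for i, (sym, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol one level deeper.
        merged = {s: d + 1 for s, d in d1.items()}
        merged.update({s: d + 1 for s, d in d2.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]

# Hypothetical post-HPACK literal sample: only characters that would
# actually be emitted as literals are counted.
sample = "/index.html?id=1234&session=abcd" * 10
lengths = huffman_code_lengths(Counter(sample))

raw_bits = len(sample) * 8
huff_bits = sum(lengths[c] for c in sample)
print(f"raw: {raw_bits} bits, huffman: {huff_bits} bits "
      f"({100 * huff_bits / raw_bits:.1f}%)")
```

Measuring bits-per-string this way against a Base64-then-Huffman baseline is roughly what the `compare.js` numbers above are doing, modulo the restricted-alphabet encodings.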
Received on Tuesday, 10 June 2014 21:02:56 UTC