- From: Mark Nottingham <mnot@mnot.net>
- Date: Mon, 27 Oct 2014 13:48:31 -0700
- To: Willy Tarreau <w@1wt.eu>, HTTP Working Group <ietf-http-wg@w3.org>
Thanks for that, Willy. Asking the question directly — of the people who -1’d making changes for #578, have these numbers changed your mind? Cheers, > On 24 Oct 2014, at 11:56 am, Willy Tarreau <w@1wt.eu> wrote: > > Hi Mark, > > On Fri, Oct 24, 2014 at 09:33:12PM +1100, Mark Nottingham wrote: >> Toy up at: >> https://gist.github.com/mnot/434ab029a6e878b2af4c > > Thank you, I could use it. I noticed that the random names it produces > can sometimes be used as a custom fixed header, sometimes as a custom > random header. It's no big deal, but I think it does not accurately > model reality since we'd rather have some fixed headers (eg: customer > name) and some always random ones (eg: signature, timestamp). I also > thought that we could have a few partially random values (those who > change from time to time such as x-forwarded-for behind a proxy), but > I don't think it will change things a lot anyway. > > So in turn I have worked today :-) > > I implemented a simple encoder which parses your program's output and > emits statistics on the output data. It does not emit the output bytes, > it just performs the encoding and counts. It's almost nothing to add, > it is just that I had no use for the output. > > It supports 4 encodings : > - draft 09 > - the proposal I sent that was called "option 3" > - the proposed revision I sent just after it > - Greg's proposed revision > > It reports various statistics such as number of strings encoded, number > of integers encoded, average integer size etc... I have run some tests > all on the same output from your program, and got interesting findings > already : > > Draft-09 : > Total input bytes : 7455384 > Total output bytes : 2318395 (100%) > Overall compression ratio : 0.310969 (100%) > Total encoded integers: 218865 > Total encoded integers bytes: 295036 (100%) > Avg bytes per integers: 1.348027 (100%) > > option3 : > Total input bytes : 7455384 > Total output bytes : 2268350 (97.84%) > Overall compression ratio : 0.304257 (97.84%) > Total encoded integers: 218865 > Total encoded integers bytes: 244991 (83.03%) > Avg bytes per integers: 1.119370 (83.03%) > > revised option3 : > Total input bytes : 7455384 > Total output bytes : 2264722 (97.68%) > Overall compression ratio : 0.303770 (97.68%) > Total encoded integers: 218865 > Total encoded integers bytes: 241363 (81.81%) > Avg bytes per integers: 1.102794 (81.81%) > > Greg's revision : > Total input bytes : 7455384 > Total output bytes : 2280713 (98.37%) > Overall compression ratio : 0.305915 (98.37%) > Total encoded integers: 218865 > Total encoded integers bytes: 257354 (87.23%) > Avg bytes per integers: 1.175857 (87.23%) > > First, the overall compression ratio is never exceptional given that the > input contains a significant amount of random data, so that's expected. > Second, we observe that the integer encoding is 17-18% smaller compared > to draft-09. And if we consider the integer encoding's overhead, then it > is even divided by 3.4 (0.34 byte to 0.10 byte per integer). > > The overall savings are 2.1% for "option 3", 2.3% for its revision, and > 1.7% for Greg's proposal. To my initial surprise, Greg's proposal provides > less savings here despite being balanced. But in the end there's a reason, > it offers more bits to literals while it's the case where we already have > to pay for the literal overhead so the occasional saving of 1-byte doesn't > save much. > > I have experimented with an option in the code to write fully random headers > as literal-without-indexing (as the producer would do, but not a gateway > which doesn't know which ones are stable and which ones are not). And while > doing so improves the compression ratio, the offset from draft-09 and the > other ones does not change. > > I have not yet tried to modify your program to vary the output between a > browser (less custom) or a partner site (more custom). But I wanted to share > these results already as I think they can be helpful. > > All the code is available here : > > https://github.com/wtarreau/http2-exp > > The readme is ugly when parsed as md, I've never written md docs so it > seems I'm lacking some basic practice here. But I'm sure nobody will > care, reading it in the console or as raw is OK. > > Ah, there's also a debug mode which indicates what encoding is chosen for > each field and how long the resulting sequence is. It helped me debug it, > and I found it useful to understand how the table evolves. > > Comments welcome. It's my first HPACK encoder, it's very possible that I > messed up a lot with certain things, though I didn't notice that. In any > case, feel free to comment/fork/fix/etc. > > Best regards, > Willy > -- Mark Nottingham http://www.mnot.net/
Received on Monday, 27 October 2014 20:48:57 UTC