RE: #578: getting real-ish numbers for option 3

+1 here, as well.  Interesting, but going from ~40% efficiency to ~36% efficiency doesn't feel like it's really moving the needle enough to respin.

-----Original Message-----
From: Kulkarni, Saurabh [mailto:sakulkar@akamai.com] 
Sent: Tuesday, October 28, 2014 10:22 AM
To: Nicholas Hurley; Mark Nottingham; Willy Tarreau; HTTP Working Group
Subject: Re: #578: getting real-ish numbers for option 3

+1 to Nick's comment. Not interesting enough to warrant another
implementation draft.

- Saurabh

On 10/27/14, 5:05 PM, "Nicholas Hurley" <hurley@todesschaf.org> wrote:

>I'll chime in. Short answer - no, these have not changed my mind at all.
>If anything, they've reinforced my view.
>
>These numbers are, once again, along the same lines of other changes 
>we've already rejected as a working group - relatively minuscule 
>improvements that are not worth invalidating testing we have already 
>done. If this change improved the compression ratio significantly then 
>we might have something worth investigating in another implementation 
>draft. These numbers don't indicate any significant improvement in 
>compression ratio, and so don't warrant any further investigation.
>
>On Mon, Oct 27, 2014, at 13:48, Mark Nottingham wrote:
>> Thanks for that, Willy.
>> 
>> Asking the question directly - of the people who -1'd making changes 
>> for #578, have these numbers changed your mind?
>> 
>> Cheers,
>> 
>> 
>> > On 24 Oct 2014, at 11:56 am, Willy Tarreau <w@1wt.eu> wrote:
>> > 
>> > Hi Mark,
>> > 
>> > On Fri, Oct 24, 2014 at 09:33:12PM +1100, Mark Nottingham wrote:
>> >> Toy up at:
>> >>  https://gist.github.com/mnot/434ab029a6e878b2af4c
>> > 
>> > Thank you, I could use it. I noticed that the random names it 
>> > produces can sometimes be used as a custom fixed header, sometimes 
>> > as a custom random header. It's no big deal, but I think it does 
>> > not accurately model reality since we'd rather have some fixed 
>> > headers (eg: customer name) and some always random ones (eg: 
>> > signature, timestamp). I also thought that we could have a few 
>> > partially random values (those which change from time to time such 
>> > as x-forwarded-for behind a proxy), but I don't think it will 
>> > change things a lot anyway.
>> > 
>> > So in turn I have worked today :-)
>> > 
>> > I implemented a simple encoder which parses your program's output 
>> > and emits statistics on the output data. It does not emit the 
>> > output bytes, it just performs the encoding and counts. It's almost 
>> > nothing to add, it is just that I had no use for the output.
>> > 
>> > It supports 4 encodings :
>> >  - draft 09
>> >  - the proposal I sent that was called "option 3"
>> >  - the proposed revision I sent just after it
>> >  - Greg's proposed revision
>> > 
>> > It reports various statistics such as number of strings encoded,
>> > number of integers encoded, average integer size etc... I have run 
>> > some tests all on the same output from your program, and got 
>> > interesting findings already :
>> > 
>> > Draft-09 :
>> >        Total input bytes : 7455384
>> >        Total output bytes : 2318395            (100%)
>> >        Overall compression ratio : 0.310969    (100%)
>> >        Total encoded integers: 218865
>> >        Total encoded integers bytes: 295036    (100%)
>> >        Avg bytes per integers: 1.348027        (100%)
>> > 
>> > option3 :
>> >        Total input bytes : 7455384
>> >        Total output bytes : 2268350            (97.84%)
>> >        Overall compression ratio : 0.304257    (97.84%)
>> >        Total encoded integers: 218865
>> >        Total encoded integers bytes: 244991    (83.03%)
>> >        Avg bytes per integers: 1.119370        (83.03%)
>> > 
>> > revised option3 :
>> >        Total input bytes : 7455384
>> >        Total output bytes : 2264722            (97.68%)
>> >        Overall compression ratio : 0.303770    (97.68%)
>> >        Total encoded integers: 218865
>> >        Total encoded integers bytes: 241363    (81.81%)
>> >        Avg bytes per integers: 1.102794        (81.81%)
>> > 
>> > Greg's revision :
>> >        Total input bytes : 7455384
>> >        Total output bytes : 2280713            (98.37%)
>> >        Overall compression ratio : 0.305915    (98.37%)
>> >        Total encoded integers: 218865
>> >        Total encoded integers bytes: 257354    (87.23%)
>> >        Avg bytes per integers: 1.175857        (87.23%)
>> > 
>> > First, the overall compression ratio is never exceptional given 
>> > that the input contains a significant amount of random data, so 
>> > that's expected. Second, we observe that the integer encoding is 
>> > 17-18% smaller compared to draft-09. And if we consider the integer 
>> > encoding's overhead, then it is even divided by 3.4 (0.34 byte to 
>> > 0.10 byte per integer).
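
[Illustration] For readers who want to see where the "bytes per integer" figure comes from, here is a minimal sketch of the prefix-integer encoding HPACK uses, as later published in RFC 7541, Section 5.1. It is not taken from Willy's encoder. The function name `encode_prefix_int` is made up for this example, and the sketch only assumes that a narrower prefix pushes more values into a second byte; it does not model the exact prefix widths of "option 3" or of the other proposals.

```python
def encode_prefix_int(value: int, prefix_bits: int) -> bytes:
    """Encode a non-negative integer with an N-bit prefix (RFC 7541, 5.1).

    Only the integer's own bits are produced; the flag bits that share
    the first byte with the prefix are left at zero in this sketch.
    """
    if value < 0 or not 1 <= prefix_bits <= 8:
        raise ValueError("value must be >= 0 and prefix_bits in 1..8")

    limit = (1 << prefix_bits) - 1      # largest value the prefix can hold
    if value < limit:
        return bytes([value])           # fits entirely in the prefix

    out = bytearray([limit])            # prefix saturated: all ones
    value -= limit
    while value >= 128:
        out.append((value % 128) + 128) # 7 payload bits + continuation bit
        value //= 128
    out.append(value)                   # last byte, continuation bit clear
    return bytes(out)


# The same index costs one byte with a 6-bit prefix, two with a 4-bit one.
one_byte = encode_prefix_int(62, 6)
two_bytes = encode_prefix_int(62, 4)
```

The "overhead" discussed above is then simply the average encoded length minus the one byte every integer needs anyway.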
>> > 
>> > The overall savings are 2.1% for "option 3", 2.3% for its revision,
>> > and 1.7% for Greg's proposal. To my initial surprise, Greg's 
>> > proposal provides less savings here despite being balanced. But in 
>> > the end there's a reason, it offers more bits to literals while 
>> > it's the case where we already have to pay for the literal overhead 
>> > so the occasional saving of 1-byte doesn't save much.
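
[Illustration] The percentages quoted above can be re-derived from the raw counters in the four result blocks. The short sketch below does only that arithmetic. The dictionary layout and variable names are made up for this example, and the numbers are copied verbatim from the reported output.

```python
# Counters copied from the four result blocks above.
TOTAL_INPUT_BYTES = 7455384
TOTAL_INTEGERS = 218865

# name -> (total output bytes, total encoded integer bytes)
results = {
    "draft-09":        (2318395, 295036),
    "option3":         (2268350, 244991),
    "revised option3": (2264722, 241363),
    "greg":            (2280713, 257354),
}

base_out, base_int = results["draft-09"]

ratio = {}         # output bytes / input bytes
savings = {}       # fraction of output bytes saved relative to draft-09
int_overhead = {}  # average bytes per integer beyond the 1-byte minimum

for name, (out_bytes, int_bytes) in results.items():
    ratio[name] = out_bytes / TOTAL_INPUT_BYTES
    savings[name] = 1.0 - out_bytes / base_out
    int_overhead[name] = int_bytes / TOTAL_INTEGERS - 1.0

for name in results:
    print(f"{name:16s} ratio={ratio[name]:.6f} "
          f"savings={savings[name] * 100:.2f}% "
          f"overhead={int_overhead[name]:.3f} B/int")
```

Computed this way, the "divided by 3.4" overhead figure corresponds to the revised option 3 measured against draft-09.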
>> > 
>> > I have experimented with an option in the code to write fully 
>> > random headers as literal-without-indexing (as the producer would 
>> > do, but not a gateway which doesn't know which ones are stable and 
>> > which ones are not). And while doing so improves the compression 
>> > ratio, the offset from draft-09 and the other ones does not change.
>> > 
>> > I have not yet tried to modify your program to vary the output
>> > between a browser (less custom) or a partner site (more custom). 
>> > But I wanted to share these results already as I think they can be 
>> > helpful.
>> > 
>> > All the code is available here :
>> > 
>> >  https://github.com/wtarreau/http2-exp
>> > 
>> > The readme is ugly when parsed as md, I've never written md docs so 
>> > it seems I'm lacking some basic practice here. But I'm sure nobody 
>> > will care, reading it in the console or as raw is OK.
>> > 
>> > Ah, there's also a debug mode which indicates what encoding is 
>> > chosen for each field and how long the resulting sequence is. It 
>> > helped me debug it, and I found it useful to understand how the 
>> > table evolves.
>> > 
>> > Comments welcome. It's my first HPACK encoder, it's very possible
>> > that I messed up a lot with certain things, though I didn't notice 
>> > that. In any case, feel free to comment/fork/fix/etc.
>> > 
>> > Best regards,
>> > Willy
>> > 
>> 
>> --
>> Mark Nottingham   http://www.mnot.net/
>> 
>> 
>> 
>> 
>
>
>--
>Peace,
>  -Nick
>

Received on Tuesday, 28 October 2014 20:08:59 UTC