- From: Willy Tarreau <w@1wt.eu>
- Date: Sat, 13 Oct 2012 00:53:19 +0200
- To: Roberto Peon <grmocg@gmail.com>
- Cc: Poul-Henning Kamp <phk@phk.freebsd.dk>, James M Snell <jasnell@gmail.com>, ietf-http-wg@w3.org
Hi Roberto,

On Fri, Oct 12, 2012 at 02:49:20PM -0700, Roberto Peon wrote:
> The most recent output, copy/pasted is:
>
> "Delta-coding took: 0.642199 seconds for: 104300 header frames or
> 6.15723e-06 per header or 162411 headers/sec or 8.88429e+07 bytes/sec"
>
> So, ~89 million bytes/second and 162k requests/second for the delta-coding
> on one core.

It does not seem bad, and I also know that it's hard to compare numbers.
I like to count in terms of bytes per second or headers per second, but
obviously it depends on the coding scheme.

Right now I made a test on haproxy using a request to pinterest.com that
I captured from Firefox 13 (didn't know the site so I tried it). The
request looks like this, it's 282 bytes long and has 7 header fields :

  GET / HTTP/1.1
  Host: pinterest.com
  User-Agent: Mozilla/5.0 (X11; Linux i686; rv:13.0) Gecko/20100101 Firefox/13.0.1
  Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
  Accept-Language: en-us,en;q=0.5
  Accept-Encoding: gzip, deflate
  Connection: keep-alive

It was running on a single core of a core i7 2600 @3.4 GHz. I sent it
4 million times to haproxy which sent a redirect on them, and it took
3.609 seconds for the 4 million reqs, which is 1.1 million req/s, which
is also exactly the same number as it reports in the stats, and 312 MB/s.

So at first glance it's 3.5-7 times faster on a single core than the
compressor alone. So this would mean that it would spend 88% of its CPU
time in the compressor alone, and the 12% remaining doing its job.

I understand the code is not optimized yet, but this typically is the
type of thing I want us to be extremely careful about, because it's very
easy to completely kill performance for the last percent of optimization
over the wire.

In fact I'm not that much worried for the 1.1-to-2.0 conversion because
as time goes, the need for this work will fade away and won't represent
most of the CPU usage. But routing and processing 2.0 to 2.0 should be
optimally fast.
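
For readers who want to check the arithmetic behind the "3.5-7 times" and "88%" figures, here is a minimal sketch in Python that uses only the numbers quoted in this thread. The variable names are illustrative. The CPU-share estimate assumes that one header frame corresponds to one request and that the compressor and proxy costs simply add up on the same core, which is a simplification and not something measured here.

```python
# Figures quoted in the thread.
compressor_seconds = 0.642199        # time reported by the delta-coder
compressor_frames = 104300           # header frames processed
compressor_bytes_per_sec = 8.88429e7 # bytes/sec reported by the delta-coder

haproxy_requests = 4_000_000         # requests sent to haproxy
haproxy_seconds = 3.609              # time taken for all of them
request_bytes = 282                  # size of the captured request

# Throughput of each side.
compressor_hps = compressor_frames / compressor_seconds  # about 162k headers/s
haproxy_rps = haproxy_requests / haproxy_seconds         # about 1.1M req/s
haproxy_bps = haproxy_rps * request_bytes                # about 312 MB/s

# The two ends of the "3.5-7 times faster" range.
ratio_by_requests = haproxy_rps / compressor_hps          # close to 7
ratio_by_bytes = haproxy_bps / compressor_bytes_per_sec   # close to 3.5

# Assumption: per-request costs add serially on one core.
t_compressor = compressor_seconds / compressor_frames
t_proxy = haproxy_seconds / haproxy_requests
compressor_share = t_compressor / (t_compressor + t_proxy)

print(f"haproxy: {haproxy_rps:,.0f} req/s, {haproxy_bps / 1e6:.1f} MB/s")
print(f"speed ratio: {ratio_by_bytes:.1f}x by bytes, {ratio_by_requests:.1f}x by requests")
print(f"estimated compressor CPU share: {compressor_share:.0%}")
```

Under those assumptions the request-based ratio comes out just under 7, the byte-based ratio around 3.5, and the compressor share a little below the 88% obtained by rounding the ratio to 7 (7/8 of the time).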

> Given that most machines have more than once core, this
> performance seems pretty reasonable to me, especially given that it is not
> really been optimized. I'd hope that bandwidth was predominantly
> entity-bodies and not headers, and so the bandwidth doesn't seem
> problematic to me either.

Unfortunately my experience has often been that most of the requests are
if-modified-since and most of the responses are 304 on a number of web
sites, so we still need to be careful.

Thanks for sharing your data BTW !

Willy

Received on Friday, 12 October 2012 22:53:47 UTC