- From: Koen Holtman <koen@win.tue.nl>
- Date: Fri, 9 Aug 1996 01:13:11 +0200 (MET DST)
- To: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
The benefits of reusing request headers in persistent
----------------------------------------------------
HTTP connections: A statistical analysis.
-----------------------------------------

Oct 31, 1995
Koen Holtman, koen@win.tue.nl


1. INTRODUCTION
---------------

When sending HTTP requests over a persistent (keep-alive) HTTP
connection, it would be possible to re-use request headers from
earlier requests in subsequent requests.  For example, if the
User-agent headers for requests n and n+1 are the same, there would be
no need to send the header twice: a special request header (using
fewer bytes) could indicate that the User-agent header is to be
reused.

Roy Fielding recently proposed a mechanism allowing such reuse.  The
question is whether designing and implementing such a mechanism would
be a good move.

For:
  - less HTTP traffic
  - faster browsing response time

Against:
  - more software complexity
  - time spent in design and implementation cannot be used for
    making other improvements

I have compiled some statistics about the size of the gains.


2. CONCLUSION
-------------

My conclusion is that the gains are too small to bother about request
header reuse at this point:

  - HTTP traffic savings would be about 1.3%
  - speedup of browsing response time would be minimal: page+inline
    loading times would be noticeably faster in about 17% of all
    cases.

Much higher gain/effort ratios can be had by focusing on other
desirable features of future HTTP software, for example

  - (general) support for `Content-Encoding: gzip'
  - support for sending .jpg inlines instead of .gif inlines to all
    browsers that can handle .jpg
  - reducing the number of Accept headers generated by some browsers
    (my Mosaic for X browser sends 882 bytes of accept headers, most
    of them for MIME types I can't even view!), maybe introducing a
    mechanism for reactive content negotiation at the same time.
  - proxies that change multiple Accept headers in a request into one
    big Accept header when relaying the request

I therefore propose to drop the subject of request header reuse on
http-wg.  Header reuse mechanisms would only get interesting again if
we find some good reason to make the average request message much
larger (say 500 bytes) than it needs to be now (200 bytes).

(End of conclusions.)

Yes, you can stop reading now!  You can also page to Section 6, which
contains some statistics about the number of requests done over
persistent connections.


3. HOW LARGE DO REQUEST MESSAGES NEED TO BE?
--------------------------------------------

3.1 CURRENT ACCEPT HEADER PRACTICE
-----------------------------------

I captured the request headers sent by the three browsers present on
my Linux box.

A typical Mozilla/1.12 (X11) GET request message for a normal URL:

---------------------------------------------------
GET /blah/blebber/blex.html HTTP/1.0
User-Agent: Mozilla/1.12 (X11; I; Linux 1.2.9 i486)
Referer: http://localhost/blah/blebber/wuxta.html
Accept: */*
Accept: image/gif
Accept: image/x-xbitmap
Accept: image/jpeg
---------------------------------------------------

When GETting URL contents for inline images, Mozilla omits the
`Accept: */*' header above.

Note that the four Accept headers above could be combined into a
single Accept header:

  Accept: */* image/gif image/x-xbitmap image/jpeg

None of the three browsers on my Linux system does such combining,
though it would make the request message shorter (see also the table
below).  Is there some ancient HTTP server, not supporting
multi-element Accept headers, that they want to stay compatible with?
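The combining step described above (and the proxy behaviour suggested in Section 2) can be sketched in a few lines of Python. This is only an illustration, not anything a browser or proxy of the time is known to do: the names `combine_accept` and `wire_size` are made up for the example, the merged values are joined with commas (the list syntax HTTP defines for Accept, whereas the combined header above is written with spaces), and one byte per line ending is an assumption, so the byte counts it prints need not agree with any column of the table that follows.

```python
def combine_accept(header_lines):
    """Merge repeated Accept header lines into a single Accept line.

    Hypothetical helper: non-Accept lines keep their order and the
    merged Accept line is appended after them.
    """
    accept_values = []
    other_lines = []
    for line in header_lines:
        name, _, value = line.partition(":")
        if name.strip().lower() == "accept":
            accept_values.append(value.strip())
        else:
            other_lines.append(line)
    if accept_values:
        other_lines.append("Accept: " + ", ".join(accept_values))
    return other_lines


def wire_size(header_lines, eol="\n"):
    """Bytes taken by the lines, assuming ASCII text and the given line ending."""
    return sum(len(line.encode("ascii")) + len(eol) for line in header_lines)


# The four Accept lines from the Mozilla/1.12 request shown above.
MOZILLA_ACCEPT = [
    "Accept: */*",
    "Accept: image/gif",
    "Accept: image/x-xbitmap",
    "Accept: image/jpeg",
]

combined = combine_accept(MOZILLA_ACCEPT)
print(combined[0])
print(wire_size(MOZILLA_ACCEPT), "bytes separate,", wire_size(combined), "bytes combined")
```

The saving comes purely from dropping the repeated `Accept: ' prefixes and line endings, which is why it grows with the number of Accept lines a browser sends.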
Here is a table of typical GET request message sizes for the browsers
on my Linux system:

-----------------------+-----+-----+-------+-----
Browser                  Len   Acc   (Ac1)  Rest
-----------------------+-----+-----+-------+-----
NCSA Mosaic for X/2.2    995   882   (299)   113
Lynx/2.3 BETA            349   248   (100)   101
Mozilla/1.12 (normal)    207    73    (36)   134
Mozilla/1.12 (inline)    194    61    (34)   133
-----------------------+-----+-----+-------+-----

 Len  : #bytes in request message
 Acc  : #bytes in the Accept headers
 (Ac1): #bytes that would be in an equivalent single-line Accept header
 Rest : #bytes in non-Accept headers and first line of request


3.2 LACK OF NEED FOR LARGE ACCEPT HEADERS
-----------------------------------------

In current practice on the Web, 99% of all URLs (if not more) only
have one content variant, so the Accept headers contained in a request
are almost never used.  It is unlikely that this will change in the
future.

Thus, there is no good reason for tacking large Accept headers onto a
request, now or in the future.  An accept header larger than

  Accept: */* image/gif image/x-xbitmap image/jpeg

is wasteful; the small number of cases not covered by the header above
could be solved by reactive content negotiation (300 and 406
responses).

Note that, if a browser discovers it is doing a lot of reactive
content negotiation to a site, it could dynamically make its Accept
headers to that site larger to reduce future reactive negotiation.  So
sending large Accept headers may be efficient sometimes, but not by
default.

I see the large default Accept header problem as a problem that will
disappear with browser upgrades in the near future, after a reactive
negotiation mechanism has been defined.


4. STATISTICS
-------------

To make the statistics below, I took a set of proxy<->server HTTP
transactions between the www.win.tue.nl proxy and off-campus servers
(18 days' worth of traffic, approximately 150Mb in 14501 HTTP
transactions), and calculated what would happen if these transactions
were all done over persistent HTTP connections.

If a simulated persistent connection has been idle for 10 minutes, it
is closed.

4.1 HEADER SIZES
----------------

Working from the reasoning above, I take the following request
message, generated by Mozilla, as typical.

---------------------------------------------------
GET /blah/blebber/blex.html HTTP/1.0
User-Agent: Mozilla/1.12 (X11; I; Linux 1.2.9 i486)
Referer: http://localhost/blah/blebber/wuxta.html
Accept: */*
Accept: image/gif
Accept: image/x-xbitmap
Accept: image/jpeg
---------------------------------------------------

Every header in this message could potentially be reused in future
requests.  Only the `GET' line will always be different.

I will use the following figures in the statistics below:

  - Without header reuse, the average request size is 200 bytes
  - With header reuse, the average request size is
      - 200 bytes for the first request over a persistent connection
      - 40 bytes for all subsequent requests over a persistent
        connection
  - The average size of the response headers is always 180 bytes.

4.2 RESULTS
-----------

4.2.1 Size of HTTP traffic transmitted.

                        in response   in
                        bodies        headers   total
 ---------------------+------------+---------+------------------
 Without header reuse:   145 Mb       5.3 Mb    150.3 Mb (100.0%)
 With header reuse:      145 Mb       3.3 Mb    148.3 Mb ( 98.7%)

 Reuse saves:                         2.0 Mb             (  1.3%)

Compared to other possible savings, 1.3% is too little to care about.

But traffic size counts are dominated by very large requests: maybe we
can get a noticeably faster response time on small requests?

4.2.2. Response time

I use the following approximations for getting response time results:

  - The sequence of requests done over each persistent HTTP connection
    is divided into `wait chains'.
  - Successive requests in a `wait chain' are no more than 20 seconds
    apart.
      - the idea is that the user does not perceive the speedup of
        individual HTTP transactions in a `wait chain', but only the
        average transaction speedup for the whole `wait chain'.
  - We want to determine the percentage of wait chains that get
    noticeably faster after the introduction of header reuse.
  - We assume that for a wait chain to get noticeably faster, the HTTP
    traffic size generated in that wait chain must decrease by at
    least 10%.

Number of wait chains with a certain percentage of traffic decrease:

  decrease %     amount
 -----------+-------------
   0           1069   24%
   1-4         1763   39%
   5-9          926   21%
  10-19         396    9%
  20-49         278    6%
  50-            70    2%

Thus, request header reuse will lead to a noticeable speedup for 17%
of all wait chains.


5. ALTERNATIVE 500 BYTE SCENARIO
--------------------------------

The above statistics assume that

  - Without header reuse, the average request size is 200 bytes
  - With header reuse, the average request size is
      - 200 bytes for the first request over a persistent connection
      - 40 bytes for all subsequent requests over a persistent
        connection

The reasons for these assumptions are given in Section 3.

One could imagine an alternative scenario, in which we have a good (or
bad) reason to make the requests much larger.
To see if introducing header reuse is a good idea under such a
scenario, I computed the above statistics again with the following
assumptions:

  - Without header reuse, the average request size is 500 bytes
  - With header reuse, the average request size is
      - 500 bytes for the first request over a persistent connection
      - 40 bytes for all subsequent requests over a persistent
        connection

This gets us:

5.1.1 Size of HTTP traffic transmitted in 500 byte scenario

                        in response   in
                        bodies        headers   total
 ---------------------+------------+---------+------------------
 Without header reuse:   145 Mb       9.4 Mb    154.4 Mb (100.0%)
 With header reuse:      145 Mb       3.7 Mb    148.7 Mb ( 96.3%)

 Reuse saves:                         5.7 Mb             (  3.7%)

5.1.2 Response time in 500 byte scenario

Number of wait chains with a certain percentage of traffic decrease:

  decrease %     amount
 -----------+------------
   0            809   18%
   1-4          884   20%
   5-9          749   17%
  10-19         980   22%
  20-49         718   16%
  50-           362    8%

Thus, request header reuse will lead to a noticeable speedup for 46%
of all wait chains.

I conclude that header reuse becomes moderately interesting _IF_ we
find a good reason to use request messages which contain a large (>460
bytes) amount of reusable headers.

5.1.3 Comparison between Section 4 and 500 byte scenario
---------------------------------------------------------

Traffic generated:

                                  in response   in
                                  bodies        headers
 -------------------------------+------------+---------
 Section 4 without header reuse:   145 Mb       5.3 Mb
 Section 4 with header reuse:      145 Mb       3.3 Mb
 500 byte without header reuse:    145 Mb       9.4 Mb
 500 byte with header reuse:       145 Mb       3.7 Mb

Number of wait chains with a certain percentage of traffic decrease,
when going from Section 4 _without_ reuse to 500 byte _with_ reuse:

  decrease %      amount
 -------------+------------
       - -21     121    3%
   -20 - -11     189    4%
   -10 -  -6     198    4%
    -5 -  -1     442   10%
     0 -   4    2088   46%
     5 -   9     784   17%
    10 -  19     356    8%
    20 -         324    7%

(7% of wait chains get noticeably slower, 15% get noticeably faster)


6. RANDOM STATISTICS
--------------------

The statistics below are not very relevant for deciding about reuse,
but they are nice to have anyway.

Percentage of proxy<->server responses with a certain response body
size:

  body size (bytes)   amount   cumulative amount
 ------------------+--------+-------------------
       0-99              4%         4%
     100-199             4%         8%
     200-499             8%        16%
     500-999             9%        25%
    1000-1999           19%        44%
    2000-4999           25%        69%
    5000-9999           16%        85%
   10000-19999           7%        92%
   20000-49999           6%        97%
   50000-99999           2%        99%
  100000-                1%       100%

Number of persistent proxy<->server connections over which a certain
number of HTTP transactions are made (the connections have a timeout
of 10 minutes):

  - on average, one persistent connection gets 9.2 transactions.

  # of transactions    amount     cumulative amount
 ------------------+-----------+-------------------
    1                 415  26%         26%
    2                 214  14%         40%
    3                 169  11%         50%
    4                 118   7%         58%
    5-6               148   9%         67%
    7-9               134   8%         76%
   10-19              198  12%         88%
   20-49              139   9%         97%
   50-                 49   3%        100%

(End of document.)
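As a cross-check, the header traffic totals of Sections 4.2.1 and 5.1.1 can be recomputed from the per-message figures of Section 4.1 and the connection counts in the table above. The sketch below is a reconstruction under stated assumptions, not the script used for the original statistics: it takes the number of persistent connections to be the sum of the amounts in the Section 6 table, charges the full request size once per connection, and reads `Mb' as 2**20 bytes. The function name `header_traffic` is made up for the example.

```python
MIB = 1024 * 1024  # assumption: "Mb" in the tables means 2**20 bytes

TRANSACTIONS = 14501  # proxy<->server transactions (Section 4)

# Persistent connections per row of the Section 6 table
# (1, 2, 3, 4, 5-6, 7-9, 10-19, 20-49, 50- transactions).
CONNECTIONS = sum([415, 214, 169, 118, 148, 134, 198, 139, 49])

RESPONSE_HEADER_BYTES = 180  # Section 4.1
REUSED_REQUEST_BYTES = 40    # Section 4.1: later requests on a connection


def header_traffic(request_bytes, reuse):
    """Total request plus response header bytes over all transactions."""
    if reuse:
        # Assumption: only the first request on each connection is full size.
        first = CONNECTIONS * request_bytes
        later = (TRANSACTIONS - CONNECTIONS) * REUSED_REQUEST_BYTES
        request_total = first + later
    else:
        request_total = TRANSACTIONS * request_bytes
    return request_total + TRANSACTIONS * RESPONSE_HEADER_BYTES


for size in (200, 500):
    without = header_traffic(size, reuse=False) / MIB
    with_reuse = header_traffic(size, reuse=True) / MIB
    print(f"{size}-byte requests: {without:.1f} Mb without reuse, "
          f"{with_reuse:.1f} Mb with reuse")
```

Under these assumptions the sketch gives 5.3 Mb and 3.3 Mb for 200-byte requests and 9.4 Mb and 3.7 Mb for 500-byte requests, which agrees with the `in headers' columns of Sections 4.2.1 and 5.1.1.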
Received on Thursday, 8 August 1996 16:18:38 UTC