- From: Koen Holtman <koen@win.tue.nl>
- Date: Fri, 9 Aug 1996 01:13:11 +0200 (MET DST)
- To: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
The benefits of reusing request headers in persistent
----------------------------------------------------
HTTP connections: A statistical analysis.
-----------------------------------------
Oct 31, 1995
Koen Holtman, koen@win.tue.nl
1. INTRODUCTION
---------------
When sending HTTP requests over a persistent (keep-alive) HTTP
connection, it would be possible to reuse request headers from
earlier requests in subsequent requests. For example, if the
User-Agent header is the same for requests n and n+1, there would be
no need to send the header twice: a special request header (using fewer
bytes) could indicate that the User-Agent header is to be reused.
Roy Fielding recently proposed a mechanism allowing such reuse. The
question is whether designing and implementing such a mechanism would
be a good move.
For:     - less HTTP traffic
         - faster browsing response time
Against: - more software complexity
         - time spent in design and implementation cannot be
           used for making other improvements
I have gathered some statistics on the size of the gains.
2. CONCLUSION
-------------
My conclusion is that the gains are too small to bother about request
header reuse at this point:
- HTTP traffic savings would be about 1.3%
- speedup of browsing response time would be minimal:
page+inline loading times would be noticeably faster in
about 17% of all cases.
Much higher gain/effort ratios can be had by focusing on other
desirable features of future HTTP software, for example
- (general) support for `Content-Encoding: gzip'
- support for sending .jpg inlines instead of .gif inlines to all
browsers that can handle .jpg
- reducing the number of Accept headers generated by some browsers
(my Mosaic for X browser sends 882 bytes of Accept headers, most of
them for MIME types I can't even view!), maybe introducing a
mechanism for reactive content negotiation at the same time.
- proxies that change multiple Accept headers in a request into one
big Accept header when relaying the request
I therefore propose to drop the subject of request header reuse on
http-wg.
Header reuse mechanisms would only get interesting again if we find
some good reason to make the average request message much larger (say
500 bytes) than it needs to be now (200 bytes).
(End of conclusions.)
Yes, you can stop reading now!
You can also page to Section 6, which contains some statistics about
the number of requests done over persistent connections.
3. HOW LARGE DO REQUEST MESSAGES NEED TO BE?
--------------------------------------------
3.1 CURRENT ACCEPT HEADER PRACTICE
-----------------------------------
I captured the request headers sent by the three browsers present on
my Linux box.
A typical Mozilla/1.12 (X11) GET request message for a normal URL:
---------------------------------------------------
GET /blah/blebber/blex.html HTTP/1.0
User-Agent: Mozilla/1.12 (X11; I; Linux 1.2.9 i486)
Referer: http://localhost/blah/blebber/wuxta.html
Accept: */*
Accept: image/gif
Accept: image/x-xbitmap
Accept: image/jpeg
---------------------------------------------------
When GETting URL contents for inline images, Mozilla omits the
`Accept: */*' header above.
Note that the four Accept headers above could be combined into a
single Accept header:
Accept: */*, image/gif, image/x-xbitmap, image/jpeg
None of the three browsers on my Linux system does such combining,
though it would make the request message shorter (see also the table
below). Is there some ancient HTTP server, not supporting
multi-element Accept headers, that they want to stay compatible with?
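As a rough illustration of the combining described above, here is a minimal Python sketch. The helper names (combine_accept, wire_size) are made up for this example, and counting one byte per line ending is an assumption. Under that assumption the four separate lines come to 73 bytes, which agrees with the Acc figure for Mozilla/1.12 (normal) in the table below. How the Ac1 column was counted is not stated, so no attempt is made to reproduce it.

```python
def combine_accept(header_lines):
    """Merge several 'Accept: x' lines into one comma-separated Accept line."""
    values = []
    for line in header_lines:
        name, _, value = line.partition(":")
        if name.strip().lower() != "accept":
            raise ValueError("not an Accept header: %r" % line)
        values.append(value.strip())
    return "Accept: " + ", ".join(values)


def wire_size(header_lines, eol="\n"):
    """Size in bytes; eol is an assumption (one byte per line ending here)."""
    return sum(len(line) + len(eol) for line in header_lines)


separate = [
    "Accept: */*",
    "Accept: image/gif",
    "Accept: image/x-xbitmap",
    "Accept: image/jpeg",
]
combined = combine_accept(separate)
print(combined)
print(wire_size(separate), wire_size([combined]))
```

A proxy could apply the same merge when relaying a request, as suggested in Section 2.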
Here is a table of typical GET request message sizes for the browsers
on my Linux system:
-----------------------+---+---+-----+----
Browser Len Acc (Ac1) Rest
-----------------------+---+---+-----+----
NCSA Mosaic for X/2.2 995 882 (299) 113
Lynx/2.3 BETA 349 248 (100) 101
Mozilla/1.12 (normal) 207 73 (36) 134
Mozilla/1.12 (inline) 194 61 (34) 133
-----------------------+---+---+-----+----
Len : #bytes in request message
Acc : #bytes in the Accept headers
(Ac1): #bytes that would be in an equivalent single-line Accept header
Rest : #bytes in non-Accept headers and first line of request
3.2 LACK OF NEED FOR LARGE ACCEPT HEADERS
-----------------------------------------
In current practice on the Web, 99% of all URLs (if not more) only
have one content variant, so the Accept headers contained in a request
are almost never used. It is unlikely that this will change in the
future.
Thus, there is no good reason for tacking large Accept headers onto a
request, now or in the future. An Accept header larger than
Accept: */*, image/gif, image/x-xbitmap, image/jpeg
is wasteful; the small number of cases not covered by the header
above could be handled by reactive content negotiation (300 and 406
responses). Note that, if a browser discovers it is doing a lot of
reactive content negotiation with a site, it could dynamically make its
Accept headers to that site larger to reduce future reactive
negotiation. So sending large Accept headers may be efficient
sometimes, but not by default.
I see the large default Accept header problem as a problem that will
disappear with browser upgrades in the near future, after a reactive
negotiation mechanism has been defined.
4. STATISTICS
-------------
To produce the statistics below, I took a set of proxy<->server HTTP
transactions between the www.win.tue.nl proxy and off-campus servers
(18 days' worth of traffic, approximately 150Mb in 14501 HTTP
transactions), and calculated what would happen if these
transactions were all done over persistent HTTP connections.
If a simulated persistent connection has been idle for 10 minutes, it
is closed.
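The simulation itself is not shown in this note; the following Python sketch is only one way it could be set up. It assumes the log is a list of (timestamp in seconds, server name) pairs and that at most one persistent connection is open to each server at a time. Both are assumptions of this sketch, as is the function name.

```python
IDLE_TIMEOUT = 10 * 60  # seconds; the 10-minute idle limit from the text


def assign_connections(transactions, idle_timeout=IDLE_TIMEOUT):
    """Group (timestamp, server) pairs into simulated persistent connections.

    Returns a list of connections, each a list of request timestamps.
    Assumes at most one open connection per server (not stated in the text).
    """
    open_conn = {}    # server -> connection currently open to it
    connections = []
    for ts, server in sorted(transactions):
        conn = open_conn.get(server)
        if conn is None or ts - conn[-1] >= idle_timeout:
            conn = []  # no connection yet, or the old one idled out
            connections.append(conn)
            open_conn[server] = conn
        conn.append(ts)
    return connections


# Made-up log: two servers, with one long pause on each.
log = [(0, "a.example"), (30, "a.example"), (45, "b.example"),
       (700, "a.example"), (5000, "b.example")]
conns = assign_connections(log)
print([len(c) for c in conns])
```

In the made-up log, the request at t=700 comes 670 seconds after the previous request to the same server, so it opens a new connection.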
4.1 HEADER SIZES
----------------
Working from the reasoning above, I take the following request
message, generated by Mozilla, as typical.
---------------------------------------------------
GET /blah/blebber/blex.html HTTP/1.0
User-Agent: Mozilla/1.12 (X11; I; Linux 1.2.9 i486)
Referer: http://localhost/blah/blebber/wuxta.html
Accept: */*
Accept: image/gif
Accept: image/x-xbitmap
Accept: image/jpeg
---------------------------------------------------
Every header in this message could potentially be reused in future
requests. Only the `GET' line will always be different.
I will use the following figures in the statistics below:
- Without header reuse, the average request size is 200 bytes
- With header reuse, the average request size is
- 200 bytes for the first request over a persistent connection
- 40 bytes for all subsequent requests over a persistent connection
- The average size of the response headers is always 180 bytes.
4.2 RESULTS
-----------
4.2.1 Size of HTTP traffic transmitted.
in response in
bodies headers total
---------------------+------------+---------+----------------
Without header reuse: 145 Mb 5.3 Mb 150.3 Mb (100.0%)
With header reuse: 145 Mb 3.3 Mb 148.3 Mb ( 98.7%)
Reuse saves: 2.0 Mb ( 1.3%)
Compared to other possible savings, 1.3% is too little to care about.
But traffic size counts are dominated by very large requests: maybe we
can get a noticeably faster response time on small requests?
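The header totals in 4.2.1 can be approximately reproduced from the figures in Section 4.1. The Python sketch below makes two assumptions that are not stated in the text: that "Mb" means 2**20 bytes, and that the number of persistent connections equals the sum of the "amount" column in the connection table of Section 6.

```python
MB = 1024 * 1024        # assumption: "Mb" in the tables means 2**20 bytes

TRANSACTIONS = 14501    # from Section 4
# Assumption: connection count is the sum of the "amount" column in the
# transactions-per-connection table of Section 6.
CONNECTIONS = 415 + 214 + 169 + 118 + 148 + 134 + 198 + 139 + 49


def header_bytes(first_req, later_req, response_hdr=180):
    """Total request + response header bytes over all transactions."""
    later = TRANSACTIONS - CONNECTIONS   # requests that can reuse headers
    requests = CONNECTIONS * first_req + later * later_req
    return requests + TRANSACTIONS * response_hdr


without_reuse = header_bytes(200, 200) / MB
with_reuse = header_bytes(200, 40) / MB
body_mb = 145.0                          # response bodies, from the table
saving_pct = 100 * (without_reuse - with_reuse) / (body_mb + without_reuse)
print(round(without_reuse, 1), round(with_reuse, 1), round(saving_pct, 1))
```

Under these assumptions the script prints 5.3, 3.3 and 1.3, in line with the table above.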
4.2.2. Response time
I use the following approximations for getting response time results:
- The sequence of requests done over each persistent HTTP connection
is divided into `wait chains'.
- Successive requests in a `wait chain' are no more than 20
seconds apart.
- The idea is that the user does not perceive the speedup of
individual HTTP transactions in a `wait chain', but only the
average transaction speedup for the whole `wait chain'.
- We want to determine the percentage of wait chains that get
noticeably faster after the introduction of header reuse.
- We assume that for a wait chain to get noticeably faster, the
HTTP traffic size generated in that wait chain must decrease
by at least 10%.
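The approximations above can be sketched in Python as follows. This is an illustration only: the function names, the (timestamp, body size) input format, and the choice to count response bodies plus request and response headers as the "traffic size" of a chain are all assumptions of this sketch.

```python
MAX_GAP = 20        # seconds between successive requests in one wait chain
NOTICEABLE = 10.0   # percent traffic decrease treated as noticeable


def split_wait_chains(requests, max_gap=MAX_GAP):
    """Split one connection's time-ordered (timestamp, body_bytes) pairs
    into wait chains."""
    chains = []
    for req in requests:
        if chains and req[0] - chains[-1][-1][0] <= max_gap:
            chains[-1].append(req)
        else:
            chains.append([req])
    return chains


def decrease_percent(chain, starts_connection,
                     full_req=200, reused_req=40, response_hdr=180):
    """Traffic decrease (in %) for one chain when request headers are reused."""
    before = sum(body + full_req + response_hdr for _, body in chain)
    saved = 0
    for i in range(len(chain)):
        # The first request on a connection still carries full headers.
        if not (starts_connection and i == 0):
            saved += full_req - reused_req
    return 100.0 * saved / before


# Made-up connection: three small responses in quick succession, then a
# large one much later.
connection = [(0, 300), (3, 200), (8, 150), (400, 20000)]
chains = split_wait_chains(connection)
for n, chain in enumerate(chains):
    pct = decrease_percent(chain, starts_connection=(n == 0))
    print(len(chain), round(pct, 1), pct >= NOTICEABLE)
```

In this made-up example the chain of three small responses clears the 10% threshold while the single large response does not, which is the effect the table below measures on the real log.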
Number of wait chains with a certain percentage of traffic decrease:
decrease % amount
-----------+-------------
0 1069 24%
1-4 1763 39%
5-9 926 21%
10-19 396 9%
20-49 278 6%
50- 70 2%
Thus, request header reuse will lead to a noticeable speedup for
17% of all wait chains.
5. ALTERNATIVE 500 BYTE SCENARIO
--------------------------------
The above statistics assume that
- Without header reuse, the average request size is 200 bytes
- With header reuse, the average request size is
- 200 bytes for the first request over a persistent connection
- 40 bytes for all subsequent requests over a persistent connection
The reasons for these assumptions are given in Section 3.
One could imagine an alternative scenario, in which we have a good (or
bad) reason to make the requests much larger. To see if introducing
header reuse is a good idea under such a scenario, I recomputed the above
statistics with the following assumptions:
- Without header reuse, the average request size is 500 bytes
- With header reuse, the average request size is
- 500 bytes for the first request over a persistent connection
- 40 bytes for all subsequent requests over a persistent connection
This gets us:
5.1.1 Size of HTTP traffic transmitted in 500 byte scenario
in response in
bodies headers total
---------------------+------------+---------+----------------
Without header reuse: 145 Mb 9.4 Mb 154.4 Mb (100.0%)
With header reuse: 145 Mb 3.7 Mb 148.7 Mb ( 96.3%)
Reuse saves: 5.7 Mb ( 3.7%)
5.1.2 Response time in 500 byte scenario
Number of wait chains with a certain percentage of traffic decrease:
decrease % amount
-----------+------------
0 809 18%
1-4 884 20%
5-9 749 17%
10-19 980 22%
20-49 718 16%
50- 362 8%
Thus, request header reuse will lead to a noticeable speedup for 46% of
all wait chains.
I conclude that header reuse becomes moderately interesting _IF_ we
find a good reason to use request messages which contain a large
amount (>460 bytes) of reusable headers.
5.1.3 Comparison between Section 4 and 500 byte scenario
---------------------------------------------------------
Traffic generated:
in response in
bodies headers
-------------------------------+------------+---------
Section 4 without header reuse: 145 Mb 5.3 Mb
Section 4 with header reuse: 145 Mb 3.3 Mb
500 byte without header reuse: 145 Mb 9.4 Mb
500 byte with header reuse: 145 Mb 3.7 Mb
Number of wait chains with a certain percentage of traffic decrease,
when going from Section 4 _without_ reuse to 500 byte _with_ reuse:
decrease % amount
-----------+------------
- -21 121 3%
-20 - -11 189 4%
-10 - -6 198 4%
-5 - -1 442 10%
0 - 4 2088 46%
5 - 9 784 17%
10 - 19 356 8%
20 - 324 7%
(7% of wait chains get noticeably slower, 15% get noticeably faster)
6. RANDOM STATISTICS
-------------------
The statistics below are not very relevant for deciding about reuse,
but they are nice to have anyway.
Percentage of proxy<->server responses with a certain response body size:
body size (bytes) amount cumulative amount
------------------+------+-----------------
0-99 4% 4%
100-199 4% 8%
200-499 8% 16%
500-999 9% 25%
1000-1999 19% 44%
2000-4999 25% 69%
5000-9999 16% 85%
10000-19999 7% 92%
20000-49999 6% 97%
50000-99999 2% 99%
100000- 1% 100%
Number of persistent proxy<->server connections over which a certain
number of HTTP transactions are made (the connections have a timeout
of 10 minutes):
- on average, one persistent connection gets 9.2 transactions.
# of transactions amount cumulative amount
------------------+-----------+-----------------
1 415 26% 26%
2 214 14% 40%
3 169 11% 50%
4 118 7% 58%
5-6 148 9% 67%
7-9 134 8% 76%
10-19 198 12% 88%
20-49 139 9% 97%
50- 49 3% 100%
(End of document.)
Received on Thursday, 8 August 1996 16:18:38 UTC