Re: Pipelining and compression effect on HTTP/1.1 proxies

From: Fred Douglis <douglis@research.att.com>
Date: Tue, 22 Apr 1997 09:58:39 -0400
Message-Id: <199704221358.JAA23099@raptor.research.att.com>
To: Luigi Rizzo <luigi@labinfo.iet.unipi.it>
Cc: http-wg@cuckoo.hpl.hp.com
> I have done a quick test on the content of our proxy cache: for each
> directory, I have compared the output of
> 
> 	cat * | wc
> and
> 	cat * | gzip | wc
> 
> which is not a very rigorous test (since files in the cache contain the
> HTTP header as well, and merging files before compression changes
> the results a little bit) but gives the idea.
> 
> The total byte count is as follows:
> 
> 	Uncompressed:	316.407.346
> 	Compressed:	274.892.797
> 
> with a saving, due to compression, of approximately 13%. I suspect the
> actual use of compression would result in lower performance since
> most files are short and headers compress a lot, thus biasing my result
> toward better performance. These results can be explained by the fact
> that large material is generally in compressed form at the source,
> hence the additional compression is ineffective.


Another way to look at this is that not only is "large" textual data, such as
PostScript, often compressed, but images are inherently compressed.  Can you
tell us what fraction of files in your cache are content-type image/* (and the 
like) as opposed to text?  

In any case, I agree with your conclusion, in the sense that no matter what 
the cause of the poor compression is, the end result is that compression will 
only do so much.

An aside: does anyone know what the difference in compression will be between
	cat * | gzip
and
	for i in *; do gzip "$i"; done   ?

My guess is that by glomming everything together you are getting better
compression than you would in practice, where each file is compressed
separately. Because the algorithm is adaptive, it can use data from file
X to do a better job compressing file Y.
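
For anyone who wants to check that guess, here is a minimal sketch in Python
using the standard gzip module. The helper names (gzip_sizes,
make_sample_files) and the synthetic "cache entries" are assumptions made up
for illustration; they are not taken from any real proxy cache, and the sizes
they produce say nothing about Luigi's data.

```python
import gzip
import random


def gzip_sizes(files):
    """Return (separate, concatenated) compressed sizes in bytes.

    separate     roughly matches: for i in *; do gzip -c "$i"; done | wc -c
    concatenated roughly matches: cat * | gzip | wc -c
    """
    # Each file is compressed on its own, so it pays its own gzip
    # header/trailer and cannot reuse strings seen in other files.
    separate = sum(len(gzip.compress(data)) for data in files)
    # All files are joined first, so later files can refer back to
    # earlier data that is still inside the compressor's window.
    concatenated = len(gzip.compress(b"".join(files)))
    return separate, concatenated


def make_sample_files(count=50, seed=0):
    """Build small synthetic cache entries (hypothetical data, not a real cache)."""
    rng = random.Random(seed)
    words = [b"proxy", b"cache", b"header", b"image", b"text", b"server"]
    files = []
    for n in range(count):
        body = b" ".join(rng.choice(words) for _ in range(30))
        files.append(
            b"HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n"
            b"<html><body>" + body + b" %d</body></html>\n" % n
        )
    return files


if __name__ == "__main__":
    separate, concatenated = gzip_sizes(make_sample_files())
    print("separate:    ", separate)
    print("concatenated:", concatenated)
```

On short files that share headers and markup, the concatenated figure should
come out smaller, which is the bias described above. To try it on real data,
replace make_sample_files() with a list of the byte contents of the files in
one cache directory.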


-- 

Fred Douglis 		    MIME accepted	  douglis@research.att.com
AT&T Labs - Research				     908 582-3633 (office)
600 Mountain Ave., Rm. 2B-105 			        908 582-3063 (fax)
Murray Hill, NJ 07974                http://www.research.att.com/~douglis/

As of 6/1/97:
AT&T Labs - Research
180 Park Ave, Room A181
Florham Park, NJ 07932-0971
973-360-8775 (office)
973-360-8871 (fax)
Received on Tuesday, 22 April 1997 07:02:34 EDT
