- From: Luigi Rizzo <luigi@labinfo.iet.unipi.it>
- Date: Tue, 22 Apr 1997 15:58:56 +0200 (MET DST)
- To: Fred Douglis <douglis@research.att.com>
- Cc: http-wg@cuckoo.hpl.hp.com
> > I have done a quick test on the content of our proxy cache: for each
...
> > which is not a very rigorous test (since files in the cache contain the
...
> > with a saving, due to compression, of approximately 13%. I suspect the
> > actual use of compression would result in lower performance since
> > most files are short and headers compress a lot, thus biasing my result
> > toward better performance. These results can be explained with the fact
> > that large material is generally in compressed form at the source
...

> Another way to look at this is that not only is "large" textual data, such as
> postscript, often compressed, but images are inherently compressed. Can you

Of course. Potentially large material is compressed most of the time
(because of its native format, or because the provider is trying to save
bandwidth).

> tell us what fraction of files in your cache are content-type image/*
> (and the like) as opposed to text?

Counting them now, they are about 73% over 17000 files (2/3 of the
cache, which is rather small).

> An aside: does anyone know what the difference in compression will be between
>     cat * | gzip
> and
>     for i in *; do gzip "$i"; done ?
>
> My guess is that by glomming everything together you are getting better
> compression than you would in practice, when each file is compressed
> distinctly, due to the adaptive algorithms -- here you may use data from file
> X to do a better job compressing Y.

Generally speaking, this is correct. In this specific case, however, I
suspect that the advantage is only achieved on the HTTP headers (which
are stored with the body), since a large amount of the data does not
really compress. And for compressing headers there are probably more
efficient ways (using tokens for the keywords, binary representation of
dates, times and numbers, etc.).

	Cheers
	Luigi
-----------------------------+--------------------------------------
Luigi Rizzo                  |  Dip. di Ingegneria dell'Informazione
email: luigi@iet.unipi.it    |  Universita' di Pisa
tel: +39-50-568533           |  via Diotisalvi 2, 56126 PISA (Italy)
fax: +39-50-568522           |  http://www.iet.unipi.it/~luigi/
_____________________________|______________________________________
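
For anyone who wants to measure the difference asked about in the quoted aside (one gzip stream over everything versus one gzip stream per file), here is a minimal sketch in Python. The function name compare_compression and the sample responses are made up for illustration; they are not taken from the cache discussed in this message, and real cache contents (mostly images) would behave differently.

```python
import gzip


def compare_compression(files):
    """Return (separate_total, combined_total) in bytes.

    files is a list of bytes objects, one per cached entry.
    separate_total mimics gzipping each file on its own;
    combined_total mimics `cat * | gzip`.
    """
    separate_total = sum(len(gzip.compress(data)) for data in files)
    combined_total = len(gzip.compress(b"".join(files)))
    return separate_total, combined_total


# Hypothetical sample: many short text responses with near-identical
# HTTP headers stored in front of the body.
sample = [
    (
        "HTTP/1.0 200 OK\r\n"
        "Server: example\r\n"
        "Content-Type: text/html\r\n"
        "\r\n"
        f"<html><body>page {i}</body></html>"
    ).encode("ascii")
    for i in range(200)
]

separate, combined = compare_compression(sample)
print("per-file total:", separate, "bytes")
print("single stream: ", combined, "bytes")
```

On short, header-heavy entries like these, the single stream should come out much smaller, because the repeated headers are only paid for once and each separate gzip stream carries its own fixed overhead. On data that is already compressed, the two totals should be close.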
Received on Tuesday, 22 April 1997 07:49:01 UTC