- From: Martin Hamilton <martin@mrrl.lut.ac.uk>
- Date: Sat, 03 Aug 1996 18:22:15 +0100
- To: Luigi Rizzo <luigi@labinfo.iet.unipi.it>
- Cc: Erik Aronesty <earonesty@montgomery.com>, http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
Luigi Rizzo writes:

| While the idea is sound, some time ago I did some measurements on our
| proxy cache: out of ~300 MB of files in the cache, only about 7 MB
| were duplicates with a different name. I only considered files with
| the same size (after stripping metadata), so I might have missed
| something, say text files with different end-of-line conventions;
| also, this test should really be repeated on a larger data set.
| Anyway, I am not very convinced that the savings are worth the
| effort of handling multiple headers for the same object (whereas I
| was *before* doing this test).

I got curious about this a little while back, and wrote a little Perl
program to calculate MD5 checksums of the objects in our
(local/regional ?) cache, so we could see how many were dups. The
results weren't very encouraging...

<URL:http://www.roads.lut.ac.uk/lists/ircache/0202.html>

Martin
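
A minimal sketch of this sort of duplicate-finding script, for
illustration only (it assumes Perl with the Digest::MD5 and File::Find
modules and a cache stored as ordinary files on disk; it is not
necessarily the program referred to above):

#!/usr/bin/perl
# Hash every file under a cache directory and report groups of
# files that share an MD5 checksum, i.e. byte-identical duplicates.
use strict;
use warnings;
use File::Find;
use Digest::MD5;

my $cache = shift || '.';    # cache root to scan
my %paths_by_sum;            # checksum => list of file paths

find(sub {
    return unless -f $_;
    open my $fh, '<', $_ or return;
    binmode $fh;
    my $sum = Digest::MD5->new->addfile($fh)->hexdigest;
    close $fh;
    push @{ $paths_by_sum{$sum} }, $File::Find::name;
}, $cache);

# Print only the checksums seen more than once.
for my $sum (sort keys %paths_by_sum) {
    my @dups = @{ $paths_by_sum{$sum} };
    print "$sum\n  ", join("\n  ", @dups), "\n" if @dups > 1;
}

Run against a cache root (e.g. "perl find-dups.pl /var/spool/cache";
the script name and path here are illustrative), it prints each
checksum that maps to more than one file, giving a quick measure of
how much of the cache is duplicated under different URLs.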
Received on Saturday, 3 August 1996 10:25:04 UTC