- From: Martin Hamilton <martin@mrrl.lut.ac.uk>
- Date: Sat, 03 Aug 1996 18:22:15 +0100
- To: Luigi Rizzo <luigi@labinfo.iet.unipi.it>
- Cc: Erik Aronesty <earonesty@montgomery.com>, http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
Luigi Rizzo writes:

| While the idea is sound, some time ago I did some measurements on our
| proxy cache: out of ~300 MB of files in the cache, only about 7 MB
| were duplicates with a different name. I only considered files with
| the same size (after stripping metadata), so I might have missed
| something, say text files with different end-of-line conventions;
| also, this test should really be repeated on a larger data set.
| Anyway, I am not very convinced that the savings are worth the
| effort of handling multiple headers for the same object (whereas I
| was *before* doing this test).

I got curious about this a little while back, and wrote a little Perl
program to calculate MD5 checksums of the objects in our
(local/regional ?) cache, so we could see how many were dups. The
results weren't very encouraging...

<URL:http://www.roads.lut.ac.uk/lists/ircache/0202.html>

Martin
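
A minimal sketch of this sort of duplicate-finding script, for
illustration only (it assumes Perl with the Digest::MD5 and File::Find
modules and a cache stored as ordinary files on disk; it is not
necessarily the program referred to above):

#!/usr/bin/perl
# Hash every file under a cache directory and report groups of
# files that share an MD5 checksum, i.e. byte-identical duplicates.
use strict;
use warnings;
use File::Find;
use Digest::MD5;

my $cache = shift || '.';    # cache root to scan
my %paths_by_sum;            # checksum => list of file paths

find(sub {
    return unless -f $_;
    open my $fh, '<', $_ or return;
    binmode $fh;
    my $sum = Digest::MD5->new->addfile($fh)->hexdigest;
    close $fh;
    push @{ $paths_by_sum{$sum} }, $File::Find::name;
}, $cache);

# Print only the checksums seen more than once.
for my $sum (sort keys %paths_by_sum) {
    my @dups = @{ $paths_by_sum{$sum} };
    print "$sum\n  ", join("\n  ", @dups), "\n" if @dups > 1;
}

Run against a cache root (e.g. "perl find-dups.pl /var/spool/cache";
the script name and path here are illustrative), it prints each
checksum that maps to more than one file, giving a quick measure of
how much of the cache is duplicated under different URLs.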
Received on Saturday, 3 August 1996 10:25:04 UTC