- From: Martin Hamilton <martin@mrrl.lut.ac.uk>
- Date: Sat, 10 Aug 1996 07:54:11 +0100
- To: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
Erik Aronesty writes:

| A significant number of hits for certain documents
| could have been reduced if the your proxy had reported a
| document-hash to the client in the header.

Yep! But: not nearly as many as I'd expected. What happened to all those Cindy Crawford pictures? ;-)

My take on this was to hack some very limited support for the "Content-MD5:" header into the Apache and NCSA HTTP servers, since the proxy doesn't really want to have to go to the trouble of calculating this sort of thing itself? It already has quite a lot to do, and quickly! Using MD5 (or whatever) to check that you got what you asked for would require an MD5 calculation on the part of the proxy for each URL retrieved, which is likely to be a no-no for all but the most anally-retentive?

There was a discussion which led up to this, but as I recall it was split across a number of participants, private and public mail...

Unfortunately, my feet haven't really touched the ground very much lately, and I haven't had the opportunity to sit down with the code again and make it "production strength" - if you look at the sources you'll see that it ships disabled by default. Phew! The world is saved from my lame attempts at C programming :-)

I think the next step, and what's required to make this really work, is for the target HTTP servers themselves to generate and maintain a *cache* of checksums. Being very lazy, I'm inclined to do this by putting them in a hash database. A purpose-built in-memory cache would be faster, but feels like it would be quite painful to code up. There are a few nasties, like locking strategies on the cache when you have a pool of servers, but it's doable and if I don't get around to doing the extra work I'm sure somebody else will (eventually).

A lazier-than-thou first step would be to have a separate process which went around generating the checksum cache periodically, so that the HTTP server itself doesn't need to be doing anything particularly clever.
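
A rough sketch of what that separate checksum-generating process might look like, written in Python purely for illustration (the code referred to above is C and is not reproduced here). The function names, the on-disk value layout ("mtime digest") and the choice of Python's `dbm` module as the hash database are all assumptions. The digest is base64-encoded, which is the form the Content-MD5 header carries under RFC 1864.

```python
import base64
import dbm
import hashlib
import os


def content_md5(path, chunk_size=65536):
    """Return the base64-encoded MD5 digest of a file (Content-MD5 form)."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read in chunks so large documents are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return base64.b64encode(digest.digest()).decode("ascii")


def refresh_checksum_cache(doc_root, cache_path):
    """Walk doc_root and (re)compute checksums for new or modified files.

    Hypothetical layout: key = path relative to doc_root,
    value = b"<mtime in ns> <base64 md5>".
    Returns the number of entries computed during this sweep.
    """
    updated = 0
    with dbm.open(cache_path, "c") as cache:
        for dirpath, _dirnames, filenames in os.walk(doc_root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                mtime = str(os.stat(path).st_mtime_ns).encode("ascii")
                key = os.path.relpath(path, doc_root).encode("utf-8")
                old = cache.get(key)
                # Skip files whose modification time matches the cached entry.
                if old is not None and old.split(b" ", 1)[0] == mtime:
                    continue
                cache[key] = mtime + b" " + content_md5(path).encode("ascii")
                updated += 1
    return updated


def lookup_content_md5(cache_path, rel_path):
    """Return the cached Content-MD5 value for rel_path, or None if absent."""
    with dbm.open(cache_path, "r") as cache:
        value = cache.get(rel_path.encode("utf-8"))
    if value is None:
        return None
    return value.split(b" ", 1)[1].decode("ascii")
```

Run from cron or a sleep loop, `refresh_checksum_cache` does all the hashing, and the HTTP server only needs the read-only lookup when it builds its response headers. The sketch assumes the cache file lives outside the document root, and it sidesteps the locking question for a pool of servers entirely: readers may see a slightly stale entry between sweeps.
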
Loosely consistent!

ObIETF: Is "Content-MD5:" the right way to go about this? Should http-spec-v11-* note this use of MD5? What about other algorithms?

Martin

PS In case it's not obvious - the rationale is that over time the proxy can automagically "learn" about replicated resources. So, you can take your URNs and stuff them up your...!
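
One way to picture the "learning" in the PS is a table on the proxy keyed by the Content-MD5 value each origin server reports, so that two URLs carrying the same digest are treated as copies of one resource. The following is a minimal Python sketch under that assumption; `ReplicaIndex` and its methods are hypothetical names, not part of any existing proxy.

```python
from collections import defaultdict


class ReplicaIndex:
    """Hypothetical in-memory map from Content-MD5 value to the URLs reporting it."""

    def __init__(self):
        self._urls_by_digest = defaultdict(set)
        self._digest_by_url = {}

    def record(self, url, content_md5):
        """Note the digest seen in a response header for url."""
        old = self._digest_by_url.get(url)
        if old is not None and old != content_md5:
            # The document changed, so it no longer matches its old replicas.
            self._urls_by_digest[old].discard(url)
        self._digest_by_url[url] = content_md5
        self._urls_by_digest[content_md5].add(url)

    def replicas_of(self, url):
        """Return the other URLs last seen with the same digest as url."""
        digest = self._digest_by_url.get(url)
        if digest is None:
            return set()
        return self._urls_by_digest[digest] - {url}
```

With an index like this, a proxy that already holds a fresh copy of one URL could notice that a newly requested URL reported the same digest, without either server having to advertise the mirror relationship.
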
Received on Friday, 9 August 1996 23:57:54 UTC