- From: Jeffrey Mogul <mogul@pa.dec.com>
- Date: Thu, 22 Aug 96 13:50:12 MDT
- To: "Roy T. Fielding" <fielding@liege.ICS.UCI.EDU>
- Cc: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
> First off, this just isn't an accurate comparison of "safe", which by
> definition is the potential for harm (to people or property) [Leveson's
> paper on Software Safety has the complete definition]. It is absolutely
> and provably false that overestimation is safe. Overestimation creates
> the potential for cache misses where there should be cache hits, and
> cache misses result in additional network requests, which in turn cost
> some people real money and have the potential to completely block an
> already congested network at periods of peak usage (as witnessed with
> the USA-UK link last year). In contrast, there is no direct correlation
> between presenting a stale entity to a user and causing harm. Any
> entity that is marked as cachable will not be dependent on the user
> ALWAYS seeing it when fresh -- servers cannot make such critical
> entities cachable unless they know that there are no older caches in
> the loop, and your rationale is only applicable when there are older
> caches in the loop.

Roy and I have disagreed in public for some time about what is "safe". By now, it's clear that we will never agree, and each of us believes the other is using a completely bogus definition. My definition of "safe", as it applies to caching, is that there must never be a situation where a stale result is presented to a user as if it were fresh. Roy's definition of "safe" appears to be "don't waste any opportunities for caching, because this could cause additional network traffic."

I'd question his use of the term "catastrophic network failure", especially since we've seen ample evidence on this mailing list during the past week or so that most HTTP caches would be hard-pressed to get hit rates above 40% to 50%, even if they were to cache EVERYTHING that didn't have a question mark or "cgi-bin" in the URL.

Since I'm not going to change Roy's mind, I'll limit myself to correcting the factual mistakes in his message, and I won't continue to rehash the "safety" argument.

> But you are completely overlooking what happens if ANY of the
> intermediaries has a system clock which is out-of-sync with the origin.
> If ANY of them do have a bad clock, they will pass the additional bogus
> age on to every single request that passes through the proxy.

The algorithm in section 13.2.3 is specifically designed to handle the "one bad clock" (or even "N bad clocks") case. It does this using the following two steps:

    apparent_age = max(0, response_time - date_value);
    corrected_received_age = max(apparent_age, age_value);

(see the draft for the definitions of the variables). Because of the two max() operations, this will NEVER result in underestimating the Age, as long as at least one of the clocks in the chain is correct. Yes, it may overestimate the Age value if one or more of the clocks is badly out of sync. If you agree with Roy that overestimation is evil, then I suppose the word "bogus" is appropriate here. Otherwise, it's exactly what the algorithm is intended to do.

-Jeff
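A minimal Python sketch of the two corrective steps quoted above, assuming all times are POSIX timestamps in seconds; the function name and the skewed-clock numbers in the examples are illustrative, not taken from the draft:

    # Variable names follow the Age-calculation step quoted above:
    # response_time = local clock when the response arrived,
    # date_value    = the origin's Date header value,
    # age_value     = the Age value received from the previous hop.
    def corrected_received_age(response_time, date_value, age_value):
        # Clamp at zero so an origin clock that runs fast cannot
        # produce a negative apparent age.
        apparent_age = max(0, response_time - date_value)
        # Never trust an incoming Age that is smaller than the locally
        # computed apparent age -- take whichever is larger.
        return max(apparent_age, age_value)

    # An upstream proxy with a clock 300 seconds fast forwards Age: 300
    # for a brand-new response whose Date is accurate: the result is an
    # overestimate (300 rather than ~0), never an underestimate.
    print(corrected_received_age(response_time=1000, date_value=1000, age_value=300))  # 300

    # A proxy with a slow clock forwards Age: 0 for a response that is
    # really 120 seconds old: the Date comparison recovers the age.
    print(corrected_received_age(response_time=1120, date_value=1000, age_value=0))    # 120

Both max() steps can only raise the estimate, which is the property the message relies on: a correct clock anywhere in the chain keeps the Age from being underestimated, at the cost of possible overestimation when a clock is badly skewed.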
Received on Thursday, 22 August 1996 14:03:44 UTC