- From: Jeffrey Mogul <mogul@pa.dec.com>
- Date: Wed, 14 Aug 96 16:50:25 MDT
- To: Koen Holtman <koen@win.tue.nl>
- Cc: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
> However, you have handed out 80*10=800 uses, which gives you 800 hits
> as the upper bound.  So all you know is:
>
>     80 <= actual hits <= 800
>
> This is not what I call useful information.  Something like an
> interesting upper bound would be
>
>     80 <= actual hits <= 100
>
> but I see no way in which max-uses can provide such a bound.
>
> I suspect that max-uses counts higher than 3 will be disastrously
> ineffective at yielding a useful upper bound if uncooperative caches
> are common.  A proxy not being cooperative and only supporting
> max-uses seems about as bad as a proxy not supporting hit counts at
> all.

If I understand your argument, it is that in order to bound the size
of the error in the hit count to lie within a reasonable range, the
max-uses setting would have to be so small that it would effectively
disable caching.

> I'd like to see *actual statistics* disprove my argument

So I got a day's worth of log entries from our proxy.  Here are some
statistics:

    589705  total log entries
    529756  after removing non-HTTP URLs and URLs with "?", "cgi", or "htbin"
    245481  unique "cachable" URLs
    189723  "cachable" URLs referenced only once during the trace
     55758  "cachable" URLs referenced more than once

That's an effective cache hit rate of about 23%, not counting things
that can't be cached, and ignoring any misses that were caused by
modifications to the resources.

Suppose that, for each of the "cachable" URLs referenced more than
once, the origin server sent max-uses=3.

Of the 55758 "cachable" URLs referenced more than once

    28951 (52%) were referenced exactly twice
     9592 (17%) were referenced exactly 3 times

Or in other words, of the 340033 references to "cachable" URLs
referenced more than once

    28951*2 + 9592*3 = 86678
        of these references were to URLs referenced 2 or 3 times
    so 340033 - 86678 = 253355
        of these references were to URLs referenced more than 3 times

Now, assume that the servers had all sent max-uses=3 for these URLs.
Then the first use of each of these URLs (55758 uses) plus every 4th
use of each of the URLs referenced more than 3 times (roughly
253355/4 = 63339 uses) would have to be forwarded to the origin
server.  This means that

    340033 - (63339 + 55758) = 220936

uses would not have to be forwarded to the origin server, which comes
out to about 37% of all the references logged.

Now, it's quite true that not every server insists on demographics
information, and so the actual number of references saved would
presumably be lower.  But this should give some idea of the magnitude
of the possible savings, and I don't think it's insignificant.

-Jeff
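
For anyone who wants to rerun this arithmetic on their own proxy logs, here is a rough Python sketch of the two calculations in the message: the hit-count bounds for a cache that only honors max-uses, and the estimate of references saved with max-uses=3. The function and parameter names are hypothetical, chosen for illustration. The "first use plus every (max-uses+1)th use" rule is the same rough approximation used in the estimate above, not a model of any real cache.

```python
def hit_bounds(origin_fetches, max_uses):
    """Bounds on actual hits when a cache only honors max-uses.

    Each fetch from the origin grants up to max_uses uses and nothing is
    reported back, so the origin only learns a (lower, upper) range.
    """
    return origin_fetches, origin_fetches * max_uses


def estimate_savings(total_entries, multi_urls, multi_refs,
                     urls_by_ref_count, max_uses=3):
    """Estimate references that need not be forwarded to the origin.

    total_entries:     all log entries in the trace
    multi_urls:        cachable URLs referenced more than once
    multi_refs:        references to those URLs
    urls_by_ref_count: {times referenced: number of URLs}, for counts
                       from 2 up to max_uses
    """
    # References to URLs used no more often than max_uses allows.
    low_refs = sum(count * urls
                   for count, urls in urls_by_ref_count.items()
                   if count <= max_uses)
    # Everything else went to URLs referenced more than max_uses times.
    high_refs = multi_refs - low_refs
    # Forwarded: the first use of every URL, plus roughly one in every
    # (max_uses + 1) references to the heavily used URLs.
    forwarded = multi_urls + round(high_refs / (max_uses + 1))
    saved = multi_refs - forwarded
    return {
        "high_refs": high_refs,
        "forwarded": forwarded,
        "saved": saved,
        "saved_fraction": saved / total_entries,
    }


# Figures taken from the trace statistics in the message.
trace = estimate_savings(
    total_entries=589705,
    multi_urls=55758,
    multi_refs=340033,
    urls_by_ref_count={2: 28951, 3: 9592},
)
print(hit_bounds(80, 10))
print(trace)
```

Changing max_uses (and supplying the matching per-count URL totals) gives a quick feel for how the savings trade off against the width of the hit-count bounds.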
Received on Wednesday, 14 August 1996 17:01:02 UTC