- From: Ari Luotonen <luotonen@netscape.com>
- Date: Fri, 29 Dec 1995 14:43:43 -0800 (PST)
- To: koen@win.tue.nl (Koen Holtman)
- Cc: http-caching@pa.dec.com (http-caching mailing list)
>
> Ari Luotonen:
>
> >A feature-request I have bumped into several times just recently, and
> >towards which I'm tempted to incline, is kind of what AFS does:
> >
> >	Have the server (as a server option) choose to tell
> >	the proxy that it is ok to return directly from the
> >	cache without a check for so-and-so long time. If
> >	during that time the object changes, the *server* will
> >	notify the *proxy* about this.
>
> Having explicit revocation is a good thing, but I agree with others
> that it is too complex/unexplored to get it into HTTP 1.1 (which is
> supposed to be a 'fast track' standard).
>
> [...]
> >The theory behind this is that _most_ of the time _most_ objects do
> >_not_ change soon after they get retrieved -- that's why today's
> >proxies perform so well already (Netscape's proxy saves up to 60% in
> >connections and 75% in bandwidth when properly configured and with
> >the critical mass of users using it), even though they rely heavily on
> >heuristics, and there's minimal support for them in the protocol.
>
> 60% savings in connections and 75% savings in bandwidth???? I am a
> bit disturbed by that figure; it is highly atypical.
>
> For 'local client -> non-local server' requests, proxies that I know
> of do not perform that well already, and I doubt they ever
> will. (Unless you are talking about proxies with gigabytes of
> diskspace that serve at least a small country).
>
> The paper `Caching Proxies: Limitations and Potentials' at
> http://www.w3.org/pub/Conferences/WWW4/Papers/155/ concludes:
>
> |1. We confirmed previous data in the literature that a caching proxy
> |   has an upper bound of 30-50% in its hit rate, given an infinite
> |   size cache and an eight day cache purge interval.
>
> I have measured a fairly constant 30-40% hit rate and 30-40% bandwidth
> saving at our proxy cache.
>
> Are you sure that your figures are for a cache that caches _outgoing_
> http requests, i.e. requests made by your local users to WWW servers
> not on your local network?
>
> I have seen higher savings figures reported
>  - for caches that also cache traffic from local browsers to local
>    servers
>  - for caches that cache requests from outside users to local servers
>    (see for example
>    http://www.vuw.ac.nz/~mimi/www/www-caching/caching.html )
> but these figures do not matter much if you want to talk about the
> impact of caches on the size of (non-local) internet traffic.
>
> I'm sorry to be so negative, but I have serious doubts about caching
> schemes in proxies being able to reduce internet traffic by more
> than 50%. Some people have argued that the exponential growth of web
> traffic makes it _necessary_ for caching proxies to reach hit rates of
> at least 95%, but I see no way for caching technology to provide
> such an exponential improvement.
>
> Service authors are continually putting new content on the web. If
> this continual addition of new content did not exist, gigabyte-sized
> caches might get to 95% hit rates, but with new content always being
> added (and accessed), we can never reach 95%.
>
> [Note that I did not say that proxies cannot reduce web traffic by
> more than 50%: using a combination of caching and compression, 75%
> could be reached.]
>
> >Or in other words, the fact is that _most_ of the If-modified-since
> >checks performed by proxies in fact yield 304. We're talking about
> >over 90%; if configured to perform up-to-date checks for every
> >request, that figure comes pretty darn close to 99.9%.
>
> I see no obligation in the protocol to perform up-to-date checks for
> every request, so a configuration that gets 99.9% is completely
> unnecessary. Conditional GETs are only required for resources that
> have expired. In fact, I would consider doing up-to-date checks for
> every single request, if not forced to do so by an Expires header, to
> be extremely rude and wasteful of origin server resources.
>
> On a related note, I recently discovered that the Netscape client
> cache, if configured to `verify document: every time', will indeed do
> a conditional GET for every new request on a resource that lacks an
> Expires header. Eek. I thought that `verify document' applied to
> conditional GETs on expired documents only, so I had enabled this
> option on my Netscape copy.
>
> I am a bit disturbed by Netscape having this cache configuration
> option at all. If only 10% of Netscape users enable it, they will
> cause an enormous increase in the number of conditional GETs
> going over the net.
>
> >So hey -- up-to-date checks are wasteful, too, and in practice all the
> >service providers and most companies that run a proxy configure it so
> >that it does _not_ perform checks during a few hours after the last
> >check.
>
> I hope you are talking about not always performing conditional GETs on
> resources that are not expired here. It would be _very bad_ for a
> cache not to relay conditional GETs on documents that are expired.
>
> Perhaps we should put a note in the protocol about what preferred
> cache behavior is if an Expires header is _absent_.
>
> I conclude that we should first focus on reducing the number of
> _unnecessary_ conditional GETs by giving some guidelines in the 1.1
> protocol. Then, we can talk about replacing some of the necessary
> conditional GETs with an explicit revocation scheme in 1.2.
>
> >Ari Luotonen ari@netscape.com
>
> Koen.
>

Cheers,
--
Ari Luotonen                       ari@netscape.com
Netscape Communications Corp.      http://home.netscape.com/people/ari/
501 East Middlefield Road
Mountain View, CA 94043, USA       Netscape Server Development Team
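
The revalidation policy argued over in the quoted text (a conditional GET
only once a resource has expired, plus some fallback when no Expires header
is present) can be sketched roughly as follows. This is a minimal
illustration under stated assumptions, not the behavior of any particular
proxy or of the HTTP 1.1 draft: CacheEntry, needs_conditional_get and the
three-hour window are hypothetical names and values picked for the example;
the thread itself only says "a few hours".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Assumed heuristic window for entries that carry no Expires header.
# The thread only says "a few hours"; three hours is an arbitrary pick.
HEURISTIC_FRESH_WINDOW = timedelta(hours=3)


@dataclass
class CacheEntry:
    """Hypothetical cache record for one resource."""
    last_validated: datetime            # last full fetch or 304 response
    expires: Optional[datetime] = None  # parsed Expires header, if present


def needs_conditional_get(entry: CacheEntry, now: datetime,
                          window: timedelta = HEURISTIC_FRESH_WINDOW) -> bool:
    """Return True if the cache should send an If-Modified-Since request."""
    if entry.expires is not None:
        # Explicit expiry: revalidate only once the resource has expired.
        return now >= entry.expires
    # No Expires header: fall back to the heuristic window since the
    # last check instead of revalidating on every request.
    return now - entry.last_validated >= window


if __name__ == "__main__":
    t0 = datetime(1995, 12, 29, 12, 0, 0)
    fresh = CacheEntry(last_validated=t0, expires=t0 + timedelta(hours=1))
    no_expiry = CacheEntry(last_validated=t0)
    print(needs_conditional_get(fresh, t0 + timedelta(minutes=30)))
    print(needs_conditional_get(no_expiry, t0 + timedelta(hours=4)))
```

With these assumptions, an entry that is still within its Expires time is
served straight from the cache, while an entry without Expires is
revalidated only after the window has passed since the last check.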
Received on Friday, 29 December 1995 22:56:48 UTC