Re: Explicit revocation from Ari Luotonen on 1995-12-29 (http-caching-historical@w3.org from December 1995)

From: Ari Luotonen <luotonen@netscape.com>
Date: Fri, 29 Dec 1995 14:43:43 -0800 (PST)
To: koen@win.tue.nl (Koen Holtman)
Cc: http-caching@pa.dec.com (http-caching mailing list)
Message-Id: <199512292243.OAA13719@urchin.netscape.com>
> 
> Ari Luotonen:
> >
> >A feature-request I have bumped into several times just recently, and
> >towards which I'm tempted to incline, is kind of what AFS does:
> >
> >        Have the server (as a server option) choose to tell
> >        the proxy that it is ok to return directly from the
> >        cache without a check for so-and-so long time.  If
> >        during that time the object changes, the *server* will
> >        notify the *proxy* about this.
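The AFS-style scheme quoted above could be sketched on the proxy side roughly as follows. This is only an illustration under stated assumptions: the class name LeasedCache, its methods, and the lease_seconds parameter are hypothetical and appear in no HTTP draft, and the notification transport from server to proxy is left out entirely.

```python
import time


class LeasedCache:
    """Hypothetical proxy cache: an entry is served without a check
    until its lease runs out or the origin server revokes it."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so the lease can be tested
        self._entries = {}       # url -> (body, lease_expiry)

    def store(self, url, body, lease_seconds):
        # The server granted "no check needed" for lease_seconds.
        self._entries[url] = (body, self._clock() + lease_seconds)

    def revoke(self, url):
        # Called when the *server* notifies the *proxy* of a change.
        self._entries.pop(url, None)

    def lookup(self, url):
        # Return the cached body, or None if the proxy must go upstream.
        entry = self._entries.get(url)
        if entry is None:
            return None
        body, expiry = entry
        if self._clock() >= expiry:
            del self._entries[url]
            return None
        return body
```

While the lease is valid and no revocation has arrived, lookup() answers from the cache with no traffic to the origin server; after revoke() or lease expiry, the proxy falls back to a normal fetch.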
> 
> Having explicit revocation is a good thing, but I agree with others
> that it is too complex/unexplored to get it into HTTP 1.1 (which is
> supposed to be a 'fast track' standard).
> 
> [...]
> >The theory behind this is that _most_ of the time _most_ objects do
> >_not_ change soon after they get retrieved -- that's why today's
> >proxies perform so well already (Netscape's proxy saves up to 60% in
> >connections and 75% in bandwidth when properly configured and with
> >the critical mass of users using it), even though they rely heavily on
> >heuristics, and there's minimal support for them in the protocol.
> 
> 60% savings in connections and 75% savings in bandwidth????  I am a
> bit disturbed by that figure; it is highly atypical.
> 
> For 'local client -> non-local server' requests, proxies that I know
> of do not perform that well already, and I doubt they ever
> will. (Unless you are talking about proxies with gigabytes of
> diskspace that serve at least a small country).
> 
> The paper `Caching Proxies: Limitations and Potentials' at
> http://www.w3.org/pub/Conferences/WWW4/Papers/155/ concludes:
> 
> |1.  We confirmed previous data in the literature that a caching proxy
> |    has an upper bound of 30-50% in its hit rate, given an infinite
> |    size cache and an eight day cache purge interval.
> 
> I have measured a fairly constant 30-40% hit rate and 30-40% bandwidth
> saving at our proxy cache.
> 
> Are you sure that your figures are for a cache that caches _outgoing_
> http requests, i.e. requests made by your local users to WWW servers
> not on your local network?
> 
> I have seen higher savings figures reported
>  - for caches that also cache traffic from local browsers to local
>    servers
>  - for caches that cache requests from outside users to local servers
>    (see for example
>    http://www.vuw.ac.nz/~mimi/www/www-caching/caching.html )
> but these figures do not matter much if you want to talk about the
> impact of caches on the size of (non-local) internet traffic.
> 
> I'm sorry to be so negative, but I have serious doubts about caching
> schemes in proxies being able to reduce internet traffic by more
> than 50%.  Some people have argued that the exponential growth of web
> traffic makes it _necessary_ for caching proxies to reach hit rates of
> at least 95%, but I see no way in which caching technology can provide
> such an exponential improvement.
> 
> Service authors are continually putting new content on the web.  If
> this continual addition of new content did not exist, gigabyte-sized
> caches might get to 95% hit rates, but with new content always being
> added (and accessed), we can never reach 95%.
> 
> [Note that I did not say that proxies cannot reduce web traffic by
> more than 50%: using a combination of caching and compression, 75%
> could be reached.]
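One way to read the 75% figure in the note above, as a rough sketch: if compression is applied only to the bytes that miss the cache, the two savings multiply. The function name and the 50%/50% inputs are illustrative assumptions; the message gives no compression ratio.

```python
def combined_saving(cache_byte_saving, compression_saving):
    """Fraction of traffic saved when cache misses are also compressed.

    Both arguments are fractions in [0, 1]. Assumes compression applies
    only to bytes that miss the cache (an assumption, not from the message).
    """
    for value in (cache_byte_saving, compression_saving):
        if not 0.0 <= value <= 1.0:
            raise ValueError("savings must be fractions between 0 and 1")
    remaining = (1.0 - cache_byte_saving) * (1.0 - compression_saving)
    return 1.0 - remaining


# Assumed inputs: the cache saves 50% of bytes, and compression halves
# what is left, so 25% of the original traffic remains.
print(combined_saving(0.5, 0.5))
```

With a 35% byte saving from the cache (the middle of the measured 30-40% range), the same model would need compression to remove a bit over 60% of the remaining bytes to reach 75% overall.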
> 
> >Or in other words, the fact is that _most_ of the If-modified-since
> >checks performed by proxies in fact yield 304.  We're talking about
> >over 90%; if configured to perform up-to-date checks for every
> >request, that figure comes pretty darn close to 99.9%.
> 
> I see no obligation in the protocol to perform up-to-date checks for
> every request, so a configuration that gets 99.9% is completely
> unnecessary.  Conditional GETs are only required for resources that
> have expired.  In fact, I would consider doing up-to-date checks for
> every single request, if not forced to do so by an Expires header, to
> be extremely rude and wasteful of origin server resources.
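The rule described in the paragraph above (conditional GETs only for expired resources) might be sketched like this. The function name and the grace parameter are hypothetical; the two-hour default only stands in for the "few hours" proxies are configured with when no Expires header is present, a case the protocol does not currently settle.

```python
from datetime import datetime, timedelta, timezone


def needs_conditional_get(now, expires, last_checked,
                          grace=timedelta(hours=2)):
    """Decide whether a cache should revalidate an entry upstream.

    expires      -- value of the Expires header, or None if absent
    last_checked -- when the entry was fetched or last validated
    grace        -- assumed heuristic window used only when Expires
                    is absent (hypothetical default)
    """
    if expires is not None:
        # Explicit expiry: check only once the resource has expired.
        return now >= expires
    # No Expires header: fall back to the heuristic grace period
    # instead of checking on every single request.
    return now - last_checked >= grace


fetched = datetime(1995, 12, 29, 12, 0, tzinfo=timezone.utc)
later = fetched + timedelta(minutes=30)
print(needs_conditional_get(later, None, fetched))
```

A cache set to verify on every request corresponds to grace=timedelta(0), which sends a conditional GET for each request on a resource that lacks an Expires header.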
> 
> On a related note, I recently discovered that the Netscape client
> cache, if configured to `verify document: every time', will indeed do
> a conditional GET for every new request on a resource that lacks an
> Expires header.  Eek.  I thought that `verify document' applied to
> conditional GETs on expired documents only, so I had enabled this
> option on my Netscape copy.
> 
> I am a bit disturbed by Netscape having this cache configuration
> option at all.  If only 10% of Netscape users enable it,
> they will cause an enormous increase in the number of conditional GETs
> going over the net.
> 
> >So hey -- up-to-date checks are wasteful, too, and in practice all the
> >service providers and most companies that run a proxy configure it so
> >that it does _not_ perform checks during a few hours after the last
> >check.
> 
> I hope you are talking about not always performing conditional GETs on
> resources that are not expired here.  It would be _very bad_ for a
> cache not to relay conditional GETs on documents that are expired.
> 
> Perhaps we should put a note in the protocol about what preferred
> cache behavior is if an Expires header is _absent_.
> 
> I conclude that we should first focus on reducing the number of
> _unnecessary_ conditional GETs by giving some guidelines in the 1.1
> protocol.  Then, we can talk about replacing some of the necessary
> conditional GETs with an explicit revocation scheme in 1.2.
> 
> >Ari Luotonen                            ari@netscape.com
> 
> Koen.
> 

Cheers,
--
Ari Luotonen				ari@netscape.com
Netscape Communications Corp.		http://home.netscape.com/people/ari/
501 East Middlefield Road
Mountain View, CA 94043, USA		Netscape Server Development Team
Received on Friday, 29 December 1995 22:56:48 UTC