- From: Jeffrey Mogul <mogul@pa.dec.com>
- Date: Fri, 20 Sep 96 10:30:56 MDT
- To: Peter Ball <pball@hab-software.de>
- Cc: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
> I recently read the (August 14) draft on HTTP/1.1 and I thought that
> I would give my comments:

These comments are somewhat late, since we have already spent 6 months
debating these points. I'll try to explain why things are the way they
are.

> 1. The second paragraph of 13.9 is unclear in its meaning. Do you
> mean that a URI with a ? should never be cached, or only when it
> comes from an HTTP/1.0 server? Under HTTP/1.1 I think it should be
> cached (and under 1.0 it should have been too, but of course that
> can't be changed anymore), because it's up to the programmer of the
> CGI script to provide cache directives preventing caching if calling
> the script produces side effects. I am currently writing some CGI
> programs that always return the same thing, because the QUERY_STRING
> parameter selects info from a database, and such a response is
> cachable. Because it is not cached by default, I need to give an
> Expires header to enable caching, but I think in HTTP/1.1 this
> shouldn't be necessary.

In general, it is not possible for a cache to be sure that a response
was generated by an HTTP/1.1 server. Therefore, it is not always safe
for a cache to retain a response to a GET on a "?" URL, unless the
origin server has provided explicit notification that such a response
is cachable (by providing an explicit expiration time). If you can't
provide an expiration time, then your response probably shouldn't be
cached anyway.

We rejected the approach of allowing caching of "?" responses unless
specifically prevented, because in the ambiguous cases the use of
caching is likely to lead to wrong results. Numerous people reported
operational problems with HTTP/1.0 caches for precisely this reason.

> 2. This is really only a thought that occurred to me when I read the
> draft. It seems that the cache control is rather complex. Could it
> be possible that the extra burden of all this cache management could
> make proxies ineffective?
> Even now, my current experience with proxies is that they seem to be
> overloaded most of the time, and I generally get a better and more
> reliable connection when I remove them from my list of proxies in my
> browser. Could the more complex cache control further burden the
> proxy computers?

Two points:

(1) The run-time cost of implementing the Cache-Control directives is
likely to be minimal compared to the cost of storage management in a
proxy cache.

(2) The lack of the ability to carefully manage caching has
discouraged the use of caches, and so has prevented us from removing
load from the Internet.

Many caches are not optimally implemented. There have been a number of
recent research papers published on this topic. If your caches are
working badly, the likely reason is not the complexity of the HTTP
protocol.

-Jeff
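[Editor's note: the conservative rule discussed above — a cache may keep a response to a GET on a "?" URL only when the origin server supplied an explicit expiration time — can be sketched as follows. This is an illustrative sketch, not text from the message; the function name, the lowercased-header dictionary, and the max-age check are assumptions for the example.]

```python
def may_cache_query_response(method: str, url: str, headers: dict) -> bool:
    """Return True if a cache may store this response.

    `headers` maps lowercased response header names to values.
    Conservative rule from section 13.9 of the draft: a response to a
    GET on a URL containing "?" is not cached unless the origin server
    gave explicit freshness information -- an Expires header, or (in
    HTTP/1.1) a Cache-Control max-age directive.
    """
    if method != "GET":
        return False
    if "?" not in url:
        return True  # ordinary resources: normal caching rules apply
    # "?" URL: require explicit expiration information from the origin.
    if "expires" in headers:
        return True
    cc = headers.get("cache-control", "")
    return "max-age=" in cc.replace(" ", "")


# Peter's database-backed CGI script opts in by sending Expires:
print(may_cache_query_response(
    "GET", "/cgi-bin/lookup?id=42",
    {"expires": "Sat, 21 Sep 1996 10:30:56 GMT"}))   # True
# Without it, the "?" response is not retained:
print(may_cache_query_response("GET", "/cgi-bin/lookup?id=42", {}))  # False
```

This is why the default falls the way it does: the absence of a header can never distinguish "the server chose not to say" from "the server predates the directive," so only an explicit opt-in is safe.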
Received on Friday, 20 September 1996 10:47:06 UTC