Re: Issues-list item "CACHING-CGI" from David W. Morris on 1997-07-30 (ietf-http-wg@w3.org from July to September 1997)

From: David W. Morris <dwm@xpasc.com>
Date: Tue, 29 Jul 1997 23:23:12 -0700 (PDT)
To: "Roy T. Fielding" <fielding@kiwi.ics.uci.edu>
Cc: Jim Gettys <jg@pa.dec.com>, http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com, http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
Message-Id: <Pine.GSO.3.96.970729231830.11299E-100000@shell1.aimnet.com>

On Tue, 29 Jul 1997, Roy T. Fielding wrote:

> >Here's a revised version, to replace the second paragraph
> >in section 13.9:
> >
> >	Some HTTP/1.0 cache operators have found that it is dangerous
> >	to cache and reuse without revalidation responses to requests
> >	for URLs that include any of the strings "cgi-bin", "htbin", or
> >	"?", because applications have traditionally used these URLs in
> >	conjunction with operations with significant side effects for
> >	GET or HEAD methods.  However, if such a response includes an
> >	explicit, future, expiration time, then this implies that the
> >	response may be cached and reused without revalidation until it
> >	expires.  If such a response includes a Last-Modified or Etag
> >	header, this implies that the response may be reused after
> >	revalidation (or without revalidation if explicitly fresh).
> >
> >	A cache MUST NOT assign a heuristic expiration time to a
> >	response for a URL that includes the strings "htbin", "cgi-bin", or
> >	"?" in its rel_path part.  If such a response does not 
> >	carry an explicit expiration time, it must be treated as
> >	if it expires immediately.
> 
> I'm pretty sure I said this before, but I don't know what list.
> I am completely opposed to this change.  It is inaccurate to say that
> caching and reusing such responses is "dangerous".  The *only* reason
> *some* caches do not provide heuristic caching of such responses is
> because the presence of query-based parameters makes it unlikely to get
> a second "hit" on the cache, and because the absence of a Last-Modified
> (and now Etag) makes it impossible to do an efficient update.  In any case,
> this is an optimization which is dependent on the context and number of
> users of the cache, and not a requirement of the protocol.
> 
> The protocol already provides mechanisms for marking a response as
> non-cachable.  All other responses to a GET request are cachable.

I can't speak for the motivation of old cache authors, but I can speak as
an HTTP/1.0 application author from before any RFCs, when one had to
reverse engineer everything. The empirical behavior I observed was that
responses to GET requests which included a query part were not cached.

I support handling HTTP/1.1 responses in strict conformance with Roy's
position, but I believe something like Jeff's proposed wording is
necessary when the 1.1 cache is covering a 1.0 server.
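
To make the combined position concrete, here is a rough sketch in Python
of how a cache might decide whether a heuristic expiration time is
allowed. It is only an illustration under stated assumptions: the
function name, the (major, minor) version tuple, and the idea that the
caller consults it only when the response carries no explicit expiration
time are all hypothetical, not taken from the draft or from any
existing cache.

```python
from urllib.parse import urlsplit

# Path substrings named in the proposed wording for section 13.9.
SUSPECT_PATH_STRINGS = ("cgi-bin", "htbin")


def heuristic_expiration_allowed(url, response_version):
    """Return True if a cache may assign a heuristic expiration time.

    Hypothetical helper. `response_version` is a (major, minor) tuple
    for the HTTP version of the response, e.g. (1, 0). The caller is
    assumed to use this only when no explicit expiration time is given.
    """
    if response_version >= (1, 1):
        # HTTP/1.1 responses: the protocol's own mechanisms mark a
        # response as non-cachable, so the URL is not inspected.
        return True

    # HTTP/1.0 responses: apply the proposed restriction to the part
    # of the URL after the host (path, params, and query).
    path = urlsplit(url).path
    if "?" in url:
        return False
    return not any(s in path for s in SUSPECT_PATH_STRINGS)
```

For example, heuristic_expiration_allowed("http://example.com/search?q=1",
(1, 0)) is False, so that response would be treated as if it expires
immediately, while the same URL with version (1, 1) gives True.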

Dave Morris

Received on Tuesday, 29 July 1997 23:28:44 UTC