Re: Issues-list item "CACHING-CGI" from Roy T. Fielding on 1997-08-09 (ietf-http-wg@w3.org from July to September 1997)

From: Roy T. Fielding <fielding@kiwi.ics.uci.edu>
Date: Sat, 09 Aug 1997 02:07:57 -0700
To: "David W. Morris" <dwm@xpasc.com>
Cc: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
Message-Id: <9708090227.aa09342@paris.ics.uci.edu>

>> >	A cache MUST NOT assign a heuristic expiration time to a
>> >	response for a URL that includes the strings "htbin", "cgi-bin", or
>> >	"?" in its rel_path part.  If such a response does not 
>> >	carry an explicit expiration time, it must be treated as
>> >	if it expires immediately.
>> 
>> I'm pretty sure I said this before, but I don't know what list.
>> I am completely opposed to this change.  It is inaccurate to say that
>> caching and reusing such responses is "dangerous".  The *only* reason
>> *some* caches do not provide heuristic caching of such responses is
>> because the presence of query-based parameters make it unlikely to get
>> a second "hit" on the cache, and because the the absence of a Last-Modified
>> (and now Etag) makes it impossible to do an efficient update.  In any case,
>> this is an optimization which is dependent on the context and number of
>> users of the cache, and not a requirement of the protocol.
>> 
>> The protocol already provides mechanisms for marking a response as
>> non-cachable.  All other responses to a GET request are cachable.
>
>I can't speak for the motivation of old cache authors, but I can speak as
>an HTTP/1.0 application author from before any RFCs when one had to
>reverse engineer everything and the empirical behavior I observed was that
>GET requests which included a query part were not cached.

In most cases, yes, but the relevant question is why they are not cached.
The reason given by all of the cache maintainers I talked to at the
Geneva, Chicago, Boston, Paris, and Santa Clara WWW conferences is that
attempts to cache those responses led to poor hit frequency on the cache.
In other words, it is a performance optimization and does not need to
be written into the protocol.

>I support the behavior for handling HTTP/1.1 responses strictly 
>conforming to Roy's position but I believe somthing like Jeff's
>proposed wording is necessary when the 1.1 cache is covering a 1.0
>server.

Why?  Why is it "necessary" to force a cachable response to be non-cachable?
HTTP/1.0 had its own mechanisms for indicating cachability, and the presence
or absence of "?" or "cgi-bin" or "htbin" does not indicate anything.
It is a fundamentally bad idea to associate any URL pattern with a particular
HTTP functionality, because it breaks the separation between those two
protocols.

....Roy

Received on Saturday, 9 August 1997 02:32:04 UTC