Re: What can you cache? [was: Byte ranges -- formal spec proposal ]

In message <199505182135.OAA10111@neon.netscape.com>, Ari Luotonen writes:
>
>> Why not cache CGI-bin responses?
>
>Most of the time CGI responses are entirely dynamic,

What do you mean by "entirely dynamic"? The Last-Modified: of the
output is the latest last-modified of the inputs, plus the
last-modified of all the software that goes into computing the
response. The Makefile model comes to mind.

If you've got some CGI-bin script that returns the current
temperature, then yes: its output is instantaneously out of date.
But if you're querying a big index, you can cache results until you
rebuild the index, no?
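
To make the Makefile analogy concrete, here is a rough sketch in Python. The function names and the idea of passing the script itself in the list of paths are my own assumptions for illustration, not anything a server does today: the output's Last-Modified is simply the newest modification time among the inputs and the software.

```python
import os
from email.utils import formatdate

def last_modified(paths):
    """Newest mtime (seconds since the epoch) among the inputs and the
    software that computes the response, as in a Makefile dependency check."""
    return max(os.path.getmtime(p) for p in paths)

def last_modified_header(paths):
    # HTTP dates are expressed in GMT; usegmt=True gives the RFC 1123 form.
    return "Last-Modified: " + formatdate(last_modified(paths), usegmt=True)
```

For the index query above, paths would be the index file plus the script, so the header only changes when the index is rebuilt or the script is updated.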

> and not only
>return a different document each time they're called, but also have
>intended side effects.

Bzzzt. GET requests are specified to be idempotent. No visible
side-effects allowed.

PUT requests must never be cached, as they may have such side-effects.

>  In the earliest stages the CERN server used to
>cache also CGI responses, but that was a mistake and I changed it so
>that only documents with either Expires: and/or Last-modified: header
>can be cached.  This is the way both CERN and Netscape proxies do it,
>and anything smarter will cause problems.

Anything smarter _may_ cause problems (OK... Murphy's law and
all...). So what you've done is safe.

>  So, CGI scripts may
>explicitly allow caching by giving at least one of those headers,
>indicating a non-zero lifetime (L-M less than current time; Expires
>greater than current time).

Exactly. They could also do If-Modified-Since calculations and return
"304 Not Modified" if their inputs haven't changed.
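
A minimal sketch of both halves, assuming Python and hypothetical function names (is_cacheable, status_for) of my own: the proxy-side rule as described above (GET only, with L-M in the past or Expires in the future), and the script-side If-Modified-Since check. It does not handle malformed dates.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def is_cacheable(method, headers, now):
    """Cache only GET responses whose headers indicate a non-zero lifetime."""
    if method != "GET":
        return False
    lm = headers.get("Last-Modified")
    exp = headers.get("Expires")
    if lm is not None and parsedate_to_datetime(lm) < now:
        return True   # L-M less than current time
    if exp is not None and parsedate_to_datetime(exp) > now:
        return True   # Expires greater than current time
    return False

def status_for(if_modified_since, last_modified):
    """Return 304 if the inputs are no newer than the client's copy, else 200.

    last_modified is a timezone-aware datetime, e.g. the newest mtime of the
    script's inputs; if_modified_since is the raw header value or None.
    """
    if if_modified_since is None:
        return 200
    since = parsedate_to_datetime(if_modified_since)
    return 304 if last_modified <= since else 200
```

Both now and last_modified are assumed to be timezone-aware UTC datetimes, so they compare cleanly against parsed HTTP dates.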


>> I believe some proxies (and some clients) cache more aggressively than
>> this. For example, I heard that the hensa cache doesn't bother with
>> the If-Modified-Since request unless the cache entry is 12 hours or
>> 10% of the lifetime of the document (current time - last-modified).
>
>This is not entirely correct.  HENSA doesn't cache CGI responses
>unless they have the Exp/L-M header, so this 12hr/10% rule applies to
>only static documents.

Glad to hear that.

>  In general, if you want to get massive savings
>from running the proxy, both in response time and bandwidth, you have
>to use such settings.  In general, min{12hr,10%} is a very safe
>setting.

"Very safe" is still not 100% reliable. It's heuristic. That's why I'd
like clients to be able to say "It's OK with me if you return
documents that are up to 12 hours out of date. I'll trade authenticity
for latency." Just a generalization of the Pragma: no-cache mechanism.
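
Roughly what I have in mind, as a Python sketch. The client_max_stale parameter is hypothetical (no such request field exists today), and I am reading the HENSA rule as min{12hr, 10% of the document's age when fetched}:

```python
from datetime import timedelta

def fresh_for(age_at_fetch, cap=timedelta(hours=12), fraction=0.10):
    """Heuristic lifetime: the smaller of the cap and a fraction of the
    document's age (fetch time minus Last-Modified)."""
    return min(cap, age_at_fetch * fraction)

def needs_revalidation(time_in_cache, age_at_fetch, client_max_stale=None):
    """Decide whether the proxy should send If-Modified-Since.

    time_in_cache is the time since the entry was last fetched or validated.
    """
    if client_max_stale is not None:
        # The client has stated its own tolerance, so the proxy need not guess.
        return time_in_cache > client_max_stale
    return time_in_cache > fresh_for(age_at_fetch)
```

With no stated tolerance the proxy falls back on the heuristic; a client that passes timedelta(0) gets the equivalent of no-cache, which is why I call this a generalization.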

Dan

Received on Thursday, 18 May 1995 15:00:41 UTC