Still trying to make sense of HTTP caching model

Roy is probably working on something like this for the 1.1 draft, but
I figured I would take a stab at it anyway, since there seems to be
some confusion.

I think we need to make it clear that there are two quite separate
issues in HTTP caching: (A) what gets put into the cache, and (B) when
may the cache be used to satisfy a request.  It is a mistake to try to
solve these two issues with a single mechanism.

(A) What gets put into the cache:

The server decides whether an object can be put into the
cache, and with what lifetime.  (The cache manager can
further restrict the server's decisions, but should never
be more permissive.)

There are three things a server could tell the cache:
   1.  Do not cache this object
   2.  Cache this object but always validate before using
       (i.e., "Expires: 1 January 1900")
   3.  Cache and use without validation until Expiration;
       after that time, keep the entry but validate before using
A server could explicitly set the Expiration date to "way in the
future".  That might be a bad idea, but the cache manager can always
set a shorter expiration.  (Caches are like that!)  If a server
fails to send an expiration date, current practice suggests that
this be treated as case #3 (or else most caches would become useless),
but I believe that the protocol specification should recommend that
a cache set a relatively short implicit Expiration time (say, 1 day
in the future).
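
A minimal sketch of the three cases above, in Python.  The names
(Policy, Decision, admit, manager_max_lifetime) are hypothetical and
not taken from any draft; the one-day default is the implicit lifetime
suggested above.  The cache manager's limit can only shorten the
server's lifetime, never lengthen it.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Policy(Enum):
    NO_CACHE = 1         # case 1: do not cache this object
    ALWAYS_VALIDATE = 2  # case 2: cache, but validate before every use
    FRESH_UNTIL = 3      # case 3: use without validation until expiration


# Suggested implicit lifetime when the server sends no expiration: one day.
IMPLICIT_LIFETIME = 24 * 60 * 60  # seconds


@dataclass
class Decision:
    policy: Policy
    expires_at: Optional[float] = None  # absolute seconds; FRESH_UNTIL only


def admit(server_policy: Policy,
          server_expires_at: Optional[float],
          now: float,
          manager_max_lifetime: Optional[float] = None) -> Decision:
    """Decide how a response enters the cache (hypothetical helper)."""
    if server_policy is not Policy.FRESH_UNTIL:
        # Cases 1 and 2 carry no usable lifetime.
        return Decision(server_policy)

    # Case 3: a missing expiration gets the short implicit lifetime.
    if server_expires_at is None:
        expires_at = now + IMPLICIT_LIFETIME
    else:
        expires_at = server_expires_at

    # The cache manager may restrict the server's choice, never extend it.
    if manager_max_lifetime is not None:
        expires_at = min(expires_at, now + manager_max_lifetime)

    return Decision(Policy.FRESH_UNTIL, expires_at)
```

With this shape, a "way in the future" expiration from the server is
harmless as long as the cache manager has set manager_max_lifetime.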

The protocol spec might also need to state which HTTP responses
(or method/response pairs) can cause things to be entered into or
removed from a cache.  That is, it might make sense for certain
responses to carry Expiration and validation information but not for
them to create new cache entries; I haven't thought this through.

(B) When may the cache be used to satisfy a request:

Presumably, this is simply a list of methods that can be satisfied
from a cache, if the cache entry is valid (or non-expired).  Since
we may expect new methods to be added to HTTP, this list should include
a rationale for why certain methods are included or excluded.

This may also include a description of a way for the cache to return
a response, appropriately labelled as "suspicious", if the rules
require it to validate a cache entry but the server does not respond.
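
One way part (B) could look in code, as a rough sketch only.  The
method list (GET and HEAD) is an assumption made for illustration,
since the actual list and its rationale are left open above; Entry,
serve_from_cache, and the "fresh" / "validated" / "suspicious" labels
are likewise hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Assumed list, for illustration only; the text leaves the real list open.
CACHE_SATISFIABLE_METHODS = {"GET", "HEAD"}


@dataclass
class Entry:
    body: bytes
    always_validate: bool  # case 2 from part (A)
    expires_at: float      # absolute time in seconds


def serve_from_cache(method: str,
                     entry: Optional[Entry],
                     now: float,
                     revalidate: Callable[[Entry], bool]
                     ) -> Optional[Tuple[bytes, str]]:
    """Return (body, label), or None if the request must go to the server.

    `revalidate` asks the origin server whether the entry is still good;
    it is expected to raise ConnectionError when the server does not respond.
    """
    if method not in CACHE_SATISFIABLE_METHODS or entry is None:
        return None

    # Non-expired entry that does not demand validation: use it directly.
    if not entry.always_validate and now < entry.expires_at:
        return entry.body, "fresh"

    try:
        unchanged = revalidate(entry)
    except ConnectionError:
        # Rules require validation but the server is silent:
        # answer anyway, labelled so the client knows.
        return entry.body, "suspicious"

    if unchanged:
        return entry.body, "validated"
    return None  # entry is out of date; fetch from the server
```

Note that part (A) state (always_validate, expires_at) is only read
here, never set, which keeps the two issues separate.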

Note that this model does not mention time in the discussion of
how to validate a cache entry that needs validation; time is only
of interest for deciding on Expiration.  "If-modified-since" is evil.
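
To illustrate validation that never looks at a clock, here is a small
sketch in which the server hands out an opaque validator and the cache
simply presents it again later.  Using a content hash as the validator
is an assumption made for this example; the model above does not say
how validators are produced, and make_validator / is_still_valid are
hypothetical names.

```python
import hashlib


def make_validator(body: bytes) -> str:
    """Server side: derive an opaque validator (assumed: a content hash)."""
    return hashlib.sha256(body).hexdigest()


def is_still_valid(cached_validator: str, current_body: bytes) -> bool:
    """Server side: compare the validator the cache presents with the
    validator of the current object.  No timestamps are involved."""
    return cached_validator == make_validator(current_body)
```

The cache treats the validator as an uninterpreted string; only the
server knows how it was made.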

I'm also avoiding the use of the words "idempotent" and "cachable"
when discussing *requests*; responses may be "cachable", not requests.
The concept of idempotency does not seem to be either applicable or
necessary, if the right distinctions are made.  (Note that getting
true idempotency is tricky; races between multiple clients can make
it almost impossible.)

-Jeff

P.S.: By the way, "caching" is spelled thus in American English, as is
"cachable." I went through this fight with my thesis advisor a decade
ago, and found American dictionaries to support "caching" but not
"cacheing." The OED, however, apparently insists on "routeing" rather
than "routing", and I have not gotten around to checking my OED on
"caching".

Received on Tuesday, 5 September 1995 16:47:52 UTC