- From: Jeffrey Mogul <mogul@pa.dec.com>
- Date: Thu, 28 Dec 95 13:26:06 PST
- To: "David W. Morris" <dwm@shell.portal.com>
- Cc: HTTP Caching Subgroup <http-caching@pa.dec.com>
I don't know that the protocol needs to care, but from my
experience with many forms of documents over the years I believe we
should acknowledge that the most important issue with respect to
'staleness' of a document is the impact on the receiver of
incorrect content.
As servers get more sophisticated in their document management
model, expiration will become a more meaningful concept. Some
documents must be current. For many documents, expiration is not a
hard date but rather a general notion, something like: we know the
minimum review cycle for an updated personnel standard is X days.
Hence, at any given point in time the smart server could report an
expiration of NOW+X days unless the document is marked as under
review.
Just to clarify things: we have been using the term "expiration"
to refer to two somewhat different things: the expiration of a
document (or other object), and the expiration of cached copies
of a document/object.
For example, a server may know for sure that a document expires on June
1, 1999. But it may want to limit the unvalidated lifetime of a cached
copy handed out at any given point before then to 12 hours, on the off
chance that the person who wrote that document accidentally included a
libelous comment and may want to withdraw it sooner. (I'm
anthropomorphizing "server" to include its hardware, its software,
and its meatware [human administrators].)
I've been thinking all along about the latter meaning (cached-copy
expiration), not the former. To me it makes sense that the
Expires: date handed out by a server should be the minimum of
the two kinds of "expiration", if both are specified. Document
expirations are likely to be fixed dates; cached-copy expirations
are likely to be offsets from the generation of a response.
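A minimal sketch of that minimum rule, in Python. The function name and its parameters are hypothetical and chosen only for illustration; the dates reuse the June 1, 1999 and 12-hour figures from the example above, and the assumption is that either kind of expiration may be absent.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime
from typing import Optional


def expires_for_response(
    now: datetime,
    document_expiry: Optional[datetime] = None,
    max_cached_lifetime: Optional[timedelta] = None,
) -> Optional[datetime]:
    """Return the earlier of the two expirations, or None if neither is set.

    document_expiry is a fixed date; max_cached_lifetime is an offset from
    the moment the response is generated (both names are hypothetical).
    """
    candidates = []
    if document_expiry is not None:
        candidates.append(document_expiry)
    if max_cached_lifetime is not None:
        candidates.append(now + max_cached_lifetime)
    return min(candidates) if candidates else None


# The document itself is good until June 1, 1999, but the server only
# wants unvalidated cached copies to live for 12 hours.
now = datetime(1995, 12, 28, 21, 26, 6, tzinfo=timezone.utc)
expires = expires_for_response(
    now,
    document_expiry=datetime(1999, 6, 1, tzinfo=timezone.utc),
    max_cached_lifetime=timedelta(hours=12),
)
print("Expires:", format_datetime(expires, usegmt=True))
```

Here the 12-hour offset wins, so the header carries a time 12 hours after the response was generated rather than the 1999 date.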
Basically, from the protocol perspective, a well formed expiration
model should expect expiration to change without any other change
to the document.
True. This implies that an interaction with the server that results
in a cache update (even one as simple as marking the copy "still
valid") should return a new Expires: header, so that if the expiration
time has been revised to be earlier, this is seen by the cache.
Cached-copy expiration times are dynamic values, not static ones.
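On the cache side, this might look roughly like the following Python sketch. CacheEntry, apply_validation, and is_fresh are hypothetical names, not part of any HTTP implementation, and keeping the old value when a reply carries no Expires: at all is an assumption of the sketch, not something settled here.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class CacheEntry:
    body: bytes
    expires: datetime


def apply_validation(entry: CacheEntry, new_expires: Optional[datetime]) -> CacheEntry:
    """Record a "still valid" reply from the server.

    The cached body is kept. The stored expiration is overwritten with
    whatever the server sent this time, even if it is earlier than before.
    Assumption: if the reply has no Expires:, the old value is kept.
    """
    if new_expires is not None:
        entry.expires = new_expires
    return entry


def is_fresh(entry: CacheEntry, now: datetime) -> bool:
    """A copy may be used without validation only before its expiration."""
    return now < entry.expires


# The copy was first handed out with a far-off expiration ...
entry = CacheEntry(body=b"...", expires=datetime(1999, 6, 1, tzinfo=timezone.utc))
# ... but a later validation returns an earlier one, which must replace it.
apply_validation(entry, datetime(1995, 12, 29, 9, 0, 0, tzinfo=timezone.utc))
print(is_fresh(entry, datetime(1996, 1, 1, tzinfo=timezone.utc)))
```

The point of the sketch is only that the expiration is replaced, not merged with a max(): an earlier value from the server has to be able to shorten the copy's lifetime.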
If we look beyond HTTP 1.1 into the future, we must recognize that
HTTP caching (client, proxy, mirror) is a pretty primitive form of
distributed database. There has been research and development for
years in that problem domain and not all distributed database
models insist on exact copies. As I look forward, I would expect
that caching systems would notify the 'owner' of intent to cache.
In that world, expirations can be safely set for long intervals
because the 'owner' can notify caches of changes. The cache can
then decide to simply purge the data, pre-fetch frequently
referenced data, etc.
I think you are touching on the problem of "revocation." This
seems to require that the origin-server is aware of all of the places
where a cached copy might exist. It's not sufficient for the
server to simply know about the last-hop proxy, since another cached
copy could also exist closer to the client, and the client might
have switched proxies by the time that revocation is needed.
And it also requires some sort of call-back mechanism, which in
turn may require algorithms for dealing with crash recovery and
transient network partitions. All of which makes it highly unlikely
that we could address these in the context of HTTP 1.1, I think.
-Jeff
Received on Thursday, 28 December 1995 21:32:40 UTC