- From: Jeffrey Mogul <mogul@pa.dec.com>
- Date: Thu, 28 Dec 95 13:26:06 PST
- To: "David W. Morris" <dwm@shell.portal.com>
- Cc: HTTP Caching Subgroup <http-caching@pa.dec.com>
> I don't know that the protocol needs to care, but from my experience with many forms of documents over the years I believe we should acknowledge that the most important issue with respect to 'staleness' of a document is the impact on the receiver of incorrect content. As servers get more sophisticated in their document management model, expiration will become a more meaningful concept. Some documents must be current. For many documents, expiration is not a hard date but rather a general notion, something like: we know the minimum review cycle for an updated personnel standard is X days. Hence, at any given point in time the smart server could report an expiration of NOW+X days unless the document is marked as under review.

Just to clarify things: we have been using the term "expiration" to refer to two somewhat different things: the expiration of a document (or other object), and the expiration of cached copies of a document/object.

For example, a server may know for sure that a document expires on June 1, 1999. But it may want to limit the unvalidated lifetime of a cached copy handed out at any given point before then to 12 hours, on the off chance that the person who wrote that document accidentally included a libelous comment and may want to withdraw it sooner. (I'm anthropomorphizing "server" to include its hardware, its software, and its meatware [human administrators].)

I've been thinking all along about the latter meaning (cached-copy expiration), not the former. To me it makes sense that the Expires: date handed out by a server should be the minimum of the two kinds of "expiration", if both are specified. Document expirations are likely to be fixed dates; cached-copy expirations are likely to be offsets from the generation of a response.

> Basically, from the protocol perspective, a well-formed expiration model should expect expiration to change without any other change to the document.

True.
This implies that an interaction with the server that results in a cache update (even one as simple as marking the copy "still valid") should return a new Expires: header, so that if the expiration time has been revised to be earlier, this is seen by the cache. Cached-copy expiration times are dynamic values, not static ones.

> If we look beyond HTTP 1.1 into the future, we must recognize that HTTP caching (client, proxy, mirror) is a form of pretty primitive distributed data base. There has been research and development for years in that problem domain and not all distributed data base models insist on exact copies. As I look forward, I would expect that caching systems would notify the 'owner' of intent to cache. In that world, expirations can be safely set for long intervals because the 'owner' can notify caches of changes. The cache can then decide to simply purge the data, pre-fetch frequently referenced data, etc.

I think you are touching on the problem of "revocation." This seems to require that the origin-server is aware of all of the places where a cached copy might exist. It's not sufficient for the server to simply know about the last-hop proxy, since another cached copy could also exist closer to the client, and the client might have switched proxies by the time that revocation is needed. And it also requires some sort of call-back mechanism, which in turn may require algorithms for dealing with crash recovery and transient network partitions.

All of which makes it highly unlikely that we could address these in the context of HTTP 1.1, I think.

-Jeff
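
Illustrative sketch (not part of any HTTP specification): the "minimum of the two kinds of expiration" rule discussed above could be computed roughly as follows. The function name and parameters are hypothetical, chosen only for this example. It assumes the server knows an optional fixed document expiry and an optional maximum unvalidated lifetime for cached copies, and that it recomputes the value for every response, including a "still valid" reply.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def expires_for_response(
    now: datetime,
    document_expiry: Optional[datetime],
    max_cached_lifetime: Optional[timedelta],
) -> Optional[datetime]:
    """Return the Expires: value for a response generated at `now`.

    document_expiry     -- fixed date on which the document itself expires
    max_cached_lifetime -- offset limiting how long a cached copy may be
                           used without revalidation
    Both names are assumptions made for this sketch.
    """
    candidates = []
    if document_expiry is not None:
        candidates.append(document_expiry)
    if max_cached_lifetime is not None:
        # Cached-copy expiration is an offset from response generation.
        candidates.append(now + max_cached_lifetime)
    # If both kinds are specified, hand out the earlier one.
    return min(candidates) if candidates else None


# The example from the message: the document expires June 1, 1999,
# but cached copies should not go unvalidated for more than 12 hours.
doc_expiry = datetime(1999, 6, 1, tzinfo=timezone.utc)
limit = timedelta(hours=12)

early = datetime(1995, 12, 28, 21, 0, tzinfo=timezone.utc)
late = datetime(1999, 5, 31, 18, 0, tzinfo=timezone.utc)

print(expires_for_response(early, doc_expiry, limit))  # 12 hours after `early`
print(expires_for_response(late, doc_expiry, limit))   # capped at doc_expiry
```

Because the function is evaluated on each interaction, a revalidation naturally yields a fresh Expires: value, which is how an expiration revised to be earlier would reach the cache.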
Received on Thursday, 28 December 1995 21:32:40 UTC