Re: Location Proposals from Shel Kaphan on 1995-08-31 (ietf-http-wg@w3.org from July to September 1995)

From: Shel Kaphan <sjk@amazon.com>
Date: Thu, 31 Aug 1995 16:21:02 -0700
To: Koen Holtman <koen@win.tue.nl>
Cc: Shel Kaphan <sjk@amazon.com>, http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
Message-Id: <199508312321.QAA00042@bert.amazon.com>

Koen Holtman writes:
 > Shel Kaphan:
	...
 > >The case of the cache that goes down for a while, and comes up holding
 > >now-invalidated copies of things without knowing it, seems to apply
 > >more generally than to just this case, however.
 > 
 > Under the current spec, going down and coming up without clearing the
 > cache database is safe (assuming a reasonable implementation that
 > keeps absolute time stamps in the database, and always checks these
 > time stamps before serving a response from cache).
 > 

Here's the weak spot, which is somewhat short of an error condition:
A cache holds a document that doesn't expire any time soon.
At time T1 - 1 hour, the cache goes down for 2 hours.
At time T1, a client requests the document, and receives
it from the origin server.  Say the document was updated since the last
copy was put in the cache, even though the expiration date had not
been reached. The cache, being down, never sees the newer copy.
At time T1 + 1 hour, the client re-requests the document, the cache is
now up, and so the cache delivers the older copy.  Now, since the
cached document had not *expired*, you could say this is fine, and I don't
have any idea how this could be avoided anyhow.  But the user could easily
notice that they were now seeing an older copy of a document they once
had a newer copy of. 
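
The timeline above can be replayed in a small simulation.  This is only
a sketch under stated assumptions: the Cache class, the 24-hour
lifetime, the "/doc" URI and the hour-based clock are all hypothetical
and do not come from any spec or implementation discussed here.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    body: str
    expires: float  # absolute expiry time, in hours on a hypothetical clock


class Cache:
    """Toy cache that checks absolute timestamps before serving (assumed design)."""

    def __init__(self):
        self.store = {}
        self.up = True

    def get(self, uri, now, origin):
        if not self.up:
            # Cache is down: the request goes straight to the origin,
            # and the cache learns nothing about the response.
            return origin[uri]
        entry = self.store.get(uri)
        if entry is not None and now < entry.expires:
            return entry.body  # unexpired, so served without asking the origin
        body = origin[uri]
        self.store[uri] = Entry(body, now + 24)  # hypothetical 24-hour lifetime
        return body


def run_scenario():
    t1 = 10.0
    origin = {"/doc": "v1"}
    cache = Cache()
    seen = [cache.get("/doc", 0.0, origin)]          # v1 is cached, expires at 24
    cache.up = False                                  # down from T1 - 1 hour
    origin["/doc"] = "v2"                             # document updated meanwhile
    seen.append(cache.get("/doc", t1, origin))        # client gets v2 from origin
    cache.up = True                                   # back up by T1 + 1 hour
    seen.append(cache.get("/doc", t1 + 1, origin))    # cache serves the older copy
    return seen
```

Calling run_scenario() yields the sequence of bodies the client sees.
The last one is the older copy, even though every timestamp check in
the cache passed.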

 > >Now you've got me worried.  The example you gave requires that your
 > >"basket" page never be cached, essentially because it is accessed
 > >under different URIs for different request methods,
 > 
 > No.  Essentially, it cannot be cached because it is dynamic, because
 > it may change 1 second from now.
 > 

Well, you seem to be thinking that documents that change often must be set to
be uncacheable.  That works, but it would just be nicer if documents
that change only in response to user actions could be cached until the next
user action that changes them.
	...
 > > and caches in the
 > >world can't be assumed to be continuously up, robust, and correct.
 > 
 > Caches can be assumed to be robust and correct, even if they go down
 > sometimes.  My point was that your `should replace' requirement would
 > require (correct) caches to mark all unexpired entries as expired if
 > they come up again after having been down.  This is 1) wasteful and 2)
 > requires all current cache implementations to be upgraded.
 > 

Yes, excellent point.  I had not considered this set of issues.


 > >This then seems to imply a general unpleasant side effect of using
 > >Location URI != request URI.
 > 
 > No.  Under my scheme, Location URI != request URI does not introduce
 > robustness problems, for non-expired and expired entries alike.
 > 

Consider an example using non-expired, cacheable documents.  If a
cache already has something under location L1, and then receives a
response to request-URI L2 which responds with Location header L1, how
does the cache respond?  This seems to be an open question.  Due to
the problems you mentioned, it is clear that the cache can't be relied
on to "forget" the old L1 (the cache might be down, it might not
successfully parse the headers, it might find an unrecognized
header...).  So, on the next GET L1, the cache may return the cached
copy, which, though not expired, may predate the copy the user has
already received.  Same problem as above.
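
A minimal sketch of the two possible outcomes, assuming a cache that
is a plain dictionary keyed by URI.  The function names and the
honor_location flag are hypothetical, and "forgetting" the entry is
just one reading of what a cache might do with the Location header;
nothing here is taken from the spec.

```python
def on_response(cache, request_uri, headers, honor_location):
    """Handle a response whose Location header differs from the request URI.

    If the cache understands the header (honor_location=True) it drops any
    entry stored under the Location URI.  If it was down, failed to parse
    the headers, or does not know the header, the old entry stays.
    """
    location = headers.get("Location")
    if honor_location and location and location != request_uri:
        cache.pop(location, None)


def demo(honor_location):
    cache = {"L1": "old copy"}  # unexpired entry already held under L1
    on_response(cache, "L2", {"Location": "L1"}, honor_location)
    return cache.get("L1")      # what the next GET L1 would be served from
```

With honor_location=True the next GET L1 finds nothing and must go to
the origin; with honor_location=False it is answered from the old,
unexpired copy.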

	...

 > You are still thinking in terms of a mechanism that makes caches
 > replace previously cached, but unexpired, copies.

They sure should do so!  If a cache gets a newer copy of a document,
it should lose the older one, even if it's not expired.  Is this
controversial?  Expiration dates are a very weird concept anyway.
We're mostly using them to signal that something is "pre-expired" and
so uncacheable, something best done with some other mechanism anyway.
Expiration dates in the future are usually guesses, and often
overridden by user actions.

 > The caching
 > scenario in my previous message assumed that no such mechanism was
 > present.
 > 
But in your example, the non-cacheability of the "basket" made such
a mechanism irrelevant.

 > I guess we need a term for the practice of keeping an expired response
 > in cache memory to facilitate future conditional GETs.  What about
 > `conditionally cached'?
 > 

Not sure what this actually buys.  A cache can never know whether a
document has been modified since the cache last fetched it.
So it has to check every time.
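
For reference, a rough sketch of what "checking every time" with a
conditional GET looks like.  The origin here is a hypothetical
in-memory table of (last-modified, body) pairs with integer
timestamps; only the If-Modified-Since / 304 behaviour is meant to
mirror HTTP, and the function names are made up for illustration.

```python
def origin_get(docs, uri, if_modified_since=None):
    """Stand-in origin server: answers 304 when the document is unchanged."""
    last_modified, body = docs[uri]
    if if_modified_since is not None and last_modified <= if_modified_since:
        return 304, last_modified, None   # not modified, no body is sent
    return 200, last_modified, body


def revalidate(cache, docs, uri):
    """Check with the origin on every request, reusing the held copy on 304."""
    held = cache.get(uri)                 # (last_modified, body) or None
    since = held[0] if held else None
    status, last_modified, body = origin_get(docs, uri, since)
    if status == 304:
        return held[1], "revalidated"     # body comes from the cache
    cache[uri] = (last_modified, body)    # a newer copy replaces the old one
    return body, "fetched"
```

The check still costs a round trip each time, but the body is only
transferred again when the document has actually changed.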

 > Koen.

--Shel
Received on Thursday, 31 August 1995 16:26:28 UTC