Re: Location Proposals from Paul Leach on 1995-09-07 (ietf-http-wg@w3.org from July to September 1995)

From: Paul Leach <paulle@microsoft.com>
Date: Wed, 6 Sep 95 19:09:32 PDT
To: john@math.nwu.edu
Cc: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
Message-Id: <9509070304.AA08838@netmail2.microsoft.com>
John says:
] According to Paul Leach:
] >
] > Tagging a document with a very ancient "Expired:", when handled
] > consistent with the above considerations, will almost completely
] > automatically cause the behavior that has been asked for: you always
] > get the latest copy, saving the cost of getting it from the server when
] > the cached copy is still valid, even though there is no promise that it
] > still is.  The only slightly "funny" thing is that the agent has to
] > ignore the Expires when making replacement decisions, instead only
] > considering the actual reference pattern and other info, such as
] > Last-Modified; in the absence of knowledge of this usage style, one
] > might be tempted to toss resources out of the cache that had expired a
] > long time ago.
] >
]
] Am I the only one who finds it extremely counter-intuitive to "ignore
] the Expires when making replacement decisions" for a cache?  I
] strongly agree that "one might be tempted to toss resources out of the
] cache that had expired a long time ago" and I would suggest that
] adopting a semantics for the Expires header where the date has nothing
] to do with whether or not a document can be removed from the cache
] is inviting immense confusion.

The fact that a document has expired does not say ANYTHING about 
whether it is still valid -- it only says that the cache manager has to 
check that it is still valid before handing it out.  What has expired is
*not* the document, but the origin server's promise that it is OK to serve
it from the cache w/o contacting the origin server. (If you read this
as that the document is expired, I can see why it is counter-intuitive...)

The expiration date also says nothing about whether it can be removed 
from the cache -- any document can be removed from the cache at any 
time, and correctness will not be affected (a subsequent GET will just 
have to go to the origin server to fetch it). The expiration date also 
says nothing about what things can be kept in the cache, only whether 
a GET IMS (a GET with If-Modified-Since) needs to be done before using
what's there.
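
A minimal sketch of the distinction drawn above, in Python. The names
(CacheEntry, lookup, and the action strings) are invented for illustration
and do not come from any HTTP draft or real cache implementation. The point
is only that the expiration date picks between "serve" and "validate",
while a missing (never cached or evicted) entry is always safe:

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CacheEntry:
    body: bytes
    expires: float        # time until which the origin server's promise holds
    last_modified: float  # would be sent as If-Modified-Since on revalidation


def lookup(cache: Dict[str, CacheEntry], url: str, now: float) -> str:
    """Return the action a cache would take for a GET on url (hypothetical)."""
    entry: Optional[CacheEntry] = cache.get(url)
    if entry is None:
        # Never cached, or removed at some point: correct, just slower.
        return "fetch-from-origin"
    if now < entry.expires:
        # The promise still holds: no need to contact the origin server.
        return "serve-from-cache"
    # The promise has lapsed: the copy *may* be invalid, so ask first.
    return "conditional-get-if-modified-since"
```

Note that nothing in lookup() removes an entry; replacement is a separate
decision that this sketch deliberately leaves out.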

Caches that toss out documents just as soon as they have expired
are likely to be very suboptimal, performance-wise. It would be much
better to keep them as long as there is space; then a subsequent GET
might be served without dragging the document from the origin server,
if a GET IMS permits it.
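
To make the performance argument concrete, here is a rough Python sketch
of the body bytes moved by one GET that arrives after the cached copy has
expired. The function name and parameters are assumptions made up for this
illustration, and header overhead is ignored:

```python
from typing import Optional


def body_bytes_transferred(
    cached_last_modified: Optional[float],
    origin_last_modified: float,
    body_size: int,
) -> int:
    """Body bytes fetched from the origin for one GET on an expired document.

    cached_last_modified is None when the cache has already discarded its copy.
    """
    if cached_last_modified is None:
        # Copy was tossed on expiry: plain GET, full body comes back.
        return body_size
    if origin_last_modified <= cached_last_modified:
        # GET IMS answered with 304 Not Modified: no body is sent.
        return 0
    # Document really did change: the new body comes back.
    return body_size
```

A cache that kept the expired copy pays for the body only when the
document has actually changed; a cache that discarded it pays every time.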
]
]
] > To add some other header to try and create this behavior will only
] > result in there being two ways to do the same thing, one completely
] > natural, and the other (IMHO) forced. The result will be confusion.
] >
]
] I don't understand why there would be two ways to do the same thing.
] Caches would discard documents after they have expired.  That seems
] pretty natural to me.

As per Jeff's suggestion, I won't use the N-word any more.  But I hope 
the comments above say why I think it's not a good idea to discard 
documents that have expired.

]
] There is already a proposed Control-Cache: header.  Adding an
] additional possible value like "Control-Cache: use-get-if-modified" or
] some equivalent is extremely clear.  It says this information is for
] caches and it says what it wants the cache to do.  In contrast,
] Expires: 1900 or Expires: <yesterday> says this document is no longer
] valid.

At the risk of repeating myself: no, it doesn't. It says it *may* be invalid.

Paul
Received on Wednesday, 6 September 1995 19:12:31 UTC