- From: Shel Kaphan <sjk@amazon.com>
- Date: Sun, 7 Jan 1996 11:06:55 -0800
- To: koen@win.tue.nl
- Cc: dwm@shell.portal.com, http-caching@pa.dec.com
Koen says:

> ... I agree with Dave that the method must be part of the cache
> key.

Well, if that's generally what people think, then we can dispense with
the no-side-effects thing, which controls how *different* methods use
the same cache entry.  But I'm pretty surprised if that is what people
think, since it seems so fundamental to a correct cache design to me.
On the other hand, the no-side-effects business may be confusing
enough that people would do just about anything to avoid it.

> >I do that all the time (use the same URI for POSTs and GETs), just
> >to get around cache problems when different kinds of requests need
> >to return new versions of the same object.
>
> I also do it all the time, mainly to allow reload buttons on some
> browsers to work as expected.  But my POSTs will not necessarily
> yield the same result as subsequent GETs on the URI; they can also
> give an error message in a 200 response, leaving the content bound
> to the GET unchanged.  That's what Location at least *could* be used
> for.

> > Saves a redirection, which in today's world can't be trusted to
> >contact the origin server on the second request anyway.
>
> Redirection also cannot be trusted?  Ack!  I wonder what you can
> trust these days.  Can you give an example of a client or proxy that
> caches redirects?

I can't remember which ones I had trouble with, but I remember having
to abandon a redirection-based way to control this because some system
or other illicitly cached the results of the redirection target URI.

> >I certainly want to have the ability to follow a POST with a later
> >GET and get what the POST returned.
>
> You have it now, by making the GET response always expire
> immediately.  Modulo browsers that don't care about Expires, of
> course.

Yes.

> > In addition, I want the response from that POST to make it
> >impossible for me to receive the previous version of that object
> >that may have been already in the cache, when I do a later GET.
> That is a valid thing to want, but you cannot get it by throwing the
> request method out of the cache key.  Too many things would break.
>
> >To me, insisting that each method have its own cache slot for a
> >given URI would be analogous to designing a computer cache where
> >LOADs and STOREs didn't share cache slots for the same memory
> >locations.
>
> Many POSTs do not act as STOREs.

No, they're more like ADD-TO-MEMORY, to use the same metaphor.  But
what about PUTs?  Are you saying that when PUT becomes more popular,
caches should cache PUTs separately from GETs on the same URI???

> The 1.1 draft already provides `see other redirection' for POSTs
> that do act as STOREs.  If implementations of this are broken, they
> need to be fixed.  Speccing a new alternative scheme, and hoping
> that implementations of that new scheme will be less broken, holds
> little promise as a fix.

Well, though I like being able to control the method on a redirection,
I have never much liked the requirement for a second round trip to do
this.  Also, I don't believe the spec says anything about being able
to control "forced reloading" on the redirection request, which I
would view as a requirement.  (Does it?  I don't have it in front of
me.)  Instead, I think all involved objects must be marked as never
fresh.

> If you want to propose some alternative scheme to `see other
> redirection', the only possible justification can be that this
> alternative scheme is more efficient, for example because it avoids
> the conditional GETs on every request that are needed in the `see
> other' method.

It is one RTT more efficient, and doesn't require that objects that
may be involved in this always be marked as "not fresh".  But yes,
redirection can accomplish the same outward results.

> [From here on, I am speculating on how to improve on `see other']
>
> Improving on `see other redirection' can be tricky.
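For concreteness, the `see other redirection' under debate (a POST
answered by a 303 whose Location the client then fetches with GET) can
be sketched roughly as follows; the classes and the in-memory store
are illustrative assumptions, not from any implementation:

```python
from dataclasses import dataclass

@dataclass
class Response:
    status: int
    headers: dict
    body: bytes = b""

def handle_post(uri: str, store: dict, payload: bytes) -> Response:
    """A POST that acts as a STORE: update state, then redirect so the
    client re-fetches the canonical resource with GET."""
    store[uri] = payload
    return Response(303, {"Location": uri})  # 303 See Other

def follow(resp: Response, store: dict) -> Response:
    """Client side: a 303 is followed by a GET on Location -- this is
    the extra round trip objected to above."""
    if resp.status == 303:
        target = resp.headers["Location"]
        return Response(200, {}, store[target])
    return resp

store = {}
resp = follow(handle_post("/basket", store, b"3 items"), store)
assert (resp.status, resp.body) == (200, b"3 items")
```

The second request is what costs the extra round trip; the rest of the
thread is about whether that cost can be avoided without breaking
cache coherency.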
> A scheme that lets POST responses influence cached responses of
> earlier GETs can only work as long as all GETs and POSTs travel
> through the same cache.

This problem affects anything we say about cache coherency, in the
absence of a revocation protocol.

> If the user agent sends POSTs directly to the origin server, and
> GETs through a proxy cache, then the proxy cache has no chance of
> invalidating the GET response.  And didn't AOL use a scheme in which
> their browsers randomly access one of several proxies for subsequent
> requests?

I believe so.  This may contribute to why AOL is among the more
difficult systems to make an interactive WWW service work through.
[Purely speculative flaming here, but...]: I would claim that if
someone is going to run non-communicating caches in round-robin
fashion like this, then in order to really be correct, the caches
themselves should use different algorithms, which would inevitably
make their caching less effective.  Or perhaps there should be some
way for such caches to communicate their nature to origin servers, so
that the servers could be more conservative about the cachability of
responses.

> The only thing we can really require as far as request routing is
> concerned is that if a 1.1 browser has an internal cache, then all
> GETs and POSTs must go through that cache.  So I get to the
> following design:

Well, the design point I have been using is that since we have no
revocation protocol, the only thing we can control is the behavior of
an individual cache.  I have been assuming that most of the time, the
arrangement of caches between a client and a given server will be
fairly constant.  If not, we can't control what happens.  But we can
do something about the consistency and behavior of an individual
cache, and so we probably should.  Someday maybe there will be a
revocation protocol, and then it would look bad if single caches
couldn't even maintain coherency.
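The mechanism cited earlier as available today, making the GET
response expire immediately so that every later GET turns into a
conditional request at the cache, can be sketched as follows; the
function name and header choices are illustrative assumptions, not
from the thread:

```python
from email.utils import formatdate

def never_fresh_headers() -> dict:
    """Headers for a GET response that a later POST may supersede:
    the entry is stale the moment it is stored, so a well-behaved
    cache must revalidate (conditional GET) on every subsequent use."""
    now = formatdate(usegmt=True)  # HTTP-style date string
    return {
        "Date": now,
        "Expires": now,          # Expires == Date means "already stale"
        "Last-Modified": now,    # gives the cache a validator to send
    }

h = never_fresh_headers()
assert h["Expires"] == h["Date"]
```

This buys coherency at the price of a validation round trip on every
GET, which is exactly the inefficiency the proposal below tries to
avoid.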
> There must be some way to say
>
>   Cache-control: max-age-for-browser-caches=X,
>                  proxy-caches-must-always-do-conditional-GETs
>
> in 1.1 responses.  (We can already say something slightly less
> efficient: Cache-control: max-age=X, private.)  A server wanting to
> use the `POST response replaces old GET responses that are not stale
> yet' mechanism on a URI U must send this Cache-control information
> in every GET response on U.  Further, we require from browsers that:
>
>   If a 1.1 internal browser cache has stored a GET response GR on
>   URI U, and it relays a POST response PR from URI U containing the
>   response header `Location: U', then the cache must either
>   invalidate the old GET response GR or (highly preferred) replace
>   it with the POST response PR.
>
> Shel, would this be acceptable to you?

I'd rather convince you of the design point above than start getting
into browser/proxy differences yet, which I don't like too much.  I
for one would like to have this kind of behavior; it would allow my
own web software to be a bit more cache-friendly.

> Spoofing problems would be virtually absent in the above scheme.
> But I wonder if this scheme isn't too complicated.  If we spec all
> this, what are the chances that everybody will get it right?

Pretty small.

> >--Shel
>
> Koen.

--Shel
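The browser-cache rule proposed above (a relayed POST response
carrying `Location: U' invalidates, or preferably replaces, the cached
GET response for U) can be sketched as follows; the class and names
are illustrative assumptions, not from any implementation:

```python
class BrowserCache:
    """Minimal sketch of a 1.1 internal browser cache applying the
    proposed invalidate-or-replace rule."""

    def __init__(self):
        self.entries = {}  # URI -> cached GET response body

    def store_get(self, uri: str, body: bytes):
        self.entries[uri] = body

    def relay_post_response(self, uri: str, headers: dict, body: bytes):
        """Apply the proposed rule while relaying a POST response PR
        for a POST on `uri'."""
        if headers.get("Location") == uri and uri in self.entries:
            # Highly preferred option: replace rather than merely
            # invalidate, so a later GET is served coherently without
            # another trip to the origin server.
            self.entries[uri] = body

cache = BrowserCache()
cache.store_get("/item/42", b"old version")
cache.relay_post_response("/item/42", {"Location": "/item/42"},
                          b"new version")
assert cache.entries["/item/42"] == b"new version"
```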
Received on Sunday, 7 January 1996 19:35:06 UTC