- From: Jeffrey Mogul <mogul@pa.dec.com>
- Date: Wed, 03 Jan 96 19:37:04 PST
- To: http-caching@pa.dec.com
As Shel points out, there's an antique version of form submission that uses GET with stuff crammed into the URL, and that *can* cause server side effects. It's just a convention that GET doesn't cause side effects.

In Roy's draft, section 14.2 says that GET and HEAD should not have side effects. It also says that the protocol cannot enforce this, but (implicitly) that implementations of the protocol may assume that GET and HEAD are in fact side-effect-free. I believe that this is a valuable concept, and I don't think we should give up our ability to make this assumption.

As for other methods, I suggest that the notion of "side effects" is the wrong one. What we may want to specify is whether certain methods are "idempotent". Distributed-systems types use the word "idempotent" to mean "repeating the operation N times has the same side effects as doing it once, for N > 1".

For example, POST is probably not idempotent, but PUT might be idempotent if properly serialized (which we may not be able to do, of course). Other methods with side effects include PATCH, COPY, MOVE, DELETE, LINK, and UNLINK. Some of these are clearly not idempotent; others might be. I won't consider these for now, however.

[Begin: thinking-out-loud portion of message]

How might one make PUT idempotent? Well, suppose that the client starts by doing a GET on an existing resource, and the server returns a cache-validator value for the resource. The client then issues one or more PUTs on the resource, handing back the cache-validator value it received from the server. If the server performs the PUT only when the client's validator matches the one it would provide if a GET were done on the resource right now, *and* (very important) the validator is constructed in a way that is guaranteed to change whenever the resource is modified, then we have an idempotent method: since the first successful PUT changes the validator, no number of retries can result in the same PUT being applied twice.

In other words, we have a "conditional PUT". But unlike a conditional GET, which performs the full operation only if the validators do *not* match, the conditional PUT performs the full operation only if they *do* match.

If the resource did not exist before the PUT, the client could supply a special "null" validator which is guaranteed not to match anything. The server would allow this kind of conditional PUT only if the resource doesn't already exist.

Why is this important? Suppose that the client performs a PUT via a proxy; the server updates the resource, returns a 200 OK to the proxy, and closes the connection. But the TCP connection between the proxy and the client fails before the client receives the 200 OK. If the client simply retries the PUT, it may overwrite an intervening PUT done by another client. With the "conditional PUT" approach, the retry cannot cause this erroneous result. The retry could return OK, or it could return an error status, and it might be hard for the client to figure out whether the error came because its first PUT had succeeded or because some other client had updated the resource first. But I suppose the client could simply do another GET to see whether the right PUT had been done.

But wait: couldn't one do conditional POSTs the same way, using the cache-validator of the original (un-posted-to) resource? This would protect against races with other clients, for example.

What does this have to do with caching the results from POSTs and PUTs? I'm not sure. Perhaps nothing.
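To make the matching rule concrete, here is a minimal sketch of the server-side check, in Python. Everything in it is an illustrative assumption rather than anything in the draft: the in-memory store, the use of a strong hash of the entity-body as the validator (one easy way to guarantee the validator changes whenever the resource does), the "null" token, and the 412-style "precondition failed" status. It is meant only to show why a blindly retried conditional PUT cannot be applied twice.

    import hashlib

    NULL_VALIDATOR = "null"   # hypothetical token: guaranteed to match no existing resource

    store = {}                # URL -> entity-body; stands in for the server's resources

    def validator_for(body):
        # A validator guaranteed to change whenever the resource changes;
        # a strong hash of the entity-body has that property.
        return hashlib.sha256(body).hexdigest()

    def conditional_put(url, client_validator, new_body):
        # Perform the PUT only if the client's validator matches the one
        # a GET on the resource would return right now.
        if client_validator == NULL_VALIDATOR:
            # Create-only PUT: allowed only if the resource doesn't exist yet.
            if url in store:
                return 412, "resource already exists"
            store[url] = new_body
            return 200, "created"
        if url not in store or validator_for(store[url]) != client_validator:
            # Either another client got here first, or this same PUT already
            # succeeded on an earlier try whose reply was lost in transit.
            return 412, "validator mismatch"
        store[url] = new_body
        return 200, "updated"

    # The lost-reply scenario: the first attempt succeeds, the 200 OK is
    # lost, and the blind retry is rejected rather than applied again.
    print(conditional_put("/doc", NULL_VALIDATOR, b"v1"))   # (200, 'created')
    print(conditional_put("/doc", NULL_VALIDATOR, b"v1"))   # (412, 'resource already exists')

Note that on a 412 the client cannot tell, from the status alone, whether its own earlier PUT succeeded or another client intervened; as the message says, a fresh GET (and a fresh validator) settles it.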
But it might be worth trying to think through a way for the server to tell a cache that the entity-body supplied with a PUT request, possibly taken together with some headers returned by the server, can be treated as a cached copy of the PUTted resource (because PUT replaces the resource, rather than making any partial modification). This would avoid a reload from the server on a subsequent GET of that resource.
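A minimal sketch of the cache side of that idea, again in Python: the "Cache-PUT-Body: allowed" response header by which the server grants permission is purely hypothetical, one possible shape for the "some headers returned by the server" left open above.

    cache = {}    # URL -> entity-body held by the proxy/cache

    def handle_put_response(url, request_body, status, response_headers):
        # On a successful PUT, if the server says the request's entity-body
        # may stand in for the resource, cache it as if a GET had fetched it.
        if status == 200 and response_headers.get("Cache-PUT-Body") == "allowed":
            cache[url] = request_body
        else:
            # Otherwise play it safe: any previously cached copy is now stale.
            cache.pop(url, None)

    def handle_get(url):
        # A later GET can then be answered from the stored PUT body,
        # avoiding the reload from the origin server.
        return cache.get(url)

    handle_put_response("/doc", b"v1", 200, {"Cache-PUT-Body": "allowed"})
    print(handle_get("/doc"))   # b'v1', served without contacting the server

-Jeff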
Received on Thursday, 4 January 1996 03:42:47 UTC