- From: Roy T. Fielding <fielding@gbiv.com>
- Date: Mon, 1 Dec 2008 17:37:50 -0800
- To: Mark Nottingham <mnot@mnot.net>
- Cc: HTTP Working Group <ietf-http-wg@w3.org>
On Nov 28, 2008, at 2:41 PM, Mark Nottingham wrote:

> When the cache key discussion came up, it became clear that we
> needed to do some digging into the history of HTTP caching, which
> means looking at the mailing list of the original HTTPWG's caching
> sub-group. Unfortunately, I couldn't locate any online archives
> remaining, but Martin Hamilton kindly provided an mbox, which has
> been reconstructed at:
>
>   http://lists.w3.org/Archives/Public/http-caching-historical/
>
> In looking through that, it's clear that there was discussion of
> POST caching, etc. early on;
>   http://lists.w3.org/Archives/Public/http-caching-historical/1996Jan/0025.html
>   http://lists.w3.org/Archives/Public/http-caching-historical/1996Jan/0026.html
>   http://lists.w3.org/Archives/Public/http-caching-historical/1996Jan/0028.html
>   http://lists.w3.org/Archives/Public/http-caching-historical/1996Jan/0030.html
>   http://lists.w3.org/Archives/Public/http-caching-historical/1996Jan/0075.html
>
> (I believe this is before the difference between Location and
> Content-Location was specified, which is why Location is mentioned).
>
> But, no consensus was reached, as reflected by the state of the
> "updated issues list" (under "not agreed");
>   http://lists.w3.org/Archives/Public/http-caching-historical/1996Feb/0114.html
>
> It did come up at a F2F, but was not "fully" discussed, and several
> aspects were deferred;
>   http://lists.w3.org/Archives/Public/http-caching-historical/1996Feb/0039.html

I addressed the relevant parts of that meeting (which I was not able
to attend in person) in this post:

<http://lists.w3.org/Archives/Public/http-caching-historical/1996Feb/0095.html>

The question boils down to the three cache models under Extensibility:

> Larry described possible three ways to view an HTTP cache:
>
> a) a cache stores values and performs operations on these
>    values based on the requests and responses it sees. For
>    the purposes of the cache, one can describe each HTTP
>    method as a transformation on the values of one or more
>    resources.
>
> b) a cache stores responses, period.
>
> c) a cache stores the responses to specific requests.
>    The cache must be cognizant of the potential interactions
>    between various requests; for example, a PUT on a resource
>    should somehow invalidate the cached result of a previous
>    GET on the same resources, but a POST on that resource
>    might not invalidate the result of the GET.

The HTTP/1.1 proposal that Henrik and I developed was based on (c).
HTTP is supposed to be more extensible than a storage interface.
Our design decision was to make the messages self-descriptive
rather than assume a prescriptive data model, thereby allowing
efficient cache operation via message description on arbitrary
methods. It was a known trade-off versus the more traditional
caching models of distributed file systems that could benefit
from write-back caching by limiting the set and scope of
resource-modifying operations to a shared data model.

Rough consensus in both the WG and implementations was on (c),
but that was not entirely reflected in the caching section that
was added to the pre-2068 spec during the final revs. The caching
section left it out. The rest of the HTTP spec is based on (c).
The visible difference between (a) and (c) is how cacheable
responses to non-GET requests are enabled, which is defined in
model (c) by the method semantics, response status code, and the
response field-values for Cache-Control and Content-Location.
It was not successfully defined by model (a).

In other words, an HTTP cache must consider the method as part of
the cache key if it allows caching of anything other than GET/HEAD
responses. An HTTP cache cannot do write-back operations.
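
As a rough illustration of "method as part of the cache key", here is
a minimal sketch, assuming a hypothetical cache that only stores
responses for methods it has a key builder for. The names `cache_key`
and `KEY_BUILDERS`, and the choice to hash the POST request body into
the key, are assumptions made for the example and are not taken from
the HTTP specification or from any implementation:

```python
import hashlib
from typing import Callable, Dict, Optional, Tuple

CacheKey = Tuple[str, ...]


def key_for_get(uri: str, body: bytes) -> CacheKey:
    # GET responses are keyed on the method and the request URI only.
    return ("GET", uri)


def key_for_post(uri: str, body: bytes) -> CacheKey:
    # Assumption for this sketch: a POST response is distinguished by
    # the request body as well as the URI.
    return ("POST", uri, hashlib.sha256(body).hexdigest())


# Hypothetical registry: the methods whose key construction this cache
# "understands". Anything else is never stored.
KEY_BUILDERS: Dict[str, Callable[[str, bytes], CacheKey]] = {
    "GET": key_for_get,
    "POST": key_for_post,
}


def cache_key(method: str, uri: str, body: bytes = b"") -> Optional[CacheKey]:
    """Return a key that includes the method, or None if unknown."""
    builder = KEY_BUILDERS.get(method)
    return builder(uri, body) if builder else None
```

Under this scheme alone, the keys for GET and POST on the same URI
never collide, and a method with no registered builder yields no key,
so its responses are simply not stored.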
A response to a non-GET/HEAD request is cacheable if it says so in
cache-control *and* the cache understands how to construct the cache
key for that method (this is presumed to be defined by the method
semantics). Any response that contains a Content-Location is
cacheable as if it were a 200 response to GET if it can be trusted
to be from the same authority as that location value. It follows,
therefore, that a response to POST that includes both a cacheable
Cache-Control and a Content-Location matching the POST request
target is equivalent to saying that the enclosed entity contains
what would be in the response to a GET on that same URI immediately
after the POST completed.

The HTTP/1.1 proposal was not designed to behave like a storage
interface, so it's no surprise that it doesn't look like a CPU cache
or even a disk cache. Jeff tried to address that issue in his summary
of the cache models. I think that the subgroup discussion showed
that model (a) did not fit the needs of HTTP. The subgroup's
operating procedures at the time were that the existing HTTP/1.1
design would not be changed unless there was rough consensus for
the change.

....Roy
Received on Tuesday, 2 December 2008 01:38:30 UTC