Re: Cache key history from Mark Nottingham on 2009-03-02 (ietf-http-wg@w3.org from January to March 2009)

From: Mark Nottingham <mnot@mnot.net>
Date: Mon, 2 Mar 2009 16:48:58 +1100
To: Roy T. Fielding <fielding@gbiv.com>
Cc: HTTP Working Group <ietf-http-wg@w3.org>
Message-Id: <1F925344-A5C9-415F-862F-A23C6048F6D8@mnot.net>

Catching up on this...

I think we're not actually that far apart, then.

People want POST responses to be reusable for GETs, and you appear to
be saying that if we properly specify what it means for POST to be
cacheable, and more importantly under what conditions a response
cached from a POST can be reused, this is possible.

As such, we need to both document the requirements for what it means  
to call a method cacheable, and also better document what it means for  
POST to be cacheable. The same is true of GET, but hopefully that's a  
little more obvious...

However, you've said a few times that a POST response has to have a  
Content-Location to be considered cacheable. That's not documented  
anywhere, of course, and while it's one thing we could specify when we  
properly document POST cacheability, we should discuss this first.

Make sense, or have I misinterpreted something?

Cheers,



On 02/12/2008, at 12:37 PM, Roy T. Fielding wrote:

>
> On Nov 28, 2008, at 2:41 PM, Mark Nottingham wrote:
>
>> When the cache key discussion came up, it became clear that we  
>> needed to do some digging into the history of HTTP caching, which  
>> means looking at the mailing list of the original HTTPWG's caching  
>> sub-group. Unfortunately, I couldn't locate any online archives  
>> remaining, but Martin Hamilton kindly provided an mbox, which has  
>> been reconstructed at:
>>
>> http://lists.w3.org/Archives/Public/http-caching-historical/
>>
>> In looking through that, it's clear that there was discussion of  
>> POST caching, etc. early on;
>>  http://lists.w3.org/Archives/Public/http-caching-historical/1996Jan/0025.html
>>  http://lists.w3.org/Archives/Public/http-caching-historical/1996Jan/0026.html
>>  http://lists.w3.org/Archives/Public/http-caching-historical/1996Jan/0028.html
>>  http://lists.w3.org/Archives/Public/http-caching-historical/1996Jan/0030.html
>>  http://lists.w3.org/Archives/Public/http-caching-historical/1996Jan/0075.html
>>
>> (I believe this is before the difference between Location and  
>> Content-Location was specified, which is why Location is mentioned).
>>
>> But, no consensus was reached, as reflected by the state of the  
>> "updated issues list" (under "not agreed");
>>  http://lists.w3.org/Archives/Public/http-caching-historical/1996Feb/0114.html
>>
>> It did come up at a F2F, but was not "fully" discussed, and several  
>> aspects were deferred;
>>  http://lists.w3.org/Archives/Public/http-caching-historical/1996Feb/0039.html
>
> I addressed the relevant parts of that meeting (which I was not able
> to attend in person) in this post:
>
> <http://lists.w3.org/Archives/Public/http-caching-historical/1996Feb/0095.html>
>
> The question boils down to the three cache models under Extensibility:
>
> > Larry described three possible ways to view an HTTP cache:
> >
> > 	a) a cache stores values and performs operations on these
> > 	values based on the requests and responses it sees.  For
> > 	the purposes of the cache, one can describe each HTTP
> > 	method as a transformation on the values of one or more
> > 	resources.
> > 	
> > 	b) a cache stores responses, period.
> > 	
> > 	c) a cache stores the responses to specific requests.
> > 	The cache must be cognizant of the potential interactions
> > 	between various requests; for example, a PUT on a resource
> > 	should somehow invalidate the cached result of a previous
> > 	GET on the same resources, but a POST on that resource
> > 	might not invalidate the result of the GET.
>
> The HTTP/1.1 proposal that Henrik and I developed was based on (c).
> HTTP is supposed to be more extensible than a storage interface.
> Our design decision was to make the messages self-descriptive
> rather than assume a prescriptive data model, thereby allowing
> efficient cache operation via message description on arbitrary
> methods.  It was a known trade-off versus the more traditional
> caching models of distributed file systems that could benefit
> from write-back caching by limiting the set and scope of
> resource-modifying operations to a shared data model.
>
> Rough consensus in both the WG and implementations was on (c), but
> that was not entirely reflected in the caching section that was
> added to the pre-2068 spec during the final revs.  The caching
> section left it out. The rest of the HTTP spec is based on (c).
> The visible difference between (a) and (c) is how cacheable
> responses to non-GET requests are enabled, which is defined in
> model (c) by the method semantics, response status code, and
> the response field-values for Cache-Control and Content-Location.
> It was not successfully defined by model (a).
>
> In other words, an HTTP cache must consider the method as part
> of the cache key if it allows caching of anything other than
> GET/HEAD responses.  An HTTP cache cannot do write-back operations.
> A response to a non-GET/HEAD request is cacheable if it says so
> in cache-control *and* the cache understands how to construct
> the cache key for that method (this is presumed to be defined by
> the method semantics). Any response that contains a Content-Location
> is cacheable as if it were a 200 response to GET if it can be
> trusted to be from the same authority as that location value.
> It follows, therefore, that a response to POST that includes
> both a cacheable Cache-Control and a Content-Location matching
> the POST request target is equivalent to saying that the enclosed
> entity contains what would be in the response to a GET on that
> same URI immediately after the POST completed.
>
> The HTTP/1.1 proposal was not designed to behave like a storage
> interface, so it's no surprise that it doesn't look like a CPU
> cache or even a disk cache.  Jeff tried to address that issue in
> his summary of the cache models.  I think that the subgroup
> discussion showed that model (a) did not fit the needs of HTTP.
> The subgroup's operating procedures at the time were that the
> existing HTTP/1.1 design would not be changed unless there was
> rough consensus for the change.
>
> ....Roy
>
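
For discussion, the rules in Roy's "In other words" paragraph above could be sketched roughly as follows. This is only an illustration under stated assumptions, not text from any spec: the function names, the lower-cased header dictionary, the set of "understood" methods, and the choice of (method, URL) as the key are all hypothetical, and status codes, Vary and freshness are ignored entirely.

```python
from typing import Dict, FrozenSet, List, Tuple
from urllib.parse import urljoin, urlsplit

Key = Tuple[str, str]  # (method, absolute URL); a deliberately simplified cache key


def says_cacheable(headers: Dict[str, str]) -> bool:
    """Assumption: 'says so in cache-control' means public, max-age or
    s-maxage is present and no-store is absent."""
    value = headers.get("cache-control", "")
    names = {p.strip().split("=", 1)[0].lower() for p in value.split(",") if p.strip()}
    if "no-store" in names:
        return False
    return bool(names & {"public", "max-age", "s-maxage"})


def same_authority(a: str, b: str) -> bool:
    """Assumption: 'same authority' is approximated by equal scheme and host:port."""
    ua, ub = urlsplit(a), urlsplit(b)
    return (ua.scheme, ua.netloc.lower()) == (ub.scheme, ub.netloc.lower())


def storage_keys(
    method: str,
    request_url: str,
    headers: Dict[str, str],  # response headers, keys assumed lower-cased
    understood: FrozenSet[str] = frozenset({"POST"}),  # hypothetical
) -> List[Key]:
    """Return the keys under which a response might be stored."""
    method = method.upper()
    is_get_or_head = method in ("GET", "HEAD")
    # Non-GET/HEAD responses must explicitly say they are cacheable.
    # (Cache-Control on GET/HEAD responses is not modelled here.)
    if not is_get_or_head and not says_cacheable(headers):
        return []
    keys: List[Key] = []
    # The method is part of the key; the cache must understand the method.
    if is_get_or_head or method in understood:
        keys.append((method, request_url))
    # A trusted Content-Location lets the response stand in for a GET on it.
    location = headers.get("content-location")
    if location and method != "HEAD":
        target = urljoin(request_url, location)
        if same_authority(request_url, target) and ("GET", target) not in keys:
            keys.append(("GET", target))
    return keys
```

With this sketch, a POST to http://example.com/orders answered with "Cache-Control: max-age=60" and "Content-Location: /orders" yields both a POST key and a GET key for the same URL, which is the reuse case under discussion; without Cache-Control it yields nothing.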


--
Mark Nottingham     http://www.mnot.net/
Received on Monday, 2 March 2009 05:49:40 UTC