Caching the results of POSTs and PUTs from Jeffrey Mogul on 1996-01-04 (http-caching-historical@w3.org from January 1996)

From: Jeffrey Mogul <mogul@pa.dec.com>
Date: Thu, 04 Jan 96 14:30:02 PST
To: Paul Leach <paulle@microsoft.com>
Cc: http-caching@pa.dec.com
Message-Id: <9601042230.AA10686@acetes.pa.dec.com>
    Suppose I want to provide a "sine()" service -- you tell me an angle in 
    degrees, and I tell you the sine of that angle.  I could implement this 
    so that a POST to http://www.sine.com of an entity containing (e.g.) 
    "x=0" would return an entity containing "y=0". It would be perfectly 
    OK, for the sine service, to cache this result, and reply to the next 
    POST to http://www.sine.com of the entity "x=0" with the "y=0" entity.
    
    In order for a proxy to take advantage of this, it needs some 
    indication that it's OK to cache the result. Expires: and Cache-Control
    aren't quite enough -- they would say that the URI returned in the
    Location: header could be used as a cache key for the returned
    entity value, but not obviously that the same query would return the 
    same result.

I think I'm beginning to understand, but I need a little more help.
I can see two ways to think of caching the result of a POST:

    (1) The cache key consists of the URI (let's ignore content
    negotiation for now).  A cache storing the response to a POST must
    also store the entity body from the corresponding request.  It may
    return the cached response to a subsequent request only if the new
    request has the same entity body.  In other words, the cache can
    hold at most one cached POST-response for a given URI.

  or

    (2) The cache key consists of the (URI, POST-request-body) tuple.
    A cache can store multiple POST-responses for a given URI, by
    disambiguating them using the request bodies.

I suppose it doesn't really matter, from a protocol point of view,
which of these cache-lookup approaches the cache takes. 
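
To make the difference between the two lookup approaches concrete, here
is a minimal Python sketch. The class and method names are hypothetical,
invented for illustration; nothing here is part of any proposed
protocol, and freshness and validation are deliberately left out.

```python
class SingleEntryPostCache:
    """Approach (1): keyed by URI alone; at most one POST response per URI."""

    def __init__(self):
        self._entries = {}  # uri -> (request_body, response)

    def store(self, uri, request_body, response):
        # Storing a new response for the same URI evicts the previous one.
        self._entries[uri] = (request_body, response)

    def lookup(self, uri, request_body):
        entry = self._entries.get(uri)
        # A hit requires the stored request body to match the new one.
        if entry is not None and entry[0] == request_body:
            return entry[1]
        return None


class MultiEntryPostCache:
    """Approach (2): keyed by the (URI, request body) tuple."""

    def __init__(self):
        self._entries = {}  # (uri, request_body) -> response

    def store(self, uri, request_body, response):
        self._entries[(uri, request_body)] = response

    def lookup(self, uri, request_body):
        return self._entries.get((uri, request_body))
```

With the sine service, storing the responses for "x=0" and then "x=90"
leaves approach (1) able to answer only "x=90", while approach (2) can
answer both. Either way, a request with a new body misses and goes to
the origin server.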

In either case, caching follows the same rules as for GET responses:
the server provides a fresh-until time, and the cache must validate
non-fresh entries with the origin server.

Validation could be done using a conditional POST.  A conditional POST
has the same form as a normal POST (including the entire entity body),
but includes the cache-validator returned by the server in its earlier
response.  The meaning of a conditional POST is "look at the URI,
entity body, and validator in this request: if you would give me
the exact same response as you gave before, including the same
validator, then just tell me '304 Not Modified'; otherwise, do a
normal POST for me."
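
As a rough illustration of the server side of a conditional POST, here
is a Python sketch. It rests on several assumptions not stated above:
the validator is taken to be a hash of the response entity, the service
is side-effect free (like the sine example), and the function names
handle_post, make_validator, and sine_service are hypothetical.

```python
import hashlib
import math


def make_validator(entity: bytes) -> str:
    # Assumption: the validator is simply a hash of the response entity.
    return hashlib.sha256(entity).hexdigest()


def sine_service(request_body: bytes) -> bytes:
    # Toy stand-in for the sine service: "x=<degrees>" in, "y=<sine>" out.
    angle = float(request_body.split(b"=", 1)[1])
    value = round(math.sin(math.radians(angle)), 6)
    return f"y={value}".encode("ascii")


def handle_post(request_body: bytes, client_validator=None):
    """Return (status, validator, entity) for a normal or conditional POST."""
    # Assumption: processing has no side effects, so it is safe to compute
    # the response first and then decide whether to send it.
    entity = sine_service(request_body)
    validator = make_validator(entity)
    if client_validator is not None and client_validator == validator:
        # Same response and same validator as before: no body is sent.
        return 304, validator, None
    # No validator supplied, or it no longer matches: act as a normal POST.
    return 200, validator, entity
```

A cache would keep the validator from the first 200 response and send
it along with the full request body when its entry is no longer fresh.
A service with side effects would need a different way to decide
whether the response would be unchanged.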

Does this make sense?  It seems like this is what Shel and Paul
are trying to tell me, anyway, and I think it would work.

Note that this still follows my proposed rule that write-through
is mandatory, in the following sense: if the server has granted
permission to cache a value (whether from a POST or a GET) for
some period, using the fresh-until header, then it's giving up
any hope of imposing cache consistency for that duration. 
If the server does not grant this permission, then every POST
request causes an interaction with the origin server (although
the response entity body may not have to be transmitted over the
entire response chain).

In any case, if *new* data is being POSTed, this data is always
sent directly to the origin server (because the cache-lookup
rules would not match in this case).

In the case of a PUT, we can probably add this optimization:
the cache may store the request's entity-body together with
the cache-validator for the server's response, and use this
to respond to subsequent GETs of the resource.  This is because
a PUT is supposed to replace the resource with the specified
entity body.  The server may override this behavior with an
explicit Cache-Control: no-cache.

I would recommend that the origin server give a fresh-until value
of zero in the PUT response, meaning that the cache will have to
validate the entry each time before using it in a response.  This
is because a PUTable resource may be changed via several paths, and
any blind caching could lead to update inconsistencies.  However,
validation still avoids retransmitting the entity-body on each
request, as long as it has not changed.
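
The PUT optimization with a fresh-until value of zero might look
roughly like the following Python sketch. The Origin and PutAwareCache
classes are hypothetical, the validator is assumed to be a simple
version counter, and the conditional GET is modeled as a method call
rather than a real HTTP exchange.

```python
class Origin:
    """Toy origin server holding a single PUTable resource."""

    def __init__(self):
        self.body = b""
        self.version = 0

    def put(self, body):
        self.body = body
        self.version += 1
        return str(self.version)  # validator for the new state

    def conditional_get(self, validator):
        current = str(self.version)
        if validator == current:
            return 304, current, None  # unchanged: no body sent
        return 200, current, self.body


class PutAwareCache:
    """Cache that keeps the PUT request body and always revalidates."""

    def __init__(self, origin):
        self.origin = origin
        self.entry = None  # (validator, body) or None
        self.bodies_transferred = 0  # bodies fetched from the origin

    def put(self, body):
        # Write-through: the PUT always reaches the origin server.
        validator = self.origin.put(body)
        # Keep the request body alongside the validator from the response.
        self.entry = (validator, body)

    def get(self):
        # Fresh-until of zero: validate on every use.
        validator = self.entry[0] if self.entry else None
        status, new_validator, body = self.origin.conditional_get(validator)
        if status == 304:
            return self.entry[1]
        self.bodies_transferred += 1
        self.entry = (new_validator, body)
        return body
```

In this sketch, a GET that follows a PUT through the same cache costs
one validation round trip but no body transfer. If the resource is then
changed through another path, the next GET fails validation and fetches
the new body once.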

-Jeff
Received on Thursday, 4 January 1996 22:39:02 UTC