RE: Caching the results of POSTs and PUTs

Larry Masinter writes:
 > I want to apologize in advance for not having studied all of the
 > http-caching mail -- I've been a little swamped, and I've really only
 > skimmed it. So perhaps this is out of date or confused, but...
 > 
Don't worry, I think we're all still a little confused.


 > ================================================================
 > Paul Leach wrote:
 >  > There are two cases of caching for POST that I can identify:
 >  > 1. The cache key is (Request-URI, Request-entity-body)
 >  > 2. The cache key is (Location-URI) from the response (modulo the 
 >  > spoofing issue).
 > and Shel added:
 > > What about case 3:  the cache key is
 > > 	(Request-URI, Request-entity-body, Location-URI)
 > > (BTW I also think the key for GETs should be (Request-URI, Location-URI)).
 > ================================================================
 > Isn't the 'Location-URI' some kind of red herring? If I ask for
 > 'foo.com' and you respond with a new Location header, I should *still*
 > have a cache entry for the original request, no?
 > 

I see your point.  Suppose that a cache receives multiple GETs for the
same request-URI, and that there is content-negotiation, so that
depending on all the headers in the request, different responses come
back.  One way to distinguish among the responses, each of which may
be cached separately, would be to use the Location-URI, which would
presumably be different.  This is, as you point out, different from
its use as a "key", however.

Because of the spoofing problem, no request made directly for that
Location-URI could be answered from an entry filed under this
request-URI (unless request-URI == Location-URI), so the Location-URI
itself may be the wrong piece of information to use as a "key" per se.
Instead, some summary of all the request headers that
content-negotiation acts on would have to be used.  Using the
Location-URI as part of a key would let the cache keep the various
entries under the same request-URI separated, though it would not help
with matching on the request, which doesn't contain that piece of
information.
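
Just to make that concrete, here's the kind of thing I have in mind as
a rough Python sketch (the class and method names are invented, not
from any draft): the cache files every variant under the request-URI,
matches on a summary of the negotiated request headers, and keeps the
Location-URI around only as a label on the stored variant.

    import hashlib

    class VariantCache:
        def __init__(self):
            # request-URI -> {variant-summary -> (Location-URI, response)}
            self.entries = {}

        def _summary(self, headers, negotiated):
            # Summarize the request headers that content-negotiation acts on.
            parts = ["%s: %s" % (h, headers.get(h, "")) for h in sorted(negotiated)]
            return hashlib.md5("\n".join(parts).encode("utf-8")).hexdigest()

        def store(self, request_uri, req_headers, negotiated, location_uri, response):
            variants = self.entries.setdefault(request_uri, {})
            # The Location-URI labels the variant; it is never used to answer
            # a request made directly for that Location-URI (the spoofing issue).
            variants[self._summary(req_headers, negotiated)] = (location_uri, response)

        def lookup(self, request_uri, req_headers, negotiated):
            hit = self.entries.get(request_uri, {}).get(self._summary(req_headers, negotiated))
            return hit[1] if hit else None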


 > I want to separate out the questions:
 > 
 > a) What is the cache key for a request?
 > b) What other cache entries might be generated by a response?
 > c) What other cache entries might be invalidated by a request or
 >    response?
 > d) What presumption in lieu of a cache-control method might be made as
 >    to the cachability of a request at all.
 > 
 > The cache key for a request is always a subset of the request
 > (including method, all of the headers and entity body); it is exactly
 > the subset that the response might be considered to vary by. We
 > presume that the response varies by Authentication and State
 > information, and we assume that the response does NOT vary by Accept
 > headers unless the response includes some indication of that variance.
 > 
Sounds good to me.
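
To spell that out, here is a minimal sketch of the key construction
(Authorization and Cookie are my stand-ins for "authentication and
state", and a Vary-style response header stands in for "some
indication of that variance"; none of these names are settled):

    def cache_key(request, response=None):
        # The key always includes what we presume the response varies by:
        # the method, the URI, and the authentication and state information.
        headers = request["headers"]
        key = [request["method"], request["uri"],
               headers.get("Authorization", ""),
               headers.get("Cookie", "")]          # stand-in for "state"
        # Accept headers enter the key only if the response says it varies.
        if response is not None:
            for name in response["headers"].get("Vary", "").split(","):
                name = name.strip()
                if name:
                    key.append(headers.get(name, ""))
        return tuple(key)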

 > In addition, we have some rules for some of the other methods:
 > - a successful PUT on a URL should invalidate all other entries for the
 >   same location (c), and may create a cache entry of GET on the same
 >   location.
 > 
OK, that's one way to implement it.  At least we're agreeing on what I
view as a "key" issue:  there is some linkage between the cache entries
that can be created or modified by requests with different methods
operating on the same URI.


 > - a POST on a location might invalidate a GET entry on the same
 >   location.
 > 
Yes.
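
Taking the PUT and POST rules above together, the cache-side handling
might look roughly like this (sketch only; the cache methods are
invented names):

    def handle_write(cache, request, response):
        if request["method"] == "PUT" and 200 <= response["status"] < 300:
            # A successful PUT invalidates everything cached for that URI...
            cache.invalidate(request["uri"])
            # ...and may (optionally) seed a GET entry with the new entity.
            if response.get("body") is not None:
                cache.store_get(request["uri"], response)
        elif request["method"] == "POST":
            # A POST might change the resource, so drop any GET entry for it.
            cache.invalidate(request["uri"])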

 > - HEAD can be computed from GET caches with the same varying nature.
 > 
Yes.
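
And the HEAD rule is then just a matter of handing back a cached GET
entry without its body, something like:

    def answer_head(cache, request):
        # A HEAD can be answered from a GET entry under the same cache key:
        # same status line and headers, no entity body.
        cached = cache.lookup_get(request)      # whatever key the GET would use
        if cached is None:
            return None                         # miss: forward to the origin
        head = dict(cached)
        head["body"] = None
        return head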

 > - a Location: header in a response should either create a new cache
 >   entry for GET of that Location with no other headers in the cache
 >   key (state, authenticator, etc.)

As you were the first on this list to mention, there's a spoofing
issue if you allow the Location-URI from the response to a GET to
create a cache entry with that Location-URI as the key (or part of it).

 >   or at least invalidate any cache
 >   for GET of that Location (depending on whether the cache trusts
 >   the location).
 > 

Yes -- actually, I believe it is always safe to invalidate cache
entries, since the only thing you can damage is performance, not
correctness.  So a cache doesn't have to be too smart about this.  I'm
repeating myself a little here, but the cache can skip the
invalidation step if the thing-to-be-invalidated has the same
"cache-validator" as the new version of the object.
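
In code, the cautious version of the Location rule plus that shortcut
might look like the following (again just a sketch; "validator" stands
for whatever opaque cache-validator the entries carry):

    def handle_location(cache, response):
        location = response["headers"].get("Location")
        if location is None:
            return
        existing = cache.lookup_uri(location)
        if existing is None:
            return
        # Invalidation is always safe: the worst case is an extra round trip
        # to revalidate, never an incorrect response being served.
        if existing.get("validator") != response.get("validator"):
            cache.invalidate(location)
        # If the validators match, the cached entry already reflects the new
        # version and the invalidation step can be skipped.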

--Shel

Received on Monday, 8 January 1996 19:30:14 UTC