Location, URI-header, etc. from Shel Kaphan on 1995-09-01 (ietf-http-wg@w3.org from July to September 1995)

From: Shel Kaphan <sjk@amazon.com>
Date: Fri, 1 Sep 1995 12:23:26 -0700
To: http-wg-request%cuckoo.hpl.hp.com@hplb.hpl.hp.com
Message-Id: <199509011923.MAA02978@bert.amazon.com>

I was meaning to mention, though Koen beat me to the punch, that the
URI header has similar problems to the Location header regarding its
treatment by caches.

I have a number of comments and questions about the URI header:

First of all, the Location header is designated as a "response
header", whereas the URI header is designated as an "entity header".
Why should they be in different categories?

Would the URI-header ever be transmitted along with the body of a
document (in a 2xx response)?  If so, does it refer to the enclosed
document?  How so?  Or, like Location, does it refer to the place(s)
you could GET a document just like the enclosed one if you wanted
it again? The URI-header may refer to multiple versions of a resource,
but doesn't indicate which one is in the message.  How are you
supposed to tell, by analysing the other headers?  Is there enough
information?  What, specifically, is the relationship between a
URI-header and an enclosed entity?

In any case, it seems to me that the main utility of the URI-header as
a response header (as opposed to request) would be in a redirection,
where the first request is for a "generic" resource, and the URI
header in the 3xx response is in effect asking "OK, pal, which one didja mean?",
after which the client would choose one and make another request.

Would someone care to chime in with other interpretations?  (Should be fun.)

In any case, it seems that if a URI-header were construed as
describing an enclosed document in a 2xx response, it would be subject
to all the same problems as Location.

This leads me back to those problems with Location in 2xx responses.
We've heard a number of significant issues raised, and I'd just like
to try to lay them out clearly.  There are two entirely different
kinds of problems: security and reliability.

1. Security.  The security problem is that using Location it would be
easy to "spoof" a client into thinking that you were delivering a
resource that you in fact were not, just by lying about the Location
URI.  There is actually an easy solution to this: instead of asking
clients to *replace* items in their caches based on this URI, ask them
to *invalidate* items in their caches based on it.  (Yes, this has the
problem that Koen raised -- that caches may not notice/care about
Location.  See below.)  This, then, would allow for some more
sophisticated cache management, while losing slightly on performance
compared to a replacement strategy, but not being subject to the
security hole.  The worst anyone could do is cause someone else's page
to get bopped out of a cache.

2. Reliability.  The reliability problem that would be introduced by
Location in 2xx responses would differ from the current
situation in degree, not in kind.  It is already possible to get
anomalous, though technically "correct", behavior from caches that go
offline sometimes (described previously).  The same sort of problem
would be introduced if caches ignored or didn't recognize Location
headers and failed to invalidate other items as a result -- the cache
could occasionally serve up older, though not expired, documents.
Proper use of GET-IMS would mitigate this, as clients with newer
versions, when re-requesting a resource through a cache with an older
version, would force a newer version to be loaded.  All in all, though,
one cannot deny the existence of a robustness problem, especially
during the period when many non-conforming caches exist.
However, HTTP version numbering would help with this.
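
To make both points concrete, here is a rough sketch in Python of a toy
cache.  It is only an illustration under my own assumptions, not
anything from the draft: the Origin and Cache classes, the "replace"
and "invalidate" policy names, the example URIs, and the use of plain
integers as Last-Modified timestamps are all made up for the example.
The store() method shows the two ways of treating Location in a 2xx
response, and get() shows how a GET-IMS from a client holding a newer
copy could force the cache to reload.

```python
class Origin:
    """Hypothetical stand-in for an origin server: uri -> (last_modified, body)."""

    def __init__(self):
        self.docs = {}

    def get(self, uri):
        return self.docs[uri]


class Cache:
    """Toy cache in front of an Origin.  Timestamps are plain integers."""

    def __init__(self, origin, policy="invalidate"):
        self.origin = origin
        self.policy = policy          # "replace" or "invalidate" (assumed names)
        self.entries = {}             # uri -> (last_modified, body)

    def store(self, request_uri, headers, last_modified, body):
        """Record a 2xx response, then apply the Location policy."""
        # The body is always filed under the URI that was actually requested.
        self.entries[request_uri] = (last_modified, body)
        location = headers.get("Location")
        if location is None or location == request_uri:
            return
        if self.policy == "replace":
            # Unsafe: trusts the server's claim about some other URI.
            self.entries[location] = (last_modified, body)
        else:
            # Safe: the worst outcome is that an entry gets bopped out.
            self.entries.pop(location, None)

    def get(self, uri, if_modified_since=None):
        """Serve a GET, optionally conditional on if_modified_since."""
        entry = self.entries.get(uri)
        # If the client's copy is newer than ours, our copy is out of date.
        client_is_newer = (
            entry is not None
            and if_modified_since is not None
            and entry[0] < if_modified_since
        )
        if entry is None or client_is_newer:
            entry = self.origin.get(uri)
            self.entries[uri] = entry
        last_modified, body = entry
        if if_modified_since is not None and last_modified <= if_modified_since:
            return 304, None
        return 200, body


def demo_security():
    """What is cached for the victim URI after a lying Location header?"""
    victim = "http://victim.example/page"
    results = {}
    for policy in ("replace", "invalidate"):
        cache = Cache(Origin(), policy)
        cache.entries[victim] = (1, "genuine")
        cache.store("http://attacker.example/x", {"Location": victim}, 2, "forged")
        entry = cache.entries.get(victim)
        results[policy] = entry[1] if entry else None
    return results


def demo_reliability():
    """A cache that missed an invalidation is refreshed by a GET-IMS."""
    uri = "http://www.example/doc"
    origin = Origin()
    origin.docs[uri] = (20, "new")
    cache = Cache(origin)
    cache.entries[uri] = (10, "old")                 # older, though not expired
    before = cache.get(uri)                          # plain GET sees the old copy
    conditional = cache.get(uri, if_modified_since=20)
    after = cache.get(uri)                           # cache now holds the new copy
    return before, conditional, after
```

Under the "replace" policy the forged body ends up filed under the
victim's URI; under "invalidate" the victim's entry is simply gone and
would be re-fetched on the next request.  In the second demo, the cache
only notices that its copy is old because the conditional request
carries a later date than its own entry, which is about as much as a
cache that never saw the Location header can hope for.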

--Shel
Received on Friday, 1 September 1995 12:41:35 UTC