- From: Roy T. Fielding <fielding@gbiv.com>
- Date: Sun, 5 Nov 2006 17:34:14 -0800
- To: Henrik Nordstrom <hno@squid-cache.org>
- Cc: HTTP Working Group <ietf-http-wg@w3.org>
It would make me feel better if proposals to change HTTP/1.1 were
based on hard facts and not random conjecture.

On Oct 22, 2006, at 5:22 AM, Henrik Nordstrom wrote:

> Sun 2006-10-22 at 04:35 +0100, Jamie Lokier wrote:
>
>>> I would say that if the value for "Vary" changes between two HTTP
>>> requests, the server implementation/configuration has somehow
>>> changed, and a proxy should invalidate all cached entries for
>>> that URI.
>>
>> No, no.
>
> Yes yes yes ;-)

No, and you can read the specification to learn why.  Vary is a
statement by the origin server about how intermediate caches should
behave with regard to *this* response.  It is impossible for the cache
to know anything about how the resource works or how responses to
similar requests in the future might actually vary -- it is only
responsible for obeying the origin server's wishes for *this*
response, and *this* response remains valid for as long as it remains
fresh.

It is not the cache's responsibility to ensure that the server
correctly implements Vary.  It is not the cache's responsibility to
exhaustively check every possible combination of client request
headers.  The Vary field states what the origin server cares about
for *this* response.  What it cares about for this response may be
entirely different from what it cares about for the next response --
some responses are more specific than others, and some responses are
more generic than others.

>> The natural implementation is for the server to note each time a
>> request header is examined to compute the response, and to emit
>> Vary with those headers.
>
> True, but this creates quite a bit of a nightmare at the cache
> level. So with this requirement Vary: will still become equal to
> "no-store" in most implementations, perhaps with hardcoded special
> cases for the most common uses or more likely caches trying to
> outguess the servers and implementing their own content negotiation
> schemes. This simply because general caching of Vary entities then
> becomes too complicated to even care trying to index the variants
> in the cache.

Nonsense.  It is a trivial linear algorithm of store and compare that
has been implemented correctly in every implementation of HTTP/1.1
that has actually attempted to implement it.  Even the lousy
Microsoft DLL that turns off caching when Vary is present is a
"correct implementation", even though it is absurdly inefficient.
Others have done better.

>> If you specify that a cache must purge all variants when receiving
>> a Vary header which is different from previously received Vary,
>> then servers will realistically have to send "Vary:
>> Accept-Encoding, User-Agent" even in the case that the response
>> _doesn't_ depend on User-Agent.
>
> Which is fine to me. Especially if the server supports ETag and
> If-None-Match on larger responses.

Flushing the cache is a correctness-preserving action for an HTTP
intermediary, regardless of the contents of Vary.  In other words,
you will be compliant with HTTP even if your caching sucks.  If you
implement Vary as it is specified in RFC 2616, your implementation
will both be correct and cache when appropriate.  I don't see a
problem here.

>> However, when the server's use of request headers is less tightly
>> coupled, it's _much_ harder to do that.
>
> True.
>
> So question then becomes multifold:
>
> 1) Is caching of Vary responses worth the effort to get it working
> properly?

Yes.

> 2) If caching of Vary is desirable, what component of the network
> should have to deal with the complexity involved?

There is no complexity involved.  An origin server makes its own
choices about what is important to Vary upon, and can set the header
field accordingly using any number of simple configuration
mechanisms.  A cache simply follows those instructions.

> What we see today is that neither component really cares. Most
> servers forget to send Vary headers when they should, instead using
> no-cache to solve the problem. And most caches see Vary as too
> complex and read it the same as no-store, or in some user-agent
> cases read Vary wrongly as "no-cache" (need to revalidate on every
> request) and additionally get the validation completely wrong.

That is conjecture.  Most caches implement Vary correctly or haven't
been updated to HTTP/1.1 yet.  The Microsoft client DLL implements
Vary in the least efficient way, but no sane protocol designer will
base an RFC on one of Microsoft's implementations when the rest of
the world has no problem dealing with that feature.  HTTP/1.1 offers
"no-cache" as an option that may be used for any number of reasons,
so its presence or absence has nothing whatsoever to do with Vary.

There is no issue here.  Vary works in many implementations and there
has never been a single report of interoperability problems between
clients and servers that have implemented Vary as specified.  It is
an integral part of HTTP/1.x caching that cannot be deprecated.

Cheers,

Roy T. Fielding                            <http://roy.gbiv.com/>
Chief Scientist, Day Software              <http://www.day.com/>
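
For readers following the thread, the "store and compare" handling of
Vary discussed in the message can be sketched roughly as below.  This
is an illustrative Python sketch under stated assumptions, not code
from any implementation mentioned above: the names VaryCache, Entry,
store, and lookup are hypothetical, freshness and validation are left
out, and header values are compared as plain stripped strings, which
is simpler than the matching rule RFC 2616 describes.

```python
from dataclasses import dataclass, field


def _lower_keys(headers):
    """Normalise header names to lower case and strip value whitespace."""
    return {name.lower(): value.strip() for name, value in headers.items()}


@dataclass
class Entry:
    vary: tuple        # field names listed in this response's Vary header
    selecting: dict    # values of those request headers when it was stored
    body: str


@dataclass
class VaryCache:
    # Hypothetical in-memory store: URI -> list of stored entries.
    entries: dict = field(default_factory=dict)

    def store(self, uri, request_headers, response_headers, body):
        req = _lower_keys(request_headers)
        resp = _lower_keys(response_headers)
        # Record only what *this* response says it varies on.
        vary = tuple(
            name.strip().lower()
            for name in resp.get("vary", "").split(",")
            if name.strip()
        )
        selecting = {name: req.get(name) for name in vary}
        bucket = self.entries.setdefault(uri, [])
        # Drop an older entry stored under the same Vary and same values.
        bucket[:] = [
            e for e in bucket
            if not (e.vary == vary and e.selecting == selecting)
        ]
        bucket.append(Entry(vary, selecting, body))

    def lookup(self, uri, request_headers):
        req = _lower_keys(request_headers)
        # Linear scan, newest first: compare the new request against the
        # headers each stored response declared in its own Vary field.
        for entry in reversed(self.entries.get(uri, [])):
            if "*" in entry.vary:
                continue  # "Vary: *" never matches a later request
            if all(req.get(n) == entry.selecting[n] for n in entry.vary):
                return entry.body
        return None  # no stored entry matches; forward to the origin
```

Each stored response is checked only against the headers named in its
own Vary field, so two responses for the same URI with different Vary
values can sit side by side without either one invalidating the other.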
Received on Monday, 6 November 2006 01:33:54 UTC