- From: Jeffrey Mogul <mogul@pa.dec.com>
- Date: Tue, 20 Feb 96 14:55:17 PST
- To: "Roy T. Fielding" <fielding@avron.ICS.UCI.EDU>
- Cc: http-caching@pa.dec.com
Regarding my summary of the Feb 2 1996 meeting, in which I wrote:

> Issue: transparency vs. performance
>
> Since there have been numerous discussions of whether semantic
> transparency or performance is the more important issue for HTTP
> caching, we tried to come to a consensus on what we believed about
> this.
>
> Here is a rough summary of our consensus:
>
>     Applications in which HTTP is used span a wide space
>     of interaction styles. For some of those applications,
>     the origin server needs to impose strict controls on
>     when and where values are cached, or else the application
>     simply fails to work properly. We referred to these
>     as the "corner cases". In (perhaps) most other cases,
>     on the other hand, caching does not interfere with the
>     application semantics. We call this the "common case".
>
>     Caching in HTTP should provide the best possible
>     performance in the common case, but the HTTP protocol MUST
>     entirely support the semantics of the corner cases, and in
>     particular an origin server MUST be able to defeat caching
>     in such a way that any attempt to override this decision
>     cannot be made without an explicit understanding that in
>     doing so the proxy or client is going to suffer from
>     incorrect behavior. In other words, if the origin server
>     says "do not cache" and you decide to cache anyway, you
>     have to do the equivalent of signing a waiver form.
>
>     We explicitly reject an approach in which the protocol
>     is designed to maximize performance for the common case
>     by making the corner cases fail to work correctly.

Roy writes:

    Let me again say that I adamantly oppose this decision. It doesn't
    reflect any of the applications that currently use HTTP, it is a
    mythical invention of the subgroup that such a thing is even
    desirable in all cases, and does a poor job of satisfying the
    user's needs. The reason that user agents are not always
    semantically transparent is because the user does not always want
    them to be semantically transparent.
    No matter what is in the protocol, no decision by the WG will ever
    change this fact of life. It is therefore WRONG to require in the
    protocol what cannot be achieved by any application -- all you are
    doing is requiring applications to be non-compliant.

    What you want is to enable the protocol to say "this is what you
    have to do to remain semantically transparent" and then require
    that applications default to semantic transparency mode. The
    former is what Cache-control does, and the latter can be added to
    the text. What we cannot do is control the user's application of
    HTTP technology; attempting to do so is foolish and contrary to
    the design of the Web. Requiring a visible/noticeable warning be
    presented when semantic transparency is disabled is reasonable,
    provided that it does not actively interfere with people's work.

I'm in a tricky position here, since I am both the moderator of this
subgroup (and hence nominally responsible for obtaining consensus),
and also the primary proponent of the position that Roy so adamantly
opposes. This is a no-win situation, because I've failed to change
Roy's mind, he has failed to change mine, and there are explicit
protocol specification decisions that apparently depend on resolving
this contradiction.

Therefore, this is something that we need to discuss at the IETF
meeting in Los Angeles (Larry, are you listening?).

Further, anyone who agrees with Roy on this issue ought to step up
NOW and support his position. So far, by not disagreeing with my
summary, the people who were at the meeting have implicitly approved
it. Much as I would hate to lose this argument, it would be even
worse if I won it because the rest of you were too terrified of
contradicting me. :-)

When Roy last raised this issue, I sent him a private response, which
I think is worth forwarding to the subgroup, and so it follows below.
-Jeff

---------------------------------------------------------------

> Jeff wrote:
> The proposed design uses opaque cache validators and
> explicit expiration values to allow the server to control
> the tradeoff between cache performance and staleness of the
> data presented to users. The server may choose to ensure
> that a user never unwittingly sees stale data, or to
> minimize network traffic, or to compromise between these
> two extremes. The proposed design also allows the server
> to control whether a client sees stale data after another
> client performs an update.

Roy wrote:

    This is an incorrect design for HTTP caching. The cache does not
    exist on behalf of the origin server, and therefore any
    requirements placed by the origin server will always be secondary
    to those of the user.

> Jeff wrote:
> Server-based control is also important because HTTP may be used for a
> wide variety of ``applications.'' The design of a Web application
> (for example, a stock-trading system) may be peculiar to the server,
> while Web browsers are generic to all Web applications. Because the
> precise behavior of an application cannot be known to the implementor
> of a browser, but can be controlled by the implementor of a server,
> servers need to have the option of direct control over the caching
> mechanism. Because the world is not perfect, we also need to give
> users and browsers some control over caching, but this is at best a
> contingency plan.

Roy wrote:

    This is an incorrect assumption. The server is not capable of
    knowing the needs of the user, and it is the needs of the user
    that take precedence in the design of the WWW -- any other
    ordering results in systems that purposely defy the design in
    order to satisfy the user's needs. Therefore, the caching model
    MUST be defined according to the user's needs and only allow the
    server to provide input into the decisions made to satisfy those
    needs. This allows the user to decide what is and is not correct
    behavior.

This is the main conceptual disagreement between us, and a number of
your other complaints derive from this.

I'll start by pointing out that you are putting words into my mouth
that I never wrote: of course the cache does not exist "on behalf of"
the origin server, nor does it necessarily exist on behalf of the
ultimate user. Caches exist to improve performance, and it's not a
zero-sum game. Users, servers, and intermediaries (such as Netcom or
similar) can all benefit from caching, if it is done right.

However (and this is the point where you are manifestly wrong),
caches do not exist independently of semantics. Otherwise, I could
write a cache that returns, say, a Dilbert cartoon, no matter what
URL was requested. That's obviously an extreme breakdown in
semantics, but to say that the "user's needs" define the semantics of
an HTTP interaction is so ill-defined as to be entirely useless.

Users DO have needs for things such as performance, availability,
clarity of the UI, etc. But these are entirely orthogonal to whether
the semantics of a request-response interaction are those intended by
the origin server or not. In this respect, the user's primary "need"
is that when he or she makes a request, the response has some
semantically appropriate meaning.

If the Web were simply composed of static (or slowly changing)
documents, then the semantics of HTTP interactions would be trivial
and one could easily let the user decide exactly what to do. But
this is manifestly not the only thing the Web is used for, and
probably no longer even the most prevalent. At the meeting on
Feb. 2, for example, Shel Kaphan made it quite clear that the worst
problem he faced in implementing his book-ordering service was the
plethora of user-agent and cache implementations that blithely
assumed they could decide when and when not to use a cached copy of
some response.
Simply put, the origin server MUST be able to control the semantics
that the user sees, or else many obviously useful services cannot be
implemented. What service authors are doing today is to go through
extensive contortions to defeat caching, since they cannot trust the
caches to get the semantics right. Only if we fix the HTTP protocol
to give the origin servers the necessary level of control are we
going to be able to get the full benefits of caching.

As far as I can tell, everyone at the meeting understood and agreed
on this point, and I have no evidence that anyone else in the caching
subgroup disagrees. [Note added Feb. 20: not including Roy.]

Of course, in a real-world system we cannot insist on full semantic
transparency 100% of the time. So who gets to control what happens?
Larry Masinter (in a private message to me) phrased the question as

    When Superman meets his evil twin, who can win, since they're
    both Equally Strong? When the Unstoppable Force meets the
    Immovable Object, who will win?

These philosophical questions are pretty hard to answer in the
abstract. The only resolution of this question is to sidestep it,
and recognize that neither side can "win" at the expense of the
other. Rather, the HTTP protocol should ensure that neither side
loses when it comes down to preserving semantics. In other words, a
cache does not relax the requirement of semantic transparency unless
BOTH the origin server and the user agree to it.

But because the ultimate semantics derive from the origin server, and
not from the browser, the situation cannot be symmetrical. Only the
origin server knows where "transparency" actually begins and ends,
and so the origin server can be allowed to specify the "freshness
lifetime" without input from the user.

In other words, you have it 100% backwards when you say

    [The] caching model MUST be defined according to the user's needs
    and only allow the server to provide input into the decisions
    made to satisfy those needs.
    This allows the user to decide what is and is not correct
    behavior.

Rather, the caching model MUST be defined according to the semantics
of the service, and only allow the user to provide input about how
far to relax those semantics. This allows the origin server to
decide what is and is not correct behavior.

I defy you to explain, for example, how Shel Kaphan can make his
book-ordering server work in your user-wins model.
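
The asymmetric rule argued for above can be sketched in a few lines of Python. This is only an illustration, not proposed protocol text: the names `ServerPolicy`, `UserPolicy`, `max_staleness`, and `may_serve_from_cache` are hypothetical and do not correspond to any HTTP header or existing library. The sketch assumes the origin server alone sets the freshness lifetime, and that a stale response is served only when both the server and the user have agreed to relax transparency.

```python
from dataclasses import dataclass


@dataclass
class ServerPolicy:
    # Hypothetical: seconds a response stays fresh, set by the origin server alone.
    freshness_lifetime: float
    # Hypothetical: extra staleness the server tolerates; 0 means "never relax".
    max_staleness: float = 0.0


@dataclass
class UserPolicy:
    # Hypothetical: extra staleness the user is willing to accept.
    max_staleness: float = 0.0


def may_serve_from_cache(age: float, server: ServerPolicy, user: UserPolicy) -> bool:
    """Return True if a cached response of the given age may be used without revalidation."""
    # Within the server-defined freshness lifetime the cache is transparent;
    # the user has no say here.
    if age <= server.freshness_lifetime:
        return True
    # Past the lifetime, transparency is relaxed only as far as BOTH sides allow,
    # so the tighter of the two limits wins.
    allowed = min(server.max_staleness, user.max_staleness)
    return age - server.freshness_lifetime <= allowed
```

Under these assumptions a strict service (for example, one with `ServerPolicy(60)`) is never served stale, whatever the user requests, while `ServerPolicy(60, 120)` combined with `UserPolicy(45)` permits at most 45 seconds of staleness.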
Received on Tuesday, 20 February 1996 23:38:18 UTC