- From: Koen Holtman <koen@win.tue.nl>
- Date: Sat, 19 Aug 1995 17:58:57 +0200 (MET DST)
- To: Roy Fielding <fielding@beach.w3.org>
- Cc: koen@win.tue.nl, http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
Roy Fielding: >[Koen Holtman:] >>Such a `tuned' cache would introduce incorrectness > >No, it *might* introduce incorrectness. I claim that over 99% of >today's non-cacheable pages are still "correct" after being cached, >where correctness is defined as containing the same substantive and >qualitative information content as would be obtained directly from >the origin. Still "correct" after being cached for how long? 30 seconds? 5 minutes? For the duration of the user agent session? 1 day? 30 days? 1 year? If you say 30 seconds, I would agree with your 99% estimate (but only if your `pages' do not include responses to POST requests). For 30 days, my guess is about 50%. But anyway, saying that 99% of the pages will remain correct even if caches are `tuned' is besides the point: it is the possibility of getting incorrect pages that counts: if the web is an unreliable communications medium, this rather restricts the kind of content it can carry. >That is why a cache administrator is willing to "tune" the cache, >and it is not something that can be fixed within the protocol. Be fixed, no. Be taken into consideration, yes. >Caches will do what providers want them to do when providers stop >marking cacheable pages as non-cacheable. Providers are not one group. There are at least two groups: - Static content providers, serving pages that can always be cached. - Dynamic content providers, serving pages that cannot be cached (or cannot be cached for more than N minutes/hours/days). Static providers can afford to mark their cacheable pages as non-cacheable for frivolous reasons: if this drives some proxy administrators to `tuning', they only feel a slight dip in hitcounts, nothing more. What have they got to loose? The web currently has almost no _dynamic_ providers, because they need reliable caches (or at least a way to detect unreliable caches), and they cannot get reliable caches. A dynamic content provider cannot put a `tornado warning, updated every 10 minutes' page on the web if it cannot be guaranteed that this page will never be cached for longer than 10 minutes. Bad things will happen to unsuspecting users otherwise. A web-shop cannot put a `shopping cart' application on the web if the (dynamic) shopping cart contents may be inappropriately cached. A customer pressing `buy' on an outdated shopping basket page may sue the web-shop after receiving the wrong products. A `war' between proxy administrators and static content providers will prevent dynamic content providers from using the web (unless they have huge legal departments). Thus, such a war prevents the web from growing beyond the static, read-only medium it is now. >>Pragma: min-age=<delta-seconds> > >I am opposed to this change. First, any sensible cache administrator >would never send such a header, since it effectively changes the request >profile and therby lowers the quality of the response. Sending the header would allow dynamic content providers to start using the web. This will greatly increase the quality (or at least the diversity) of web content. A `tornado warning, updated every 100 minutes' response, put out by a tornado warning server after getting a `Pragma: min-age=6000' header in the request, has a much higher quality than a 90-minutes old `tornado warning, updated every 10 minutes' response from a `tuned' proxy cache. You call it `sensible' for a cache administrator managing a `tuned' cache not to tellthe world it being tuned. I strongly disagree with this. Not all actions by cache administrators can be morally justified by the existence of selfish server administrators. Undetectable `tuning' is a very selfish thing to do. If 10% of cache administrators do undetectable tuning, they will keep dynamic service providers off the web, and spoil the fun for all people behind the 90% of working caches as well. This is why I propose pragma: min-age. If 10% of cache administrators insist on fighting a spec non-conformance war with static service providers, at least they should ensure that the other 90%, who put up with more IP traffic in the hope of getting dynamic web services someday, do not become victims. A protocol extension cannot make the 10% of tuned caches go away, but it can reduce their harmful effects. >Second, the cache is tuned based on each individual response, not based >on every request. If the response looks like it shouldn't be cached, >a tuned cache will not cache it. I think that a Pragma: min-age request header is an adequate solution to tuned cache detection, because I don't expect tuned caches to become that common. A much more refined mechanism is not needed in my opinion. A Pragma: pragma-no-cache-coming-from-selfish-provider-ignored response header, generated by proxies, looks more refined, but has a big disadvantage: it may cause the `warning: this page may be outdated' messages by browsers to become so common that users simply always ignore them, even if they are accessing a page that really is autdated. This will leave dynamic service providers with the same problem they have now. > If it says "no-cache" and is coming >from a reputable provider, a tuned cache will not cache it. If it >comes from a notoriously wasteful provider, the cache will cache it >and there is no incentive whatsoever for the cache administrator to >warn the wasteful provider ahead of time. Technology to distinguish between `reputable' and `notoriously wasteful' providers is not available now, and I don't have very high hopes of it being feasible, available, and used anytime soon. If proxy administrators go through the trouble at all of making any distinction between reputable and wasteful providers, I fear that the default will be to assume that a provider is wasteful. If (new) dynamic service providers cannot count on getting a warning from tuned caches, they will simply stay off the web. Also, if a known wasteful provider is not warned, how is the provider to know cache administrators disapprove of him? If anything, wasteful providers need to get as many warnings (=complaints) as possible. A wasteful provider responding to these warnings by generating pages with `one-time-urls' or other cache busters would immediately show up on the `wastefulness detector' you assume present: a proxy administrator could then decide to stop talking to the provider altogether. _________ Jeffrey Mogul: >I agree with Koen that some proxy implementors (or managers) may want >to violate the protocol spec, for reasons of their own. Just for the record, Roy seems to be much move forgiving of such violations than I am. > But this >is outside the realm of the HTTP spec, and I agree with Roy that >the spec should not be twisted to support it. This problem is outside the realm of the current draft HTTP spec (in that a spec can never contain a mechanism to support spec violations), but not outside the realm of HTTP spec _development_. If the goal of HTTP spec development is to allow for the growth of the web, the next spec should support the needs of dynamic service providers. Dynamic service providers need Pragma: min-age, or some equivalent warning mechanism, so it should be in the next spec. Putting Pragma: min-age in the next spec will, if it is done right, not make the spec self-contradictory. Like the Accept-Encoding header, Pragma min-age would just be a mechanism for a client to express that it only supports a subset of the defined potential http functionality. Koen.
Received on Saturday, 19 August 1995 09:01:19 UTC