Re: On transparency from Koen Holtman on 1996-02-24 (http-caching-historical@w3.org from February 1996)

From: Koen Holtman <koen@win.tue.nl>
Date: Sat, 24 Feb 1996 16:54:55 +0100 (MET)
To: fielding@avron.ICS.UCI.EDU (Roy T. Fielding)
Cc: mogul@pa.dec.com, http-caching@pa.dec.com
Message-Id: <199602241554.PAA00202@wswiop05.win.tue.nl>

Roy T. Fielding:
>
>You do see, on EVERY caching browser known to me, the options
>to "check once per session" and "never check".

Data point: Mozilla's "check once per session" option does not do what
it says it does: if the response is expired, Mozilla _will_ do a
conditional GET if the user clicks the link again.  So the "check once
per session" option only influences the heuristics if Expires is absent.
(This is for "Mozilla/1.12 (X11; I; Linux 1.2.9 i486)"; I have not
checked other versions.)

The "never check" option does indeed do just that, even if the
response is expired.
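
A minimal sketch of the two observed behaviors, written as a decision
function.  The function name, the mode strings, and the parameters are
hypothetical and not taken from any browser.  The branch for an Expires
value that has not yet passed is an assumption; only the expired and
Expires-absent cases were observed above.

```python
from typing import Optional


def should_revalidate(mode: str,
                      expires: Optional[float],
                      now: float,
                      checked_this_session: bool) -> bool:
    """Decide whether a cached response gets a conditional GET.

    mode: "never" or "once-per-session" (hypothetical labels for the
          two browser options discussed above).
    expires: Expires time as seconds since the epoch, or None if the
             response carried no Expires header.
    """
    if mode not in ("never", "once-per-session"):
        raise ValueError("unknown mode: %r" % mode)

    # "never check" really never checks, even for expired responses.
    if mode == "never":
        return False

    # With an Expires header, the option is overridden: an expired
    # response is revalidated on every click.  Treating an unexpired
    # response as usable without a check is an assumption.
    if expires is not None:
        return now >= expires

    # Without Expires, the option drives the heuristic: check only on
    # the first use in a session.
    return not checked_this_session
```

For example, should_revalidate("once-per-session", 50.0, 100.0, True)
is true even though the response was already checked in this session,
which is the mismatch with the option's name noted above.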

>Your summary of the
>meeting consensus would make both those options non-compliant with HTTP/1.1.

Only "never check" would be non-compliant.  And the state management
subgroup _needs_ it to be non-compliant with HTTP/1.1, because you
can't reliably use cookies to build stateful services under the `never
check' behavior.

Actually, you _can_ reliably use cookies under the `never check'
behavior, by using various `cache busting' techniques that have a
disastrous impact on the efficiency of all caches.  Anyway, I've said
this many times before.

As someone who implements stateful services, I could live with `never
check' being allowed under HTTP/1.1, _but only_ if the user agent
signals this to my server, so that I can give the user an appropriate
warning message:

 `Please enable document verification in your browser caching setup,
  as the correct working of this service depends on it'.

>[Jeffrey Mogul:]
>> If the Web were simply composed of static (or slowly changing) documents,
>> then the semantics of HTTP interactions would be trivial and one could
>> easily let the user decide exactly what to do.  But this is manifestly
>> not the only thing the Web is used for, and probably no longer even the
>> most prevalent.  At the meeting on Feb. 2, for example, Shel Kaphan made
>> it quite clear that the worst problem he faced in implementing his
>> book-ordering service was the plethora of user-agent and cache
>> implementations that blithely assumed they could decide when and
>> when not to use a cached copy of some response.
>
>Shel is wrong.  The problem is in his own application assuming that
>the user will always want a stateful session.

His application can assume that the user wants a stateful session,
because his application is about filling a shopping basket with books,
which is inherently stateful.

>  There is nothing that the
>working group can do to change the needs of people who use HTTP, and one
>of those needs is to cache responses regardless of the nature of those
>responses.

Roy, _you_ may be able to know what can go wrong with stateful
services when you let your browser or proxy ignore Cache-Control
directives under 1.1, and _you_ may be able to always resolve the
situation with judicious use of the reload button, but you can't
expect that the general public will be able to.

Authors of stateful services have the need to protect users who use
software with weakened caching restrictions from shooting themselves
in the foot.  For commercial stateful services, both the law and the
marketplace demand that this protection be provided by service
authors.  You may not need this protection, but you are the exception
rather than the rule.

If HTTP/1.1 won't address this need of commercial stateful service
authors, HTTP/1.1 will not be viable as a protocol for online
commerce, and we will probably end up with Microsoft dominating this
area.

>The reason why this is a problem today is because caches have to guess
>about what the origin server wants.

Wrong.  The cause of Shel's worst problem is that in today's web, the
caching restrictions used depend entirely on the context of the user,
without there being an easy way for service authors to determine what
the context of the user is.

>  The Cache-control header field
>removes the need to guess, and thus will result in more reliable behavior
>under normal circumstances.

It is not only the cache that needs to guess: the server has a far
greater need to guess correctly what the cache will do, because the
service provider runs the risk of being sued by customers if the guess
is wrong.

OK, now for a constructive proposal.

If you insist that HTTP/1.1 must make it legal for user agents and
proxy caches to ignore a Cache-Control response header, then I insist
that user agents and proxies always warn origin servers about doing so
by including a particular header in every request:

----snip----

xx.yy Cache-Warning

  The Cache-Warning request header must be added to requests by user
  agents or proxies that are configured to possibly ignore the
  contents of Expires and Cache-Control response headers received.
  This header can inform origin servers of caching restrictions along
  the response chain that would prevent a certain service from being
  provided in a reliable way.

      Cache-Warning = "Cache-Warning" ":" 1#warning-directive

      warning-directive =   "may-cache" [ by-proxy ]
                          | "min-age" "=" delta-seconds [ by-proxy ]
                          | "may-not-refresh" [ by-proxy ]

      by-proxy = "by-proxy" URI [ "(" product ")" ]

  The "may-cache" directive signifies that a client in the response
  chain may ignore a "no-cache" Cache-Control directive received in
  the response.  The "min-age" directive signifies that a client in
  the response chain may assign the given minimal lifetime to the
  response, if this response is assigned a lower lifetime by any
  "max-age" directive in a Cache-Control header or an Expires header.
  The "may-not-refresh" directive indicates that the client may not
  refresh some responses from resources on the origin server that
  become stale in future, but return the stale responses instead.  A
  by-proxy part must be included if the client is a proxy.  Examples
  are

    Cache-Warning: may-cache, may-not-refresh

  which can indicate that the user agent is in a mode in which the
  showing of old responses is always preferred over the generation of
  network traffic, and

    Cache-Warning: min-age=60 by-proxy http://info.cern.ch:8000/

  which may indicate that this proxy is configured to cache the
  response for a minimum time of 1 minute.

----snip----
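
To show how an origin server might consume the proposed header, here
is a rough sketch of a parser for the grammar above.  Everything in it
is illustrative: the names WarningDirective, parse_cache_warning and
needs_verification_notice are made up, the URI and product are matched
loosely rather than by their full grammars, and the sketch assumes that
the by-proxy URI contains no commas.  Which directives should trigger
the user notice is also an assumption.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# One warning-directive: a directive name, an optional "=seconds"
# part, and an optional by-proxy part with an optional "(product)".
_DIRECTIVE = re.compile(
    r"(?P<name>may-cache|may-not-refresh|min-age)"
    r"(?:\s*=\s*(?P<seconds>\d+))?"
    r"(?:\s+by-proxy\s+(?P<uri>[^\s(]+)(?:\s*\((?P<product>[^)]*)\))?)?",
    re.IGNORECASE,
)


@dataclass
class WarningDirective:
    name: str
    seconds: Optional[int] = None
    proxy_uri: Optional[str] = None
    product: Optional[str] = None


def parse_cache_warning(value: str) -> List[WarningDirective]:
    """Parse the value of a Cache-Warning request header."""
    directives = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        match = _DIRECTIVE.fullmatch(item)
        if match is None:
            raise ValueError("bad warning-directive: %r" % item)
        name = match.group("name").lower()
        seconds = match.group("seconds")
        # Only min-age takes delta-seconds, and it requires them.
        if (name == "min-age") != (seconds is not None):
            raise ValueError("bad warning-directive: %r" % item)
        directives.append(WarningDirective(
            name=name,
            seconds=int(seconds) if seconds is not None else None,
            proxy_uri=match.group("uri"),
            product=match.group("product"),
        ))
    if not directives:
        raise ValueError("Cache-Warning needs at least one directive")
    return directives


def needs_verification_notice(directives: List[WarningDirective]) -> bool:
    """Assumed policy: warn the user if no-cache may be ignored or
    stale responses may be shown without a refresh."""
    return any(d.name in ("may-cache", "may-not-refresh")
               for d in directives)
```

A stateful service could call needs_verification_notice() on each
request and, when it returns true, send the `Please enable document
verification' message instead of the normal page.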

> ...Roy T. Fielding

Koen.
Received on Saturday, 24 February 1996 16:21:46 UTC