Re: HTTP Caching Design

Koen mentioned:

> ...  I would like to propose a rule in 1.1 that a
> response without a valid Expires/Fresh-until/Cache-control/whatever
> header may not be served unvalidated from a cache if more than 7 days
> have passed since the previous validation.

I have no problem with including a set of heuristics in the spec for
normal cache behavior, with some maximum criteria as well.  However,
it should also take into consideration things like user-defined requirements
for cached responses (such as the "never check" mode of Netscape).
Jeff's draft fails to consider these because the definition of
"valid" does not consider what the user and provider want the cache to do.

> - (2.11) If the warnings are in a response header, they should not use
>   numbers from the Status-Codes space.

Agreed -- it would lead to confusion over whether the warning should be
in a header or as a separate message.

> ...
>   draft caching features. (Some examples of unplugged holes:
>   description assumes global clock,

That is because the lack of a global clock has no significant effect on
the algorithm.
What the description lacks is the explanation of why that is true and
the set of requirements to make it obvious what the correct behavior
would be.  What is missing from the spec is:

    Date is required on all HTTP/1.1 responses, and represents the time
    at which the origin server generated the message (or the last refresh
    of that message as received from the origin server in a 304 response).

    Max-age is measured relative to Date *and* the cache's internal
    clock at time of message receipt -- the cache should use whichever
    time is earlier.

    For the purpose of freshness, Expires is just max-age = (Expires - Date)
    if no max-age is given.

Let's consider the options:

 a) OS-clock = Cache-clock
      ---> no problem

 b) OS-clock > Cache-clock
      ---> OS-clock + max-age > Cache-clock + max-age
      ---> cache will consider it stale max-age seconds after receipt

 c) OS-clock < Cache-clock
      ---> OS-clock + max-age < Cache-clock + max-age
      ---> cache will consider it stale (max-age - (Cache-clock - OS-clock))
           seconds after receipt
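
To make the three cases concrete, here is a rough sketch of the rule above.
It is only an illustration, not text from the draft: the function names
(freshness_lifetime, stale_at, seconds_fresh_after_receipt) are made up for
this example, and it assumes the cache records its own clock reading when
the message arrives.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def freshness_lifetime(date: datetime,
                       max_age: Optional[int] = None,
                       expires: Optional[datetime] = None) -> timedelta:
    """Use max-age if given; otherwise fall back to (Expires - Date)."""
    if max_age is not None:
        return timedelta(seconds=max_age)
    if expires is not None:
        return expires - date
    # Heuristics for responses with no explicit information are out of scope.
    raise ValueError("no explicit freshness information")


def stale_at(date: datetime, received_at: datetime,
             lifetime: timedelta) -> datetime:
    """Measure the lifetime from whichever of Date and receipt time is earlier."""
    return min(date, received_at) + lifetime


def seconds_fresh_after_receipt(date: datetime, received_at: datetime,
                                lifetime: timedelta) -> float:
    """How long after receipt (by the cache's clock) the response stays fresh."""
    return (stale_at(date, received_at, lifetime) - received_at).total_seconds()


if __name__ == "__main__":
    received = datetime(1996, 1, 10, 15, 0, 0, tzinfo=timezone.utc)
    lifetime = freshness_lifetime(received, max_age=3600)
    # Origin-server clock equal to, ahead of, and behind the cache clock.
    for label, skew in (("a", 0), ("b", 60), ("c", -60)):
        date = received + timedelta(seconds=skew)
        print(label, seconds_fresh_after_receipt(date, received, lifetime))
```

With max-age = 3600 and a 60-second skew, cases (a) and (b) both give 3600
seconds of freshness after receipt, and case (c) gives 3540, which is the
fail-safe direction.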

Although (b) allows for the possibility that a cascade of more than two
caches would introduce some additional time gap, I don't think that this can
be avoided using any scheme.  Similarly, (c) will cause poorly clocked systems
to prematurely age a response, but that is a fail-safe condition and can't
be avoided.  Finally, since nothing but the User-Agent should ever have a
clock skewed by more than 10 seconds while connected to the Internet, the
possibility of cache-skew problems is limited.

>   cascaded proxy caches and max-age,

Not a problem, for the same reason.  Date is never changed by proxies.

>   failure to be more explicit about (the allowedness of) caching
>   heuristics when cache-related response headers are absent.)

Hmmm, yes, that is the result of never writing the section on Caching.

>>  I have also
>>yet to see an example of cache usage/requirements that has not already
>>been met by the HTTP/1.1 design.
> 
> I have an example. Currently, if servers use authentication, responses
> can never be cached.  See Section 10.6 in the 1.1 draft:
> 
>  # Responses to requests containing an Authorization field are not 
>  # cachable. 
> 
> There exists a great number of servers that use authentication just to
> recognize different users, not to protect content from being accessed
> by everybody.  www.wired.com and www.dds.nl are two examples I know
> of.  Almost all of the responses from these servers are not secret,
> and highly cachable.  Yet, there is no easy way for these servers to
> get the cachable responses cached.  I believe that wired.com uses a
> second server for the inline pictures, just to get around the
> restriction in Section 10.6.

Yes, that is a conflict with the HTTP/1.1 draft 00 (there are probably
others, since it was not finished in any sense).  I have no problem with
changing the above restriction appropriately -- same goes for the default
requirement not to cache other methods.

> I therefore propose (again, I also did this somewhere in the summer) a
> 
>  Cache-control: public
> 
> response header that could be used to override the restriction in
> Section 10.6.

Why not just use the existing options:

   Cache-control: cachable
or
   Cache-control: private

Hmmm, come to think of it, "public" would be better than "cachable"
in any case.
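
For illustration only, here is one way a shared cache might apply such an
override.  This is a sketch of the proposal as I read it, not wording from
the draft: the function names are hypothetical, "private" is assumed to mean
that a shared cache must not store the response, and every other
Cache-control directive is ignored.

```python
from typing import Mapping, Set


def cache_control_directives(headers: Mapping[str, str]) -> Set[str]:
    """Return the lower-cased directive names from a Cache-control header."""
    # Header names are case-insensitive, so normalize the keys first.
    normalized = {name.lower(): value for name, value in headers.items()}
    value = normalized.get("cache-control", "")
    # Keep only the directive name, dropping any "=value" part.
    return {part.split("=", 1)[0].strip().lower()
            for part in value.split(",") if part.strip()}


def may_store_in_shared_cache(request_headers: Mapping[str, str],
                              response_headers: Mapping[str, str]) -> bool:
    """Apply the Section 10.6 restriction with a "public" override (assumed)."""
    directives = cache_control_directives(response_headers)
    if "private" in directives:
        return False
    has_authorization = any(name.lower() == "authorization"
                            for name in request_headers)
    if has_authorization:
        # Responses to authorized requests are stored only if marked public.
        return "public" in directives
    return True
```

A site like the ones Koen describes could then mark its non-secret responses
with "Cache-control: public" and leave everything else under the default.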

......Roy

Received on Wednesday, 10 January 1996 15:17:34 UTC