Re: Confusion about Age: accuracy vs. safety

> Roy and I have disagreed in public for some time about what is "safe".
> By now, it's clear that we will never agree.  And we both believe that
> the other is using a completely bogus definition.
> 
> My definition "safe", as it applies to caching, is that there must
> never be situation where a stale result is presented to a user as
> if it were fresh.
> 
> Roy's definition of "safe" appears to be "don't waste any opportunities
> for caching, because this could cause additional network traffic."

Nonsense -- that is a foolish trivialization of my position.  The term
"safe", as it applies to software, is fully defined in the paper

        Nancy G. Leveson, "Software Safety: Why, What, and How",
        ACM Computing Surveys, 18(2):125-163, June 1986.

and a host of other Software Engineering texts.  It is hardly my fault
that you are focusing on one potential hazard to the exclusion of all
others, and thus your use of the term "safe" to describe your
interpretation of Age is just plain wrong.

Trying to be "safe" is an attempt to minimize the potential for harm given
the presence of any number of hazards.  Although in extremely rare cases
it might be unsafe to view a stale document (the hazard being that the
document on the origin server has changed and the user will be harmed
by not seeing the difference between the two), it is almost always unsafe
to disable caching on a congested network (the hazards being that users
who pay for bandwidth usage will be charged extra, and that when many
simultaneous requests are made on a line with limited bandwidth, the
connection start-up overhead dominates the link and prevents the
requests from ever completing).

> I'd question his use of the term "catastrophic network failure",
> especially since we've seen ample evidence on this mailing list during
> the past week or so that most HTTP caches would be hard-pressed to
> get hit rates above 40% to 50%, even if they were to cache EVERYTHING
> that didn't have a question mark or "cgi-bin" in the URL.

Oh please, caching is most effective during peak periods, and it is only
during peak periods that bandwidth is a concern.  Even at 30% hit rates,
caching can make the difference between a working network and one that
is clogged with connection requests.  This isn't just theory -- we have
seen it on the UK link!  HTTP/1.1 was designed in anticipation that such
situations would become more frequent in the future, which is why I spent
so much time explaining hierarchical cache techniques to people who think
every HTTP connection goes straight to the origin.  In any case, judging
the effectiveness of HTTP/1.1 caching by the limitations of HTTP/1.0
caches tells us very little.

> Since I'm not going to change Roy's mind, I'll limit myself to
> correcting the factual mistakes in his message, and I won't
> continue to rehash the "safety" argument.
> 
>     But you are completely overlooking what happens if ANY of the
>     intermediaries has a system clock which is out-of-sync with the
>     origin.  If ANY of them do have a bad clock, they will pass the
>     additional bogus age on to every single request that passes through
>     the proxy.
> 
> The algorithm in section 13.2.3 is specifically designed to handle
> the "one bad clock" (or even "N bad clocks") case.  It does this
> using the following two steps:
>       apparent_age = max(0, response_time - date_value);
>       corrected_received_age = max(apparent_age, age_value);
> (see the draft for the definitions of the variables).  Because
> of the two max() operations, this will NEVER result in underestimating
> the Age, as long as at least one of the clocks in the chain is correct.

The change that I suggested would also prevent underestimating
the age provided that either

   a) The content did not come from an HTTP/1.0 proxy, or
   b) The user agent and origin server have good clocks.

Again, the result will always be accurate if the proxies are HTTP/1.1.
Since proxies will be the first to change to HTTP/1.1, my interpretation
of the Age algorithm will eventually be completely safe.  In contrast,
your interpretation will always be unsafe, even when HTTP/1.0 is gone.
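
To make the arithmetic concrete, here is a rough C sketch of the two
quoted steps chained across two hops (my own illustration, not text from
the draft; the clock offsets are invented for the example).  It shows how
the two max() operations only ever raise the estimate: the response is
really zero seconds old, yet a single fast clock at the proxy makes every
later recipient see an Age of 600 seconds, and a correct clock downstream
cannot bring it back down.

    /* Sketch only: the two quoted steps from 13.2.3, applied at two
     * successive recipients.  Variable names follow the draft; the
     * clock skews are invented. */
    #include <stdio.h>

    static long lmax(long a, long b) { return a > b ? a : b; }

    /* What a recipient computes on arrival:
     *   date_value    -- the origin's Date header (origin clock)
     *   response_time -- arrival time (the recipient's own clock)
     *   age_value     -- the Age header received from upstream, if any */
    static long corrected_received_age(long date_value,
                                       long response_time,
                                       long age_value)
    {
        long apparent_age = lmax(0, response_time - date_value);
        return lmax(apparent_age, age_value);
    }

    int main(void)
    {
        long date_value = 1000;  /* origin stamps Date: at t=1000, its clock */

        /* Hop 1: a proxy whose clock runs 600 seconds fast receives the
         * response immediately, so the response is really 0 seconds old. */
        long age_at_proxy = corrected_received_age(date_value, 1600, 0);

        /* Hop 2: a user agent with a correct clock receives it right away,
         * but max() keeps it from lowering the value the proxy supplied.  */
        long age_at_ua = corrected_received_age(date_value, 1000, age_at_proxy);

        printf("Age after proxy: %ld, after user agent: %ld\n",
               age_at_proxy, age_at_ua);  /* prints 600 and 600 */
        return 0;
    }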

> Yes, it may overestimate the Age value, if one or more of the clocks is
> badly out of sync.  If you agree with Roy that overestimation is evil,
> then I suppose the word "bogus" is appropriate here.  Otherwise, it's
> exactly what the algorithm is intended to do.

No, the algorithm is intended to provide an accurate lower bound on the
age even in the presence of bad clocks.  Forcing all caches to add Age
to an entity that they have not aged themselves does make the algorithm
dependent on EVERY SINGLE CLOCK on the response path.  That breaks Age
and introduces a security hole in any large network that depends on
caching to reduce its bandwidth requirements.
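
To put invented numbers on that: a proxy whose clock runs just ten
minutes fast computes an apparent_age of 600 seconds for a response that
is really brand new, and under that interpretation every response it
relays carries at least 600 seconds of bogus Age; anything with a
freshness lifetime under ten minutes then arrives downstream already
stale.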

 ...Roy T. Fielding
    Department of Information & Computer Science    (fielding@ics.uci.edu)
    University of California, Irvine, CA 92697-3425    fax:+1(714)824-4056
    http://www.ics.uci.edu/~fielding/

Received on Thursday, 29 August 1996 15:52:09 UTC