v11-03 COMMENT: 16.2.4 Age Calculations

I have several problems with section 16.2.4.

Problem 1: errors in the formulae
----------

In the summary of the calculations:

      apparent_age = max(0, now - date_value);
                            ^^^
      corrected_received_age = max(apparent_age, age_value);
      response_delay = now - request_time;
                       ^^^
      corrected_initial_age = corrected_received_age + response_delay;
      resident_time = now - response_time;
                      ^^^
      current_age   = corrected_initial_age + resident_time;

the first two `now' values denote the time of receipt of the response
by the proxy.  The last `now' denotes the time at which the response
is served by the proxy from its own cache memory.

A fix is to replace the first two `now's by `response_time'.
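
A minimal sketch of the summary with that fix applied, in Python
purely for illustration.  The function name and the convention that
every argument is a plain number of seconds are assumptions, not
something the draft specifies:

```python
def current_age(age_value, date_value, request_time, response_time, now):
    """Age of a cached response at the moment it is served from cache.

    request_time, response_time and now are read from the cache's own
    clock; date_value is the Date header converted to seconds, and
    age_value is the Age header (0 if absent).
    """
    # The first two terms use the time of receipt, not the time of serving.
    apparent_age = max(0, response_time - date_value)
    corrected_received_age = max(apparent_age, age_value)
    response_delay = response_time - request_time
    corrected_initial_age = corrected_received_age + response_delay
    # Only the residence time depends on when the entry is served.
    resident_time = now - response_time
    return corrected_initial_age + resident_time
```

With request_time = 1002, response_time = 1005, date_value = 1000,
age_value = 10 and now = 1065, the received age is 10, the response
delay is 3 and the resident time is 60.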

Problem 2: knowing when a response is firsthand
----------

The note at the end of the section reads:

  Note that a client can usually tell if a response is firsthand by
  comparing the Date to its local request-time, and hoping that the
  clocks are not badly skewed.

This is misleading.  There is a much easier way to know this, though
it only works if there is a 1.1 cache in the chain:

  Note that a client can tell if a response is firsthand by
  checking if it contains an Age header.  Firsthand responses never
  contain an Age header.
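
As an illustration, such a check could be sketched as follows.  The
function name and the representation of the headers as a mapping from
header names to values are assumptions made for this example, and the
test is subject to the same caveat about 1.1 caches in the chain:

```python
def is_firsthand(headers):
    """Return True if the response carries no Age header.

    headers is any mapping of header names to values; header names
    are compared case-insensitively.
    """
    return not any(name.lower() == "age" for name in headers)
```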

By the way, I can't find any clear reference to this important
function of the Age header anywhere in the draft.  I'll post a
separate note about this.

Problem 3: the `now - date_value' correction will not always be used
----------

The section says:

 An response's age can be calculated in two entirely independent ways:

  1. now - date_value, if the local clock is reasonably well
     synchronized to the origin server's clock. If the result is
     negative, this is replaced by zero.

[...]

  2. age_value, if all of the caches along the response path implement
     HTTP/1.1.

 Given that we have two independent ways to compute the age of a response
 when it is received, we can combine these as

       corrected_received_age = max(now - date_value, age_value)
                                ^^^^^^^^^^^^^^^^^^^^

 and as long as we have either nearly synchronized clocks or all-HTTP/1.1
 paths, one gets a reliable (conservative) result.

 Note that this correction is applied at each HTTP/1.1 cache along the
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 path, so that if there is an HTTP/1.0 cache in the path, the correct
 ^^^^
 received age is computed as long as the receiving cache's clock is
 nearly in sync. We don't need end-to-end clock synchronization (although
 it is good to have), and there is no explicit clock synchronization
 step.

I dare to predict that most 1.1 caches will in many cases not use
the correction

       corrected_received_age = max(now - date_value, age_value)

but will use

       corrected_received_age = age_value

instead, because clock skew would often make the max( ) correction
needlessly decrease the remaining freshness lifetime, with no gain
from a general correctness standpoint.
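
A small worked example of that effect, in Python.  The 120-second
skew and the clock values are hypothetical numbers chosen only for
illustration:

```python
def corrected_received_age(now, date_value, age_value):
    """The max( ) correction, with a negative apparent age replaced by zero."""
    return max(max(0, now - date_value), age_value)


# Hypothetical firsthand response, generated and received at the same
# instant, while the cache's clock runs 120 seconds ahead of the origin
# server's clock.
date_value = 10_000         # Date header, origin server clock
now = date_value + 120      # time of receipt, cache clock
age_value = 0               # no Age header on a firsthand response

# Seconds of freshness lifetime given up compared with using age_value alone.
lost = corrected_received_age(now, date_value, age_value) - age_value
```

Here `lost' comes out as the full 120 seconds of skew, although the
response is in fact brand new.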

Note that it is always safe _not_ to use the max( ) correction when
getting a response directly from the origin server or when getting the
response directly from another 1.1 cache.

When getting the response from a 1.0 cache, a 1.1 cache _will_
probably do a correction, but this may not be 

       corrected_received_age = max(now - date_value, age_value)

because this does not even work correctly in all situations (for
example, if the origin server clock is one day ahead, now - date_value
is negative for any response less than a day old and is replaced by
zero).

There are techniques that work much more reliably and efficiently.
For example, the proxy could do a single HEAD request directly to the
origin server, and use the Date header in the response to calculate a
pessimistic value for the time difference between the origin server
clock and its own clock.  Using this difference, the cache could make
very good age corrections for subsequent responses from a 1.0 cache
for resources on that origin server.
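
A rough sketch of the arithmetic behind such a technique, in Python.
The function names, the one-second allowance for the resolution of
the Date header, and the choice to time the HEAD request from the
moment it is sent are all assumptions made for this example; the
network request itself is left out:

```python
def pessimistic_clock_offset(head_sent, head_date, date_resolution=1):
    """Upper bound on (origin server clock - local clock), in seconds.

    head_sent: local clock reading just before the HEAD request is sent.
    head_date: Date header of the HEAD response, in seconds.

    The origin server cannot have generated the Date before the request
    was sent, and Date is truncated to whole seconds, so the origin
    clock is ahead of the local clock by at most this much.
    """
    return head_date + date_resolution - head_sent


def corrected_age(now, date_value, age_value, offset):
    """Age estimate for a later response received through a 1.0 cache."""
    # Translate the local clock into (an upper bound on) origin server time
    # before comparing it with the Date header.
    apparent_age = max(0, now + offset - date_value)
    return max(apparent_age, age_value)
```

For instance, with an origin clock one day ahead, head_sent = 1000
and head_date = 87402 give an offset of 86403.  A response with
date_value = 88370 received at local time 2000 then gets an age of
33 seconds, where max(0, now - date_value) alone would give zero.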

In summary, I feel that 16.2.4 needs to be rewritten in such a way
that 1.1 caches are not obliged to make the max( ) correction which is
in there now.  If this is not done, the spec will require caches to do
something that almost no cache will do.  A spec which requires things
that almost nobody will do is not a good spec.

I would be prepared to do a rewrite of this section if Jim Gettys asks
me for one.


Problem 4: almost nobody will use NTP
----------

The section says:

 All HTTP
 implementations, but especially origin servers and caches, should use
 NTP [RFC1305] or some similar protocol to synchronize their clocks to a
 globally accurate time standard.

I dare to predict that almost no HTTP/1.1 origin server will use NTP.
A spec which requires things that almost nobody will do is not a good
spec.  Aside from that, globally synchronized clocks aren't even
needed that much to make caching work reliably: see above.

Problem 5: section is too verbose
----------

I estimate that a rewrite could cut the length of the section in half
without affecting its content.

I would be prepared to do a complete rewrite of this section if Jim
Gettys asks me for one.


Koen.

Received on Friday, 17 May 1996 15:23:42 UTC