short-time client side caching and clock skew from Jorrit Jongma on 2010-05-27 (ietf-http-wg@w3.org from April to June 2010)

From: Jorrit Jongma <Jorrit@Jongma.org>
Date: Thu, 27 May 2010 13:29:46 +0200
To: <ietf-http-wg@w3.org>
Message-ID: <013a01cafd8f$e9a11880$bce34980$@org>
After running into some caching issues I started reading sections 13.2.3 and
13.2.4 of RFC 2616, namely "Age Calculations" and "Expiration Calculations".

I noticed how clock-skew appears to be a major issue in these calculations.
It is not 1999 anymore, and many sites now have monster loads including
lots of tiny AJAX calls. In many instances, these could be cached for small
amounts of time - consider for example 1 minute or 5 minutes - but generally
not for say half an hour or more. If this could be properly achieved, this
could improve site response speed and reduce server loads.

However, as I interpret it, due to how current_age is calculated in 13.2.3,
short-period client-side caching is essentially impossible unless both
server and client are NTP-synced. If either the server clock is behind
(and/)or the client clock is in front of NTP time, current_age will
generally exceed freshness lifetime due to the discrepancy. In reverse, if
the server clock is in front (and/)or the client clock is behind NTP time,
the object will be cached way beyond its intended freshness_lifetime. The
problem does not show so much with longer freshness_lifetimes as the
discrepancy becomes less significant as freshness_lifetime goes up.
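
To make the arithmetic concrete, here is a minimal Python sketch of the
formulas in section 13.2.3. The function names, the 60-second lifetime, the
90-second skew and the timestamps are illustrative assumptions, not taken
from the RFC text or from any real deployment; all times are in seconds and
transit time is taken as zero.

```python
def current_age(date_value, age_value, request_time, response_time, now):
    """current_age following the formulas of RFC 2616 section 13.2.3."""
    apparent_age = max(0, response_time - date_value)
    corrected_received_age = max(apparent_age, age_value)
    response_delay = response_time - request_time
    corrected_initial_age = corrected_received_age + response_delay
    resident_time = now - response_time
    return corrected_initial_age + resident_time


def is_fresh(freshness_lifetime, age):
    """Freshness test from section 13.2.4."""
    return freshness_lifetime > age


FRESHNESS_LIFETIME = 60   # assumed, e.g. Cache-Control: max-age=60
SERVER_NOW = 1_000_000    # assumed correct server clock

# Case 1: client clock 90 s fast, response fetched straight from the origin.
client_now = SERVER_NOW + 90
age_fast = current_age(SERVER_NOW, 0, client_now, client_now, client_now)
fresh_fast = is_fresh(FRESHNESS_LIFETIME, age_fast)   # age 90, already stale

# Case 2: client clock 90 s slow; the response was generated 50 s earlier
# and arrives without an Age header (assumed scenario).
client_now = SERVER_NOW - 90
age_slow = current_age(SERVER_NOW - 50, 0, client_now, client_now, client_now)
fresh_slow = is_fresh(FRESHNESS_LIFETIME, age_slow)   # age 0, real age is 50
```

In the first case the entry is stale the moment it arrives. In the second,
apparent_age is clamped at 0, so whatever age the response already had is
lost and the entry is kept longer than intended.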

Even if a non-local proxy/cache is used that supplies an Age header that
would be correct (excluding transit time), the problem would still appear
for the case where the client clock is fast due to corrected_received_age =
max(now - date_value, age_value).

I'm not sure if it is in any way possible to fix or correct this problem (or
if that is even something HTTPbis cares about). I did have some ideas that
MIGHT be effective, though I am probably overlooking an important factor:

(1) The Age header, if present, SHOULD override (now - date_value), instead
of taking the max of both of them in corrected_received_age. While this
ignores transit time, I would say it is safe to assume that in most cases
transit time is an order of magnitude smaller than freshness_lifetime (even
if that is only 1 minute) and would thus generally be insignificant. As
servers SHOULD be NTP-synced I would also say that in general the
reliability of the Age header sent by proxies/caches is higher than the
reliability of the client clock.

(2) Instead of using origin date to calculate the current_age, client
receive date could be used. This would result in: current_age = (now -
original_local_receive_date + age_header), and removes the dependency on a
remote clock and thus eliminates the discrepancy. Again this ignores transit
time. (In the case of Expires headers, the freshness_lifetime would of course
still be calculated using the difference between the Date and Expires
headers)
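
As a rough sketch of how ideas (1) and (2) compare with the existing rule,
the Python below implements each variant. All function names and the sample
numbers are hypothetical, and treating a missing Age header as 0 in idea (2)
is an assumption of this sketch; times are in seconds and transit time is
ignored, as in the text.

```python
from typing import Optional


def received_age_rfc(date_value, response_time, age_value):
    """corrected_received_age as defined in RFC 2616 section 13.2.3."""
    return max(max(0, response_time - date_value), age_value)


def received_age_idea_1(date_value, response_time, age_value: Optional[int]):
    """Idea (1): an Age header, when present, overrides the clock-based value."""
    if age_value is not None:
        return age_value
    return max(0, response_time - date_value)


def current_age_idea_2(now, local_receive_time, age_value: Optional[int]):
    """Idea (2): use only the local clock plus the Age header.

    Assumption: a missing Age header counts as 0.
    """
    return (now - local_receive_time) + (age_value or 0)


def freshness_lifetime_from_expires(expires_value, date_value):
    """Expires-based lifetime still uses the two server-supplied headers."""
    return expires_value - date_value


# Hypothetical scenario: client clock 90 s fast, proxy reports Age: 5.
SERVER_NOW = 1_000_000
date_value = SERVER_NOW - 5        # origin generated the response 5 s ago
received_at = SERVER_NOW + 90      # local (fast) clock at receipt

rfc_received = received_age_rfc(date_value, received_at, 5)      # 95
p1_received = received_age_idea_1(date_value, received_at, 5)    # 5
p2_age = current_age_idea_2(received_at + 10, received_at, 5)    # 15
lifetime = freshness_lifetime_from_expires(date_value + 60, date_value)  # 60
```

With a 60-second lifetime, the existing rule would treat this response as
stale on arrival, while both ideas would keep it fresh for roughly the
intended period.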

Both would only apply to HTTP/1.1 responses, as HTTP/1.0 servers do not send
the Age header. I'm not sure if it is possible to have an HTTP/1.1 origin
server, an HTTP/1.0 proxy, then an HTTP/1.1 proxy, and then receive an
HTTP/1.1 response; if so, that could possibly pose a problem in this scenario.

I looked at ticket #29, but it does nothing to alleviate the problem.

Just thinking out loud here, and not quite sure why this calculation was
made in this way in the first place - any comments?

Regards,
Jorrit
Received on Thursday, 27 May 2010 11:30:26 UTC