input on HTTP for the year 2000 working group

At the April IETF meeting I volunteered to coordinate input for the
year 2000 working group on the various "information services and file
management" standards.

Thanks to Jeffrey Mogul <mogul@pa.dec.com> who started this
discussion off.  See
	http://www.ics.uci.edu/pub/ietf/http/hypermail/1997q2/0050.html
and the subsequent discussion.  Sorry it has taken me so long to respond.


Using local data I have confirmed his result that about 20% of the
server responses sent through a proxy server currently use RFC850 dates
with two-digit year fields.

Here is some preliminary text for the draft that the 2000 working
group is doing.  If nothing else I would like people to think a bit
more about some of the scenarios for what might go wrong based on
these ambiguous dates.  I suggest some possible minor additions to the
advice in RFC2068 to avoid the delay and bandwidth cost of
unnecessary requests.


Cheers,

Neal McBurnett <nealmcb@bell-labs.com>  503-331-5795  Portland/Denver
Bell Labs / Lucent Technologies
http://bcn.boulder.co.us/~neal/      (with PGP key)

----------
Information Services

HTTP

The main IETF standards-track document on the HTTP protocol is RFC2068
on HTTP 1.1.  It notes that historically three different date formats
have been used, and that one of them uses a two-digit year field.  In
section 3.3.1 it requires HTTP 1.1 implementations to generate this RFC1123
format:

     Sun, 06 Nov 1994 08:49:37 GMT  ; RFC 822, updated by RFC 1123

instead of this RFC850 format:
     Sunday, 06-Nov-94 08:49:37 GMT ; RFC 850, obsoleted by RFC 1036

Unfortunately, many existing servers, serving on the order of
one fifth of the current HTTP traffic, send dates in the ambiguous
RFC850 format.
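
To make the difference between the two formats concrete, here is a
minimal Python sketch that tells them apart and pulls out the year
field.  The names RFC1123_RE, RFC850_RE and classify_http_date are
made up for illustration; they are not from RFC2068 or any library,
and the patterns only cover the two examples quoted above (not the
asctime() format).

```python
import re

# Illustrative patterns for the two formats quoted above (assumed names).
RFC1123_RE = re.compile(
    r"^[A-Z][a-z]{2}, (\d{2}) ([A-Z][a-z]{2}) (\d{4}) "
    r"(\d{2}):(\d{2}):(\d{2}) GMT$")
RFC850_RE = re.compile(
    r"^[A-Z][a-z]+, (\d{2})-([A-Z][a-z]{2})-(\d{2}) "
    r"(\d{2}):(\d{2}):(\d{2}) GMT$")


def classify_http_date(value):
    """Return ("rfc1123", year) or ("rfc850", two_digit_year), else None."""
    value = value.strip()
    m = RFC1123_RE.match(value)
    if m:
        return ("rfc1123", int(m.group(3)))   # four-digit year, unambiguous
    m = RFC850_RE.match(value)
    if m:
        return ("rfc850", int(m.group(3)))    # two-digit year, century unknown
    return None
```

An "rfc850" result is the ambiguous case: the caller still has to
decide which century the two-digit year belongs to.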

Section 19.3 of RFC2068 says this:

  o  HTTP/1.1 clients and caches should assume that an RFC-850 date
     which appears to be more than 50 years in the future is in fact
     in the past (this helps solve the "year 2000" problem).

This avoids a "stale cache" problem, which would cause the
user to see out-of-date data.

But to avoid the unnecessary delay and bandwidth use described in
Scenario 2 below, this should be extended to say that a date which
appears to be more than 50 years in the past may be assumed to be in
the future, if a future date is legal for that field.

Scenario 3 indicates that servers may also want to follow these rules.
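
The existing 50-year rule and the extension suggested above can be
sketched as one small function.  This is only an illustration of the
idea, not text from RFC2068: the name resolve_two_digit_year and the
allow_future flag are hypothetical, and starting from the current
century as the first guess is my assumption.

```python
def resolve_two_digit_year(yy, current_year, allow_future=True):
    """Pick a century for a two-digit RFC850 year (illustrative sketch)."""
    # First guess: the same century as the current year (an assumption).
    year = (current_year // 100) * 100 + yy
    if year - current_year > 50:
        # RFC2068 section 19.3: more than 50 years ahead means the past.
        year -= 100
    elif allow_future and current_year - year > 50:
        # Suggested extension: more than 50 years back may mean the future,
        # but only for fields where a future date is legal (e.g. Expires).
        year += 100
    return year
```

For example, resolve_two_digit_year(99, 2001) gives 1999, and
resolve_two_digit_year(0, 1998) gives 2000.  With allow_future=False
(for a field such as Last-Modified) the second call gives 1900 instead.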


Here is some more background and justification for these arguments.

The following headers use full dates:

HTTP/1.0:
	Date:
	Expires:		# can be in the future
	If-Modified-Since:	# required to be in the past
	Last-Modified:		# required to be in the past
	Retry-After:		# can be in the future, also takes
				# relative time - number of seconds

HTTP/1.1:
	If-Range:
	If-Unmodified-Since:	# required to be in the past

Note that clock skew between hosts can lead to confusion here - see
the RFC for details.

Here are some scenarios of the implications of RFC850 dates, which
include stale caches, unnecessary requests for things which are
validly cached, delays for the user, extra bandwidth, and presenting
incorrect information to the user.

Some cases involve comparisons with the current time, and others may
involve comparisons between dates from different sources.  The
abbreviation "/99" is used to denote an RFC850 date with the value "99"
for the year.


RFC850 date from server

Scenario 1:
	If a client gets an Expires /99 date after the year 2000, it
	should interpret it as 1999, to avoid ending up with a stale
	cache entry.

	This is as already specified in RFC2068.

Scenario 2:
	If a client gets an Expires /00 date before the year 2000, and
	subsequently is faced with a choice to either retrieve the
	document from its cache or look for an updated copy, it may
	interpret it as the year 2000, to avoid the unnecessary delay
	and bandwidth of an extra request.


RFC850 date from client

Scenario 3:
	If a server gets an If-Modified-Since /99 date from a client
	after the year 2000, it should interpret it as 1999 when
	comparing with the local modification date, so that it can
	possibly send a 304 (Not Modified) response rather than a
	full GET response.

	Note that the date in an If-Modified-Since header must never
	be in the future.
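
A rough sketch of the server side of Scenario 3 follows.  It assumes
"less than or equal" comparison semantics and assumes the server
treats a future If-Modified-Since date as invalid and answers as for
a normal GET.  The function names are hypothetical, and the century
guess is the same 50-year rule quoted from RFC2068.

```python
from datetime import datetime, timezone


def ims_to_datetime(yy, month, day, hour, minute, second, now):
    """Resolve a two-digit-year If-Modified-Since date (illustrative)."""
    year = (now.year // 100) * 100 + yy
    if year - now.year > 50:
        year -= 100   # more than 50 years ahead: assume the past
    return datetime(year, month, day, hour, minute, second,
                    tzinfo=timezone.utc)


def conditional_get_status(ims, last_modified, now):
    """Return 304 if the document is unchanged since ims, else 200."""
    if ims > now:
        # Assumption: a future date is invalid, so send the full response.
        return 200
    return 304 if last_modified <= ims else 200
```

With now in March 2000, a /99 date of 01-Dec-99 resolves to 1999, so
a document last modified in June 1999 gets a 304.  Read naively as
2099, the same header would be in the future and would cost a full
response.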


(Note that there has been a lot of discussion about whether
If-Modified-Since should really be compared with "less than" semantics
or whether the date should be considered a 'token' which is required
to exactly match the Last-Modified date in order to avoid a full GET
response.  There have been bugs and problems with implementations that
deal badly with daylight saving time, etc.  I'm not sure how this
debate dealt with the question of how to convert the dates, or whether
servers can actually use "less-than" semantics.  Input is
requested....)



Dates in HTML

In RFC1866, on HTML 2.0, the <META> tag allows the embedding of
recommended values for some HTTP headers, including Expires.  E.g.

    <META HTTP-EQUIV="Expires"
          CONTENT="Tue, 04 Dec 1993 21:29:02 GMT">

Servers should rewrite these dates into RFC1123 format if necessary.
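
As an illustration of the rewriting step, Python's standard
email.utils.format_datetime can emit the RFC1123 form once the date
has been resolved to a full year.  The wrapper name to_rfc1123 is
made up for this sketch, and it assumes the caller passes a
timezone-aware datetime.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def to_rfc1123(dt):
    """Format a timezone-aware datetime as an RFC1123 date in GMT."""
    # usegmt=True requires a UTC datetime and emits "GMT" instead of "+0000".
    return format_datetime(dt.astimezone(timezone.utc), usegmt=True)
```

For example, to_rfc1123(datetime(1994, 11, 6, 8, 49, 37,
tzinfo=timezone.utc)) returns "Sun, 06 Nov 1994 08:49:37 GMT".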

Received on Tuesday, 15 July 1997 15:32:56 UTC