Re: Still trying to make sense of HTTP caching model

Paul says:
    John says:
    ] A final word on this last point.  The HTTP protocol can presumably
    ] legally define any header to have any semantics.  But we need to keep
    ] in mind that most Expires headers (and most Cache-Control headers
    ] in the future) are set by server maintainers.  The fraction of server
    ] maintainers who ever have or ever will read the protocol is
    ] negligible.
    
    This is a good point -- or at least one I didn't understand at all. I 
    didn't think of Expires as a persistent document property.  I.e.,  I 
    wasn't thinking that the admin would set an Expires time, but that the 
    server would compute one based on how long an admin said a particular 
    resource should be cached. (I.e., the person putting a page on the 
    server would specify "max-age", but the server would compute and send 
    "Expires".)

I think Paul almost got that right.  Let's pretend that the HTTP
spec doesn't include the string "Expires", but instead the string
"Xyzzy", which happens to have the same meaning that we have been
giving to Expires.

Then it's fairly easy to see how to resolve this.  Servers provide
a way for admins to specify when a document expires.  This may be
expressed in several different ways:
	(1) "The document expires on Dec 11 1995"
or
	(2) "The document expires 10 days after it was created"
OR
	(3) "Don't let anyone cache this after Dec 11 1995"
or
	(4) "Don't let anyone cache this for more than 10 days"

I think a good server implementation ought to allow all four
ways to describe expiration; although #1 and #2 are almost
equivalent, the rest are quite different.  I couldn't think
of any others, but there probably are some.

Now, assume we have document X that was created on Dec 1 1995,
and on Dec 2, a client requests the document.  Here is what
the server returns for each of the four cases above:
	(1) Xyzzy: Dec 11 1995
	(2) Xyzzy: Dec 11 1995
	(3) Xyzzy: Dec 11 1995
	(4) Xyzzy: Dec 12 1995

And if document X is requested on December 13:
	(1) Not found
	(2) Not found
	(3) Xyzzy: Dec 11 1995
	(4) Xyzzy: Dec 23 1995

Therefore, the Xyzzy mechanism in the HTTP specification is
completely sufficient to implement all four styles of expiration.
Note that it would NOT work for the admin to simply attach a fixed
Xyzzy value to the document: a fixed value cannot express case 4,
where the value depends on when the request arrives, and it would
not distinguish between cases 1 and 3.  The server must compute the
Xyzzy value to make this all work.
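
To make the server-side computation concrete, here is a rough sketch of
the four cases in Python.  The function name compute_xyzzy, the numeric
policy codes, and the parameter names are all made up for illustration;
nothing here comes from the spec or from any real server.  I have also
assumed that the document is still served on the expiration day itself
in cases 1 and 2, which the examples above do not settle.

```python
from datetime import date, timedelta
from typing import Optional


def compute_xyzzy(policy: int, created: date, today: date,
                  fixed: Optional[date] = None,
                  days: Optional[int] = None) -> Optional[date]:
    """Return the Xyzzy date to send, or None for "Not found".

    Hypothetical policy codes, matching the numbered cases above:
      1: the document expires on a fixed date
      2: the document expires `days` days after it was created
      3: nobody may cache it after a fixed date
      4: nobody may cache it for more than `days` days
    """
    if policy == 1:
        # Assumption: the document is still served on its last day.
        return fixed if today <= fixed else None
    if policy == 2:
        expiry = created + timedelta(days=days)
        return expiry if today <= expiry else None
    if policy == 3:
        # The document lives on; only the caching deadline is fixed.
        return fixed
    if policy == 4:
        # The deadline slides forward with every request.
        return today + timedelta(days=days)
    raise ValueError(f"unknown policy: {policy}")
```

Calling this with created = Dec 1 1995, a fixed date of Dec 11 1995,
and days = 10 should reproduce both tables above, once for a request
on Dec 2 and once for a request on Dec 13.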

So now the remaining question is "should we add an Xyzzy: header to the
HTTP spec?"  No way!  "Expires:" not only solves the problem already,
it also is compatible with existing practice, more or less.  To get
back to John's statement that "most Expires headers are set by server
maintainers" ...  I don't know if this is true, but it should not be
true.  Human server maintainers should set a value using one of the
four approaches I listed, and, as Paul said, server software should
compute the Expires: header sent with each response.

-Jeff

P.S.: By the way, a Max-age: header has at least one potential
problem: since it takes non-zero time to retrieve an object,
short Max-age: values might not be accurate.  For example,
if I assign a Max-age: value of 6 hours to a 10 Mbyte object, which is
then retrieved by a client at the wrong end of a 2400 baud modem,
a naive implementation is going to do quite the wrong thing.  If
we do use Max-age, the spec needs to require that the received
value be decreased by the total retrieval time.  (If proxies are
involved, each transfer must decrease the Max-age value.)
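
A rough sketch of that adjustment, in Python.  The helper names are
hypothetical, and I am assuming that "10 Mbyte" means 10,000,000 bytes
and that the modem moves 2400 bits per second with no framing overhead.
Under those assumptions the transfer alone takes a little over nine
hours, so a 6-hour Max-age is already used up on arrival.

```python
def transfer_seconds(size_bytes: int, bits_per_second: int) -> float:
    """Idealized transfer time, ignoring latency and framing overhead."""
    return size_bytes * 8 / bits_per_second


def remaining_max_age(max_age_s: float, transfer_times_s: list) -> float:
    """Subtract the time spent on every hop (origin to proxy, proxy to
    client, and so on) from the Max-age value, never going below zero."""
    return max(0, max_age_s - sum(transfer_times_s))


# The example above: 6 hours of Max-age, 10 Mbyte over a 2400 bit/s link.
six_hours = 6 * 3600
modem_hop = transfer_seconds(10_000_000, 2400)
left = remaining_max_age(six_hours, [modem_hop])
```

A naive client would instead treat the object as fresh for a further
six hours after the download finishes.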

Received on Thursday, 7 September 1995 11:57:28 UTC