- From: Andrew Daviel <andrew@vancouver-webpages.com>
- Date: Fri, 16 Aug 1996 10:58:56 -0700 (PDT)
- To: HTTP Working Group <http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com>, ircache list <ircache@nlanr.net>
(Not so much an HTTP 1.1 question as a general HTTP one.)

When should one make objects uncacheable?

Some cases are clear-cut: an object that doesn't change should be cacheable, and if one knows the expiry date one can give it using the Expires header. An object that changes rapidly, such as a snapshot of CPU usage, should be made uncacheable by giving a current (or illegal) Expires value.

I would think that objects which change slowly over time should be given Expires values commensurate with the rate of change; for example, a Webcam watching clouds go by might be given a lifetime of 10 minutes. This would allow proxy caches to usefully save a reasonably up-to-date image for popular views.

If a document is produced by database lookup instead of from a filesystem, it seems to me that one should generate Last-Modified times and make the object cacheable when there is a database hit. A fish database with 1,000 entries might produce cacheable output when queried with "/cgi-bin/query?salmon" or "/cgi-bin/query?trout", but not with "/cgi-bin/query?anteater". This would apply to any kind of system creating HTML on-the-fly from (invariant) source.

Supposing the above premises are reasonable, my question is whether the output of more general search engines should be made cacheable. It seems to me that certain queries are fairly popular, and that some benefit might be had from caching the responses. On the other hand, the number of possible URLs expands exponentially with the length of the query string, so caching every response would be unreasonable, filling up caches with never-to-be-repeated requests.

I envisage an algorithm to generate a document lifetime based on the number of hits and the number of search terms. If my database is updated once a day, and I have 5 hits for a single search term, it seems reasonable to assign an Expires date of tomorrow. Alternatively, I can generate a (correct) Last-Modified header for that search term and allow the cache server to compute an expiry date using its own algorithm. If I have 400 hits from 7 search terms, or no hits, I would give an expiry date of 0.

Another possibility is to create a database of actual queries and use that to generate expiry times, which would allow caching responses to often-asked requests for non-existent data.

Comments?

Currently, Apache 1.1.1 will cache anything with a Last-Modified header, while Squid 1.0.x will not (as shipped) cache anything with a query term.

Andrew Daviel
andrew@vancouver-webpages.com
http://vancouver-webpages.com : home of searchBC
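
The hit-count heuristic described in the message can be sketched roughly as follows. This is only an illustration under stated assumptions, not the author's implementation: the function name `cache_headers` and the thresholds `MAX_HITS` and `MAX_TERMS` are hypothetical (the message gives only two data points, 5 hits from 1 term and 400 hits from 7 terms), and "tomorrow" is taken to mean the current time plus the database update interval.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

# Hypothetical thresholds, chosen only so that the two cases in the
# message fall on opposite sides: 5 hits / 1 term is cacheable,
# 400 hits / 7 terms is not.
MAX_HITS = 50
MAX_TERMS = 2


def cache_headers(hit_count, term_count, now, last_modified,
                  update_interval=timedelta(days=1)):
    """Return caching-related response headers for one search result page.

    `now` and `last_modified` must be timezone-aware UTC datetimes.
    """
    cacheable = 0 < hit_count <= MAX_HITS and 1 <= term_count <= MAX_TERMS
    if not cacheable:
        # "Expiry date of 0": an invalid date, which caches should
        # treat as already expired.
        return {"Expires": "0"}
    return {
        # Assumption: the result stays valid for one update interval.
        "Expires": format_datetime(now + update_interval, usegmt=True),
        # Lets a cache apply its own expiry algorithm or revalidate.
        "Last-Modified": format_datetime(last_modified, usegmt=True),
    }
```

With these assumed thresholds, `cache_headers(5, 1, now, last_modified)` yields an Expires value one day after `now`, while `cache_headers(400, 7, ...)` and `cache_headers(0, 1, ...)` both yield `Expires: 0`. Dropping the Expires entry and sending only Last-Modified would correspond to the alternative of letting the cache server compute the lifetime itself.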
Received on Friday, 16 August 1996 11:01:13 UTC