- From: Daniel W. Connolly <connolly@beach.w3.org>
- Date: Thu, 18 May 1995 17:04:49 -0400
- To: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
In message <v02110113abe15c3e8ab6@[129.106.201.2]>, Chuck Shotton writes:

>Caching proxies need to step aside when a byte range request is sensed,
>and this needs to be treated as a dynamic data request between the client
>and server. The range data may be transient, dynamic, and may or may not
>have any relationship to a document that can be reconstructed in a file
>system. We need to get over the file system paradigm in the HTTP world.
>This is a crippled model to be working from at best.

Why can't byte ranges be cached? What's the difference between caching requests with ;byterange= in the URI and those without?

Why not cache CGI-bin responses? Granted, it's a pain to save their output (though if you do it in parallel with returning the data to the first requestor, everybody wins), and it's a pain to figure out the age of all the data sources that go into a CGI computation. But in theory, there's no reason they can't be cached.

W.r.t. "transient, dynamic, ..." -- web objects are assumed to be transient and dynamic, no? Where does it say that because you can dereference a URL at 1pm, you should expect to get the same thing (or anything at all!) at 2pm? The Expires header says just this, but if you don't get one, you have to assume the worst, no?

A proxy can return a cached response iff it is the same thing that the original server would return, given this query. This means that the cache entry is up to date:

	(1) the current time is between the Date: and Expires: of the
	    cached response, or
	(2) the proxy can get a "304 Not Modified" response from the
	    original server at the time of this request,
and the cache entry is the best representation of the resource/object:

	(1) the original server didn't indicate any variant
	    representations (in URI: headers), or
	(2) the Accept: headers of this request are exactly the same as
	    the Accept: headers that produced the cached response, or
	(3) the proxy can somehow perform the format negotiation
	    calculation locally, without the help of the original server.
	    (I think there are circumstances where this can be done
	    reliably, but I'd have to think them through. I think they're
	    sufficiently complex that nobody would ever implement them,
	    though.)

I believe some proxies (and some clients) cache more aggressively than this. For example, I heard that the hensa cache doesn't bother with the If-Modified-Since request unless the cache entry is older than 12 hours or 10% of the lifetime of the document (current time - last-modified). So anything you get through that proxy may be up to 12 hours out of date, and there's no way for you to detect it or prevent it (or does it implement Proxy: no-cache?).

For some applications, it is probably appropriate to do such heuristic caching. But users should be made aware of it. In fact, I wouldn't mind seeing some Preferences-style option:

	Documents as old as:
		_ 5 minutes
		_ 1 hour
		_ 12 hours
	out of date are acceptable.

(Note: an older acceptable age should give lower average latency.) The browser would use that option to compute If-Modified-Since: headers for items that it has cached. In fact, another header would be useful for items it doesn't have cached:

	Acceptable-Age: 600

This header tells a proxy that any cached entry that is less than 600 seconds out of date is acceptable.

	Proxy: no-cache

is equivalent to:

	Acceptable-Age: 0

The hensa proxy works as if every request had:

	Acceptable-Age: 43200

This would become a parameter on API calls like:

	HTLoadAnchor(char *uri, int acceptable_age);

Hmmm... perhaps a relative number of seconds isn't such a good idea...
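As a sketch only (the function name and argument shapes are mine, and Acceptable-Age is a proposal here, not a deployed header), the decision a proxy would make looks something like this:

```python
def can_serve_from_cache(entry_date, now, acceptable_age, expires=None):
    """Hypothetical sketch of the Acceptable-Age check proposed above.

    entry_date, now, expires: Unix timestamps (seconds); acceptable_age:
    seconds of staleness the client will tolerate.  A proxy may answer
    from cache iff the entry is fresh per Date:/Expires:, or is no more
    than acceptable_age seconds out of date.
    """
    if expires is not None and now <= expires:
        return True             # still fresh per the server's Expires:
    age = now - entry_date      # seconds since the entry was fetched
    return age <= acceptable_age
```

Under this sketch, Acceptable-Age: 0 forces a trip to the origin server unless the entry is still within its Expires: window (i.e. it behaves like no-cache), and a hensa-style proxy acts as if every request carried acceptable_age=43200.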
With cascading proxies, the time of the request could get skewed. The header should give the absolute time of the most out-of-date copy that's acceptable:

	Any-Copy-Since: 19950518153710Z

I used this technique in a distributed directory for a fax/email software package: I cached directory entries in a data structure, and allowed, say, 10-minute-old data. It speeds up performance a LOT: the application might query the data structure 10,000 times over the course of an hour for various sort/query operations, but it only makes 6 over-the-wire queries during that time.

Daniel W. Connolly "We believe in the interconnectedness of all things"
Research Technical Staff, MIT/W3C <connolly@w3.org>
http://www.w3.org/hypertext/WWW/People/Connolly
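[The directory-cache trick in the message above can be sketched as follows; this is a hypothetical reconstruction, and none of the names come from the original fax/email package. A lookup is answered from the cache whenever the cached copy is under 10 minutes old, so 10,000 queries spread over an hour cost only 3600/600 = 6 trips over the wire.]

```python
import time

MAX_AGE = 600  # tolerate directory entries up to 10 minutes old

class DirectoryCache:
    """Cache remote directory entries, allowing MAX_AGE seconds of staleness."""

    def __init__(self, fetch, clock=time.time):
        self.fetch = fetch        # the over-the-wire lookup function
        self.clock = clock        # injectable clock, handy for testing
        self.entries = {}         # key -> (fetched_at, value)
        self.wire_queries = 0     # how many lookups actually hit the wire

    def lookup(self, key):
        now = self.clock()
        hit = self.entries.get(key)
        if hit is not None and now - hit[0] <= MAX_AGE:
            return hit[1]         # cached copy is recent enough
        value = self.fetch(key)   # go over the wire
        self.wire_queries += 1
        self.entries[key] = (now, value)
        return value
```

For example, driving 10,000 lookups of one key across an hour with a fake clock leaves wire_queries at 6, matching the figure in the anecdote.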
Received on Thursday, 18 May 1995 14:09:24 UTC