- From: James Gwertzman <gwertzma@eecs.harvard.edu>
- Date: Mon, 29 May 1995 17:38:55 -0400
- To: sjk@amazon.com
- Cc: www-talk@www10.w3.org
Hi. Let me respond to your points one by one.

>>>>> "Shel" == Shel Kaphan <sjk@amazon.com> writes:

Shel> Hi,

Shel> The "expires" feature should cover the issue of when pages
Shel> should be flushed, but the world is apparently not ready for
Shel> it, because:

Shel> - If you set documents to expire immediately, some major
Shel> browsers display "Data Missing" or equivalently scary
Shel> messages when you use browser commands to "back up" to that
Shel> page. Since many users are not going to understand what is
Shel> going on and will be confused by such messages, and may not
Shel> know to "reload" the page at that point, it would be better
Shel> for them never to see messages like that. (I've already had
Shel> problems with some naive beta testers tripping over that.
Shel> They tend to think something must have broken. You can't
Shel> argue that we need more sophisticated users, because we
Shel> don't have a choice!)

Shel> - Some browsers (such as Prodigy's) appear to ignore the
Shel> "expires" header and cache pages anyway. (and that's just
Shel> their *browser*...)

In my mind the expires field should ONLY be used for documents with a
fixed lifetime: Cool-site-of-the-day, for example, or dynamic pages
which expire immediately. I agree that browsers should do a better job
with pages that expire immediately, namely showing them but not
caching them.

I believe that for all other items (with undetermined lifetimes) the
browsers should use the technique that I describe in the chapter of my
thesis labeled "Cache consistency", which is based on the Alex FTP
cache. The idea is that the older a page is, the less likely it is to
change. When the browser suspects that the page might have changed, it
sends the "get-if-changed-since" message to the server to find out
whether its cached replica needs to be updated. If the answer is "yes"
then it updates the page before showing it to the user. Otherwise it
simply uses the page currently cached.
The browser decides when to check by using the ratio of the time since
the file was last checked to the age of the file (the time since the
file was created). Whenever this ratio exceeds some threshold, e.g.
10%, the file is checked. In other words, if the file is a month old
and was last checked an hour ago, don't bother checking again before
using the cached copy. If the file was created a month ago and last
checked a week ago, then contact the server before showing the user
the cached file. I describe simulations in my thesis that show this to
be a promising approach.

Shel> So, I have a question and I have suggestions.

Shel> First, the question:

Shel> Is there any good workaround for the current problem, that
Shel> would have the properties of: - forcing browsers to reload
Shel> expired pages when someone explicitly requests one, and -
Shel> either: - allowing pages on the browser's history stack (for
Shel> instance) to remain in the local cache even if they are
Shel> expired, or, - *somehow* causing the browsers to gracefully
Shel> and silently reload expired pages when re-visited through
Shel> history mechanisms.

Shel> No? I suspected as much...

You're right, my stuff does not address the "here and now" very well.
I'm describing a solution to caching on local-area networks, not
necessarily clients and their history stacks.

Shel> The suggestions:

Shel> To make the web work more smoothly, it would be nice if
Shel> browsers would handle this situation more gracefully, by,
Shel> for instance, not displaying errors like "Data Missing", but
Shel> just automatically reloading the page.

Shel> However, I also think it is worth considering for browser
Shel> writers that history stacks (that can be re-viewed with
Shel> browser navigation controls) are in a class of their own
Shel> when it comes to caching.
Shel> However, while it might make
Shel> sense to back up and see an expired document, since history
Shel> mechanisms are for "history", it does not make sense to go
Shel> through a link and see a cached copy of an expired document.
Shel> It is REALLY BAD for browsers to display cached copies of
Shel> expired documents when they are meant to be freshly
Shel> displayed in response to a direct user command, because a
Shel> URL may be a request to a program that is displaying dynamic
Shel> information related to the user's extended "session" with
Shel> the server. (This is the core of the issue).

Shel> I realize these considerations may have no role in the HTTP
Shel> spec, however I feel there are serious problems in this
Shel> area, which can only be resolved by coordinating the
Shel> behavior of browsers and servers.

Shel> Another thing that might help: perhaps there should be a way
Shel> for servers to "force" the URL (the *name*) handled by
Shel> clients to something other than the requested URL. This
Shel> would allow, for example, the requestor's URL to be used to
Shel> encode information relating to a query, but would then
Shel> result in a single cache entry in the client.

Shel> To explain this a little more, if there were two GET
Shel> requests, one for /cgi-bin/food/hamburgers and one for
Shel> /cgi-bin/food/french-fries, which would result in a single
Shel> page that ought to be cached as one page, then the server
Shel> ought to be able to say, "you asked for /food/french-fries,
Shel> but the page is called /food/generic-junk-food", and to have
Shel> the browser use that info to uniquely identify a cache entry
Shel> and update it with the newly fetched data. This might not
Shel> help to avoid fetching documents extra times, but it would
Shel> help on cache coherence if the intent was to display a
Shel> dynamically generated document.

I agree here.
There is already a redirection mechanism in place, but I don't think
the results of the redirection are cached across sessions. I would
love it if the user could ask for page a on machine b, be told that
page a now lives on machine c, and have the client remember that fact
until told otherwise. After all, a redirection like this only takes 30
or 40 bytes, and the typical client could store thousands of them very
neatly.

Shel> Anyway, just some thoughts. If you have any ideas, pointers
Shel> or references for me, I would really appreciate it.

Shel> --Shel Kaphan sjk@amazon.com
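
For concreteness, here is a rough Python sketch of the check rule
described earlier in this message: recheck a cached file once the time
since its last check exceeds some fraction of the file's age. It is an
illustration only, not code from the thesis. The function name, its
parameters, and the handling of a zero or negative age are assumptions;
the 10% default is the example threshold given in the text.

```python
from datetime import datetime, timedelta


def should_revalidate(created, last_checked, now, threshold=0.10):
    """Return True if the cached copy should be rechecked with the server.

    created      -- when the file was created (hypothetical parameter name)
    last_checked -- when the cache last confirmed the copy with the server
    now          -- the current time
    threshold    -- ratio above which a recheck is triggered (10% here)
    """
    age = (now - created).total_seconds()
    since_check = (now - last_checked).total_seconds()
    if age <= 0:
        # Assumption: with no usable age, be conservative and recheck.
        return True
    return since_check / age > threshold


now = datetime(1995, 5, 29, 17, 0, 0)
month_ago = now - timedelta(days=30)

# A month old, last checked an hour ago: ratio is about 0.0014.
print(should_revalidate(month_ago, now - timedelta(hours=1), now))  # False

# A month old, last checked a week ago: ratio is about 0.23.
print(should_revalidate(month_ago, now - timedelta(days=7), now))   # True
```

The two calls mirror the month-old examples in the message: the first
copy is used straight from the cache, while the second triggers a
"get-if-changed-since" request before the page is shown.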
Received on Monday, 29 May 1995 17:57:20 UTC