- From: Larry Masinter <masinter@parc.xerox.com>
- Date: Tue, 3 Sep 1996 22:04:50 PDT
- To: peterson@austin.ibm.com
- CC: fielding@avron.ics.uci.edu, frystyk@w3.org, hallam@w3.org, jg@w3.org, mogul@pa.dec.com, timbl@hq.lcs.mit.edu, www-talk@w3.org
(http-wg@cuckoo.hpl.hp.com always works for me, and I've gotten mail on it.)

It would be very useful to gather sufficient data to tell whether callbacks for update would be worthwhile in HTTP, and also to see some experiments with an actual implementation. In lieu of any data, though, it sounds like there's some skepticism about taking up such a proposal.

It does seem, what with all of the proposals for afs: URLs around, that there are situations where systems of that style are useful. I can imagine several schemes, but, as people have pointed out, there are lots of details to work out.

A minor thought: could there be some kind of 'accept-redirect' header which means "I'm willing to accept a redirect to an 'afs:' URL"? I mean, why reinvent AFS when you can just use it?

Larry

================================================================
Date: Tue, 20 Aug 1996 17:35:43 -0500
From: peterson@austin.ibm.com (James L. Peterson)
Message-Id: <9608202235.AA24661@lyle.austin.ibm.com>
To: http-wg@cuckoo.hpl.hp.com
Subject: Caching and callbacks

We are trying to implement caching of Web pages in regional proxies. The current scheme, as I understand it, requires that every reference to a page be checked to see if it is out of date. Since a page may change at any time (but probably won't), we assume that a reference by a client to a page in the proxy will require the proxy to check with the server to see if the page is out of date (either with a HEAD request or a GET If-Modified-Since request).

It appears to me that this is very similar to the NFS file system design. Our initial work on the Andrew File System found that, for both NFS and the initial AFS, the vast majority of messages were checks to see whether something was out of date when it wasn't (for a file system this was a stat() request). My memory suggests that 85% to 95% of the messages were stat requests.
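The cost Jim describes can be made concrete with a minimal sketch (names like `CacheEntry` and `revalidate` are illustrative only, not from any real proxy): every cache hit still pays a round trip to the origin server, even though the usual answer is "nothing changed."

```python
# Sketch of the revalidation pattern described above: each cache hit
# still costs a round trip, and in the common case the server merely
# confirms the cached copy is current (HTTP 304 Not Modified).
from dataclasses import dataclass

@dataclass
class CacheEntry:
    url: str
    body: bytes
    last_modified: str  # value of the Last-Modified header when cached

def revalidate(entry: CacheEntry, server_last_modified: str) -> int:
    """Simulate a conditional GET: the server compares the date sent in
    If-Modified-Since with the resource's current Last-Modified date and
    answers 304 when they match, 200 with a fresh body otherwise."""
    if server_last_modified == entry.last_modified:
        return 304  # round trip happened, but no body was transferred
    return 200      # resource changed; the full body must be re-fetched

entry = CacheEntry("/index.html", b"<html>...</html>",
                   "Tue, 20 Aug 1996 17:35:43 GMT")
# The common, "wasted" case -- the page has not changed:
print(revalidate(entry, "Tue, 20 Aug 1996 17:35:43 GMT"))  # 304
# The rare case that actually needs the round trip:
print(revalidate(entry, "Wed, 21 Aug 1996 09:00:00 GMT"))  # 200
```

If 85-95% of checks return 304, almost all of that traffic and latency buys no new information, which is exactly the motivation for callbacks below.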
There are two disadvantages here: (1) the network traffic, and (2) the latency until the client gets the correct information. If the file does need to be updated, the round-trip time from the proxy to the server is unavoidable, but in the vast majority of cases the file has not changed and the round-trip time is wasted.

The solution for the Andrew File System was to redesign it to be based on callbacks. A user of a file caches it locally and registers with the server for a callback. If, or when, the file is changed at the server, the server sends messages to all registered clients indicating that their local copy is out of date. The local client can then either retrieve an up-to-date copy or simply delete the old copy from its cache.

We would like to propose that a similar callback scheme be allowed for web pages using HTTP. A new request, or a modification of the GET request, would be provided which asks the server for an object and also asks to be notified if the object changes. The server may respond with the object and a notification of "no callback support" (the current situation) or may accept the callback request. The server maintains a list of callbacks for each page. If it finds that the page has changed, it notifies each callback with a message containing the URL of the page that changed. The client can then either update its locally cached copy or simply throw it away.

Our objective would be to use this for caching in proxies. The client would request a page from the proxy. If the proxy has the page, it would return it. If not, it would request the page from the server and request a callback if the page changes. Further requests for the page would return the proxy's cached copy. If the page changes, the server would notify the proxy, which would invalidate its local cached copy.

There are a number of issues. The server would accept a callback request for one change.
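The register-then-notify flow of the proposal might be sketched as follows; this is only an illustration of the idea, with one-shot callbacks as just stated, and all class and method names (`OriginServer`, `get_with_callback`, `page_changed`) are hypothetical.

```python
# Sketch of the proposed callback scheme: the server keeps a set of
# registered proxies per URL and, when a page changes, sends each one
# a notification containing the URL. Callbacks here are one-shot: the
# list is cleared as part of delivering the notifications.
from collections import defaultdict

class OriginServer:
    def __init__(self):
        self.callbacks = defaultdict(set)  # url -> set of proxy ids

    def get_with_callback(self, url, proxy_id):
        """Serve the page and register the proxy for change notification."""
        self.callbacks[url].add(proxy_id)
        return f"<contents of {url}>"

    def page_changed(self, url, notify):
        """On modification, fire one notification per registered proxy
        and drop the whole list (each callback fires at most once)."""
        for proxy_id in self.callbacks.pop(url, set()):
            notify(proxy_id, url)

server = OriginServer()
server.get_with_callback("/index.html", "proxy-a")
server.get_with_callback("/index.html", "proxy-b")

invalidated = []
server.page_changed("/index.html",
                    lambda proxy, url: invalidated.append((proxy, url)))
print(sorted(invalidated))
# → [('proxy-a', '/index.html'), ('proxy-b', '/index.html')]
```

In the steady state no messages flow at all for unchanged pages, which is the whole point: the traffic is proportional to changes rather than to references.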
When a change occurs, it would step through its list and notify each requester, removing them from the callback list. If a requester wants to be notified of future changes, it would need to request a callback again (presumably when it fetched the updated page). It is expected that a number of the requesters may no longer have the page in their cache; they would simply ignore the notification of change and not re-register.

If the list of callbacks at the server becomes too long, it has several options: (1) refuse new requests for callbacks, (2) remove the oldest request from the list (LRU) and send it a message indicating that its callback has been canceled, or (3) grow the list to accept the new callback.

The one major flaw that I see is that the server may fail and the callback list may then be lost. In this case the requesters may be expecting to be notified when, in fact, they have been dropped from the callback list. (Silently dropping entries is another option for handling a callback list that grows too long, but we did not propose it since it seems to defeat the point of the callback list.) Accordingly, it would seem that the requester will have to check, at intervals, that all three of the following are true: (a) the server and the page still exist, (b) the page has not been modified since it was cached, and (c) it is still on the callback list.

jim

================================================================
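Option (2) from Jim's list, LRU eviction with an explicit cancellation notice, can be sketched like this; the `CallbackList` class and its methods are illustrative names, not part of any proposal text.

```python
# Sketch of LRU management of a bounded callback list: when the list is
# full, the least recently registered entry is evicted and explicitly
# told its callback has been canceled (so it knows to fall back to
# polling), rather than being dropped silently.
from collections import OrderedDict

class CallbackList:
    def __init__(self, capacity, on_cancel):
        self.capacity = capacity
        self.entries = OrderedDict()  # proxy_id -> url, oldest first
        self.on_cancel = on_cancel    # called as on_cancel(proxy_id, url)

    def register(self, proxy_id, url):
        if proxy_id in self.entries:
            self.entries.move_to_end(proxy_id)       # refresh recency
        elif len(self.entries) >= self.capacity:
            old_id, old_url = self.entries.popitem(last=False)
            self.on_cancel(old_id, old_url)          # explicit cancel notice
        self.entries[proxy_id] = url

canceled = []
cb = CallbackList(2, lambda pid, url: canceled.append(pid))
cb.register("proxy-a", "/x")
cb.register("proxy-b", "/x")
cb.register("proxy-c", "/x")  # list is full: proxy-a is evicted and notified
print(canceled)  # → ['proxy-a']
```

The explicit cancel message is what separates this from the "just throw entries away" variant Jim rejects: an evicted requester learns immediately that it must resume periodic revalidation.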
Received on Thursday, 5 September 1996 12:49:18 UTC