- From: Luigi Rizzo <luigi@labinfo.iet.unipi.it>
- Date: Fri, 5 Jan 1996 14:52:49 +0100 (MET)
- To: http-caching@pa.dec.com
Maybe this does not belong exactly here, but it has some relation to the problem of caching. I have (partly) followed the discussion on caching the results of GET/POST requests, and have some opinions on the subject.

To start with an example, I would like to point out a typical usage of /cgi-bin at our site, which I believe is not uncommon. Very often, we use /cgi-bin scripts to do the following:

1) browse through a (possibly small) database, and return the results to the user. Examples include returning the schedule of classes or exams in our faculty: the database is of the order of 500 records, which can be highly compressed. The browsing code is just a few lines of awk or perl. The typical request returns from one to a few pages of data which, when nicely formatted with tables and anchors, has a size comparable to that of the database itself.

2) present a nicely-formatted version of a file. Examples: the occupation status of terminals or classrooms. In these cases, the databases are a few hundred bytes. Again, the formatting is done in awk or perl, and the code is very compact. The formatted output, though, contains an anchor and a table entry for every item (which is encoded in a single bit) of the original database. As a result, the output may easily be 100 to 1000 times larger than the original; the first sketch at the end of this message shows how this happens.

Obviously the problem is that the code runs on the server, instead of as close as possible to the client. There are several drawbacks to this approach:

* many more bytes than necessary are transferred;
* the server is unnecessarily kept busy generating and transferring all the above traffic;
* the data is essentially uncacheable because of the variety of possible requests.

This is certainly a problem for servers which supply services of the kind I mentioned. Things are going to get worse and worse as WWW services become more widespread and are used in wireless, low-bandwidth environments [example: queries for all the trains or planes to a given destination from a station or airport].

It is also a problem for caches, which must either give up or develop complex, memory-consuming techniques that essentially try to reconstruct the behaviour of the server from its responses. This holds both for GET (where requests can be assumed to have no side effects, but different parameters may yield different results) and for POST. And there is also a terrible [:)] consequence: your cache statistics are negatively affected by these large, uncacheable items.

More seriously, these "uncacheable" items might, in many cases, become easily cacheable. I believe this problem should be dealt with *before* trying to develop solutions for caching POST or other dynamically generated data. If somebody is aware of ongoing work on this subject, I would like to know more about it.

A (possibly not too hard) way to approach the problem, without requiring an upgrade of all the clients in the world, could be the use of an embedded language (JavaScript, awk, perl, tcl, whatever) at least on the caches. The server could then feed the cache with the raw data and the code, for local processing; the second sketch below shows what this might look like. Once this is done, non-compliant services would slowly die out because they are intrinsically less efficient than the others: caches could even refuse to cache them! The main question is, how long would it take to reach agreement on a language?
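Sketch 1 - the expansion problem of case 2) above. A minimal perl CGI script (the file name and URLs are made up for illustration) that expands a bitmap of terminal states, one bit per terminal, into an HTML table with one anchor per terminal:

    #!/usr/bin/perl
    # read the raw database: 32 bytes = 256 terminals, one bit each
    open(DB, "terminals.bits") || die "cannot open database: $!";
    read(DB, $bits, 32);
    close(DB);

    print "Content-type: text/html\n\n";
    print "<html><body><table border>\n";
    for ($i = 0; $i < 8 * length($bits); $i++) {
        $state = vec($bits, $i, 1) ? "busy" : "free";   # extract bit $i
        print "<tr><td><a href=\"/cgi-bin/term?$i\">terminal $i</a></td>",
              "<td>$state</td></tr>\n";
    }
    print "</table></body></html>\n";

Here every single bit of the database turns into roughly 70 bytes of HTML, an expansion of more than 500 times, in line with the figures above.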
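Sketch 2 - the proposed scheme, as it might look on a cache. All the names here are made up, and the embedded language is perl itself (via eval) purely for illustration; a real cache would need an agreed-upon language and a proper sandbox. The point is that the cache fetches the raw database and the formatting code once, then answers every query locally:

    #!/usr/bin/perl
    %data = ();   # url -> raw database, fetched once from the server
    %code = ();   # url -> formatting code, fetched once from the server

    # stand-in for an HTTP fetch from the origin server
    sub fetch {
        local($name, $/) = @_;          # undef $/ -> slurp whole file
        open(F, $name) || die "cannot fetch $name: $!";
        local($x) = scalar(<F>);
        close(F);
        $x;
    }

    sub serve_query {
        local($url, $query) = @_;
        # two fetches per database, instead of one per query
        $data{$url} = &fetch("$url.data") unless defined $data{$url};
        $code{$url} = &fetch("$url.code") unless defined $code{$url};
        local($raw) = $data{$url};
        # the server-supplied code sees $raw and $query, returns HTML
        eval $code{$url};
    }

With something like this, the 500-record exam database of case 1) would cross the network once, and all subsequent queries would be served from the cache.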
Luigi

====================================================================
Luigi Rizzo                             email: luigi@iet.unipi.it
Dip. di Ingegneria dell'Informazione    tel: +39-50-568533
Universita' di Pisa                     fax: +39-50-568522
via Diotisalvi 2, 56126 PISA (Italy)    http://www.iet.unipi.it/~luigi/
====================================================================
Received on Friday, 5 January 1996 15:38:03 UTC