Re: Cache questions. from Henrik Frystyk Nielsen on 1998-11-20 (www-lib@w3.org from October to December 1998)

From: Henrik Frystyk Nielsen <frystyk@w3.org>
Date: Fri, 20 Nov 1998 14:25:00 -0500
To: olga@goliath.eai.com, www-lib@w3.org
Message-Id: <3.0.5.32.19981120142500.033e2cd0@localhost>
At 18:34 11/19/98 -0600, olga wrote:

>Is that possible that function getting URL would return file pointer to the
>cached object (if it is in cache)? Now the function LoadToFile (if url is
>in cache) copies from cache to file. I just want to read from those files, and
>there are many of them in my application so the copying is very inefficient.
>
>Also wwwlib allows to get pointer to cached file (using HTCache_find and
>HTCache_name) there seem to be no function to explicitly verify freshness of
>the cached object. The function HTCache_isFresh just returns flag indicating
>that object should be validated (no validation function is provided - at
>least I did not find one).

This is not how libwww works. Libwww is based on URIs - that is, it can
potentially access anything that has a URI (although in practice, there is
a limited set of access handlers (HTTP, FTP, News, Gopher, WAIS, local
file, and telnet)).

In order to access a URI, the application must create a request object. By
default the request object is created with an often-used set of parameters,
but they can all be changed by the caller on a case by case basis, or in
certain situations be applied to all requests.

When a request has been issued, libwww accesses the resource and returns a
response and maybe a stream containing the actual data (at higher levels
this stream can be turned into a memory buffer but this is not required).
For optimization purposes, libwww has a persistent cache that can store
responses for some time. It is important to realize that the cache is not
accessed directly by the application because the cached responses do not
have URIs - they are in a sense not on the Web; they are cached responses of
resources that actually are on the Web.

If set up to use the persistent cache, every time a new request is issued,
libwww checks to see if it already has a valid response and if so
returns this without accessing the resource remotely. By default, the
freshness time of a cached response is a function of the cache parameters
set in the response but that can be overruled by a set of parameters that
can be set in the request object.

In particular, the options can be set by using these functions:

	extern void HTRequest_setReloadMode (HTRequest *request, HTReload mode);
	extern HTReload HTRequest_reloadMode (HTRequest *request);

where the mode can be one of:

	typedef enum _HTReload {
	    HT_CACHE_OK             = 0x0,   /* Use any version available */
	    HT_CACHE_FLUSH_MEM      = 0x1,   /* Reload from file cache or network */
	    HT_CACHE_VALIDATE       = 0x2,   /* Validate cache entry */
	    HT_CACHE_END_VALIDATE   = 0x4,   /* End to end validation */
	    HT_CACHE_RANGE_VALIDATE = 0x8,
	    HT_CACHE_FLUSH          = 0x10,  /* Force full reload */
	    HT_CACHE_ERROR          = 0x20   /* An error occurred in the cache */
	} HTReload;
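
To make the effect of the reload mode concrete, here is a rough model of the
decision. It is only a sketch and not libwww code: the enum is copied from
above so that it compiles on its own, SketchAction and sketch_decide are
hypothetical names, and HT_CACHE_FLUSH_MEM, HT_CACHE_RANGE_VALIDATE and
HT_CACHE_ERROR are deliberately not modelled.

```c
#include <stdbool.h>

/* Reload flags copied from the enum above so the sketch compiles without libwww. */
typedef enum {
    HT_CACHE_OK             = 0x0,
    HT_CACHE_FLUSH_MEM      = 0x1,
    HT_CACHE_VALIDATE       = 0x2,
    HT_CACHE_END_VALIDATE   = 0x4,
    HT_CACHE_RANGE_VALIDATE = 0x8,
    HT_CACHE_FLUSH          = 0x10,
    HT_CACHE_ERROR          = 0x20
} HTReload;

/* Hypothetical outcome of a cache lookup; not a libwww type. */
typedef enum {
    SKETCH_USE_CACHED,   /* return the stored response as is */
    SKETCH_REVALIDATE,   /* check with the server whether the entry is still good */
    SKETCH_FETCH         /* ignore the cache and load over the network */
} SketchAction;

/* Simplified decision: the request's reload mode can override the
   freshness of the cached entry. */
SketchAction sketch_decide(HTReload mode, bool have_entry, bool entry_fresh)
{
    /* Nothing cached, or a full reload was forced. */
    if (!have_entry || (mode & HT_CACHE_FLUSH))
        return SKETCH_FETCH;

    /* The caller asked for validation even if the entry looks fresh. */
    if (mode & (HT_CACHE_VALIDATE | HT_CACHE_END_VALIDATE))
        return SKETCH_REVALIDATE;

    /* Default mode: a fresh entry is used directly, a stale one is validated. */
    return entry_fresh ? SKETCH_USE_CACHED : SKETCH_REVALIDATE;
}
```

In this model a request left at HT_CACHE_OK never touches the network as long
as the entry is fresh, which is the default behavior described above.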

The cache can of course also be configured in different ways, as documented at

	http://www.w3.org/Library/src/HTCache.html

If you want to use the persistent cache, then you simply turn it on and use
the same URIs as you normally would. If you need to change the default
behavior explicitly, you can do so by using the flags above.

If these flags are not sufficient then there is also a mechanism for
directly controlling cache-control directives as defined by HTTP/1.1:

	extern BOOL HTRequest_addCacheControl        (HTRequest * request,
	                                              char * token, char *value);
	extern BOOL HTRequest_deleteCacheControlAll  (HTRequest * request);
	extern HTAssocList * HTRequest_cacheControl  (HTRequest * request);

However, it is important to realize that this is an HTTP feature and does
*not* affect the local libwww persistent cache.
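
As an illustration of what such token/value pairs amount to on the wire, here
is a small stand-alone sketch that formats them as an HTTP/1.1 Cache-Control
header line. It does not use libwww: SketchDirective and
sketch_format_cache_control are hypothetical names, and treating an empty
value as "directive without argument" is an assumption of the sketch.

```c
#include <stdio.h>

/* Hypothetical token/value pair standing in for one cache-control
   directive attached to a request; not a libwww type. */
typedef struct {
    const char *token;   /* directive name, e.g. "max-age" */
    const char *value;   /* directive argument, or "" when there is none */
} SketchDirective;

/* Writes "Cache-Control: a=1, b" into out.
   Returns the number of characters written, or -1 if out is too small. */
int sketch_format_cache_control(const SketchDirective *d, size_t n,
                                char *out, size_t cap)
{
    size_t used;
    int w = snprintf(out, cap, "Cache-Control: ");
    if (w < 0 || (size_t)w >= cap)
        return -1;
    used = (size_t)w;

    for (size_t i = 0; i < n; i++) {
        const char *sep = (i == 0) ? "" : ", ";   /* directives are comma separated */
        if (d[i].value && d[i].value[0])
            w = snprintf(out + used, cap - used, "%s%s=%s",
                         sep, d[i].token, d[i].value);
        else
            w = snprintf(out + used, cap - used, "%s%s", sep, d[i].token);
        if (w < 0 || (size_t)w >= cap - used)
            return -1;                            /* output would be truncated */
        used += (size_t)w;
    }
    return (int)used;
}
```

With the pairs ("max-age", "0") and ("no-cache", "") this yields the line
"Cache-Control: max-age=0, no-cache", which is addressed to the server and any
proxies along the way.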

>It would be good to have function that allows to read cached object directly
>from cache (and the validation could be done automatically). Also explicit
>validation function would be helpful.
>
>Also in cache's .index file for all entries "must_revalidate" has value 0
>(I guess that "must_revalidate" is the last value for each cached entry in
>.index (?)). In my application I am setting:  
>
>        HTRequest_setReloadMode(req,HT_CACHE_VALIDATE); // for all requests
>        HTCacheMode_setExpires(HT_EXPIRES_AUTO);

The description of HTCacheMode_setExpires says that it is for handling the
history, which is different from the persistent cache. The HTTP/1.1
specification describes the difference in more detail.
    
>But still for cached entries HTCache_isFresh always returns HT_CACHE_OK. 
>I also set: 

You should use the flags that I showed above. By default, the cache returns
OK if the cached response is still fresh.
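
For reference, "fresh" here follows the HTTP/1.1 rule that a response is fresh
while its current age is less than its freshness lifetime. The following is a
simplified sketch of that rule and not the libwww implementation: SketchEntry
and both functions are hypothetical, the age calculation leaves out the
corrections for clock skew and response delay, and no heuristic lifetime is
applied when the response carries neither max-age nor Expires.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical subset of what a cache remembers about a response. */
typedef struct {
    time_t response_time;   /* local time at which the response arrived */
    long   age_on_arrival;  /* Age header in seconds, 0 if there was none */
    long   max_age;         /* Cache-Control max-age in seconds, -1 if absent */
    time_t date;            /* Date header */
    time_t expires;         /* Expires header, (time_t)-1 if absent */
} SketchEntry;

/* Freshness lifetime: max-age takes precedence over Expires minus Date. */
long sketch_freshness_lifetime(const SketchEntry *e)
{
    if (e->max_age >= 0)
        return e->max_age;
    if (e->expires != (time_t)-1)
        return (long)difftime(e->expires, e->date);
    return 0;   /* no explicit information: treated as stale in this sketch */
}

/* Fresh while the current age is less than the freshness lifetime. */
bool sketch_is_fresh(const SketchEntry *e, time_t now)
{
    long current_age = e->age_on_arrival
                     + (long)difftime(now, e->response_time);
    return current_age < sketch_freshness_lifetime(e);
}
```

A response received with max-age=60 is therefore fresh 30 seconds later and
has to be validated once 60 seconds have passed.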

Henrik
--
Henrik Frystyk Nielsen,
World Wide Web Consortium
http://www.w3.org/People/Frystyk
Received on Friday, 20 November 1998 14:25:14 UTC