ACTION-845 - Finalize information on caching concept, now live, and contribute it to the list

Hi,
It was my action to update the information on a caching scheme that
seems more effective in terms of maximizing the usage of web caches.
 
The most popular usage of caching commands seems to be not to use any.
Second is probably to prevent all caching whatsoever.
This of course is a bad strategy, especially in a mobile environment.
 
Third, in my estimate, is to use access-based caching.
 
A typical command in an Apache configuration would be
 
ExpiresByType image/gif "A86400"

 
With this command, .gif images become stale 1 day (86400 seconds)
after they were accessed.
 
 
Key in this command is the letter "A", which can also be written as
"access".
 
It means that the time period set with the Expires header starts
counting from the time of access.
Once the counter reaches 0 the content is stale and a new request is
sent to the server.
This is true irrespective of whether the cache is a proxy or a browser
cache.
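 
For illustration, here is a minimal sketch of the two spellings side by
side. It assumes mod_expires is loaded; the ExpiresActive line is not
part of the command quoted above, but the Expires* directives have no
effect without it.

```apache
# mod_expires must be switched on for ExpiresByType to apply
ExpiresActive On

# Short form: "A" = time of access, followed by the lifetime in seconds
ExpiresByType image/gif "A86400"

# Long form of the same rule (use one or the other, not both)
# ExpiresByType image/gif "access plus 1 day"
```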
 
Sidenote:  
The ever so popular meta element commands (http-equiv) are not reliably
supported.
Actually, we found them to have no effect at all.
Even if they are listed in the markup, only a browser would read
them.  All proxies would be ignorant of their existence, as proxies do
not read markup, but HTTP headers.
The HTTP headers, however, do not seem to be modified by those commands.
 
To recap, if caching is being done at all, it is usually access based
caching.
 
However, modification-based caching has the advantage of requesting a
new resource if and only if the last-modified time stamp on the server
is different from that of the copy in the respective web cache.
 
This in turn has the advantage of allowing caches, potentially, to
maintain information indefinitely, provided the original resource has
not changed.
One cost is that a validation request (typically a conditional GET
carrying an If-Modified-Since header) must be made to the origin server
to enquire whether the resource has been modified.
If the resource is unchanged, however, the response is lightweight
(headers only) and does not cause significant load, especially when
considering the benefits.
 
In order to achieve modification-based caching several steps must be
taken:
 
A typical command may then be
 
ExpiresByType image/gif "modification plus 6 months"
 
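Here we are merely using the long form of the command syntax.
Key here is the word "modification", which makes the period count from
the resource's last-modified time rather than from the time of access.
Also note that the time period can be set much higher.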
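 
As a sketch of how this might look in a configuration file, again
assuming mod_expires is loaded: the commented-out short form is my own
conversion and assumes that mod_expires counts a month as 30 days. Note
also that, per the mod_expires documentation, a modification-based
setting only adds the Expires header to content that comes from a file
on disk.

```apache
ExpiresActive On

# Base the expiry on the file's last-modified time rather than on access
ExpiresByType image/gif "modification plus 6 months"

# Short form, assuming a month is counted as 30 days
# (6 * 30 * 86400 = 15552000 seconds)
# ExpiresByType image/gif "M15552000"
```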
 
Furthermore the following important Cache-Control directives must be
used:
 
proxy-revalidate - forces a proxy to revalidate a stale resource with
the origin server instead of serving it from its cache
no-cache - contrary to its name, this does not prevent caching; it
causes the browser to send a validation request and check the time
stamp before displaying a file.
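 
The exact directive used to send these two values is not shown here, so
the following is only one possible way to do it, assuming mod_headers
is available. The FilesMatch pattern is a hypothetical scope chosen to
match the .gif example.

```apache
<IfModule mod_headers.c>
    # Hypothetical scoping to .gif files; adjust the pattern as needed
    <FilesMatch "\.gif$">
        # "append" merges with an existing Cache-Control value
        # (such as a max-age) rather than overwriting it
        Header append Cache-Control "no-cache, proxy-revalidate"
    </FilesMatch>
</IfModule>
```

Whether the max-age value written by mod_expires is already present
when this directive runs depends on the server setup, so the resulting
response headers should be checked.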
 
IE displays odd behavior in that, if there is no proxy configured, it
will always request a fresh resource.
This, however, has no negative effect on whether the rendered resource
is up to date: it always will be.
 
 
Sidenote:
With this configuration the Expires time period loses its meaning.
Once an Expires period has passed, the max-age number will turn
negative, basically showing how long the resource has remained
unmodified beyond the Expires period.
Once the resource is modified, the Expires time period starts again and
may eventually go negative again.
 
 
 
Using these commands we were able to obtain the following results:
 
FF and Opera:
 
Correct behavior, irrespective of whether a proxy was configured or not.

*	response code 200 if a file was new or modified
*	response code 304 if unmodified.

IE:
 
Correct behavior, if a proxy is configured.

*	response code 200 if a file was new or modified
*	response code 304 if unmodified.

If no proxy is configured you always get a code 200.
 
Explanation:
If a proxy is configured, IE issues an If-Modified-Since header.
This causes the proxy to validate.  If there is no proxy configured,
this header is missing and the file is requested from the origin server.
The possibility of using the If-Modified-Since header to control the
requests was not followed up on, because it was discovered very late and
would have required re-testing.
 
 
One point of caution:  This type of caching allows the usage of
extremely long caching periods.
It will be very difficult to remove outdated resources from caches that
still consider them fresh.
 
-- Kai
 
 

Received on Friday, 12 September 2008 12:35:10 UTC