Re: FYI Cache-control deployment

On Nov 25, 2009, at 2:49 PM, Adrien de Croy wrote:
> Roy T. Fielding wrote:
>> On Nov 25, 2009, at 1:27 PM, Adrien de Croy wrote:
>>  
>>> Hi All
>>> 
>>> this is just for interest's sake.  As part of our load testing, we hammer our proxy with a whole bunch of crawlers out onto the 'net.  In the last run we were testing our new cache.  After about a million hits crawling sites, I was wondering why we only had about 200,000 files in cache.  We cache anything with a cache validator (ETag, Last-Modified), freshness info (Expires), or appropriate Cache-Control response directives (max-age, s-maxage, public, must-revalidate, etc.).  It seemed to me the cacheability of the net was not great, which limits cache effectiveness.
>>> 
>>> So I turned on counting of each different Cache-Control header combination we received.  The results were quite interesting.
>>> 
>>> * About 70% of responses didn't include a Cache-Control header at all
>>>    
>> 
>> Which means they use the default caching, as intended.
>>  
> Or not.  Since I'm only getting a 20% strike rate, I guess a large proportion of these aren't specifying any validators either.
> 
> One thing we don't do is heuristic caching.  Does this mean heuristic caching is the most-used form of caching?

The only thing heuristic about HTTP caching is the TTL.
Almost all implementations do some form of caching by default
when the HTTP message does not say that it can't be cached.
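
As a rough illustration of that point, here is a minimal sketch in
Python of default caching with a heuristic TTL.  The function names
and the one-tenth divisor are assumptions made for the example (the
divisor follows the commonly suggested fraction of the time since
Last-Modified); treating "private" as non-storable assumes a shared
cache such as a proxy.  It only covers responses that carry no
explicit freshness information (max-age, s-maxage, Expires), and it
is not a description of any particular implementation.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime

# Assumed value: guess a lifetime of one tenth of the document's age.
HEURISTIC_DIVISOR = 10


def forbids_storage(headers):
    """True if Cache-Control says a shared cache must not store the response."""
    raw = headers.get("cache-control", "")
    directives = {
        part.strip().split("=", 1)[0].lower()
        for part in raw.split(",")
        if part.strip()
    }
    return "no-store" in directives or "private" in directives


def heuristic_ttl(headers):
    """Return a guessed freshness lifetime, or None if no guess is possible.

    `headers` is a dict with lower-case header names (hypothetical shape).
    """
    if forbids_storage(headers):
        return None
    try:
        date = parsedate_to_datetime(headers["date"])
        last_modified = parsedate_to_datetime(headers["last-modified"])
        age = date - last_modified
    except (KeyError, TypeError, ValueError):
        # Missing or unparseable Date / Last-Modified: nothing to base a guess on.
        return None
    if age <= timedelta(0):
        return None
    return age / HEURISTIC_DIVISOR
```

For example, a response dated ten days after its Last-Modified time
would be treated as fresh for one day, without the origin server
having sent any Cache-Control header at all.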

>>> a) Cache-Control isn't well supported in the wild
>> 
>> No, that is not what it means at all.
>>  
> OK, what I should have said is not well used.  I wasn't trying to make a claim about whether the software supports it.  But it is used in a minority of responses, and then mostly to prohibit storage.

It isn't supposed to be used when unnecessary.  Bytes on
the wire are supposed to be avoided.

....Roy

Received on Thursday, 26 November 2009 00:58:08 UTC