
Re: FYI Cache-control deployment

From: Roy T. Fielding <fielding@gbiv.com>
Date: Wed, 25 Nov 2009 13:52:13 -0800
Cc: HTTP Working Group <ietf-http-wg@w3.org>
Message-Id: <60F38E91-8BE2-4CD5-95D1-797C8BE8E15D@gbiv.com>
To: Adrien de Croy <adrien@qbik.com>
On Nov 25, 2009, at 1:27 PM, Adrien de Croy wrote:
> Hi All
> 
> This is just for interest's sake.  As part of our load testing, we hammer our proxy with a whole bunch of crawlers out on the 'net.  In the last run we were testing our new cache.  After about a million hits crawling sites, I was wondering why we only had about 200,000 files in cache.  We cache anything with a cache validator (ETag, Last-Modified), freshness info (Expires), or appropriate Cache-Control response directives (max-age, s-maxage, public, must-revalidate, etc.).  It seemed to me the cacheability of the net was not great, which limits cache effectiveness.
> 
> So I turned on counting of each different Cache-Control header combination we received.  The results were quite interesting.
> 
> * About 70% of responses didn't include a Cache-Control header at all

Which means they use the default caching, as intended.
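
To make "default caching" concrete: when a response carries neither Cache-Control nor Expires, a cache may still assign it a heuristic freshness lifetime derived from Last-Modified. The following is a minimal sketch, not any particular proxy's implementation. The function name, the lowercase-keyed `headers` dict, and the 10% fraction (a commonly suggested heuristic) are assumptions made for illustration.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime

# Assumption: use 10% of the time since last modification as the lifetime.
HEURISTIC_FRACTION = 0.10


def heuristic_freshness(headers, fraction=HEURISTIC_FRACTION):
    """Return a heuristic freshness lifetime, or None if none applies.

    `headers` is a dict keyed by lowercase header name (hypothetical shape).
    """
    # Explicit freshness information takes precedence over heuristics.
    if "cache-control" in headers or "expires" in headers:
        return None
    date = headers.get("date")
    last_modified = headers.get("last-modified")
    if not date or not last_modified:
        return None
    # Time elapsed between the last modification and the response date.
    age = parsedate_to_datetime(date) - parsedate_to_datetime(last_modified)
    if age <= timedelta(0):
        return None
    return age * fraction
```

For a response dated ten days after its Last-Modified time, this sketch yields a lifetime of about one day. A real cache would also check the request method and status code before storing anything.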

> * Of the remaining 30%, about 80% used the Cache-Control header to prevent caching (no-store, private).

Again, that's often intended.
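
The per-combination tally described in the quoted message could be sketched roughly as below. This is an illustration only, not the proxy's actual counting code: `directive_combo` and `tally` are hypothetical names, and the parsing is deliberately simplified (it splits on commas, so a quoted argument containing a comma would be split incorrectly).

```python
from collections import Counter


def directive_combo(cache_control):
    """Normalize a Cache-Control value to a sorted tuple of directive names.

    None (header absent) and an empty value both map to the empty tuple.
    """
    if cache_control is None:
        return ()
    names = set()
    for part in cache_control.split(","):
        # Drop any "=value" argument and normalize case and whitespace.
        name = part.split("=", 1)[0].strip().lower()
        if name:
            names.add(name)
    return tuple(sorted(names))


def tally(values):
    """Count how often each distinct directive combination occurs."""
    return Counter(directive_combo(v) for v in values)
```

Feeding it one Cache-Control value per response (or None when the header is missing) gives a count per combination, so "no-store, private" and "private,no-store" land in the same bucket. With the approximate shares quoted above, 30% of responses carrying the header and 20% of those not preventing caching works out to about 6% of all responses, close to the roughly 7% quoted.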

> So only about 7% of sites seem to be using Cache-Control to actually specify how to cache something (e.g. specify freshness and revalidation information).  This is quite disappointing.
> 
> There were quite a few sites that sent conflicting directives. The private directive is odd, since there was no authentication going on.

The private directive indicates that a cacheable response is not
to be shared, even though no authentication is going on.  If auth
were present, there would be no need to indicate private,
because that is the default with auth.
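
The distinction can be sketched as a storage decision for a shared cache such as a proxy. This is a simplified illustration under stated assumptions: `shared_cache_may_store` is a hypothetical helper, `request_headers` is assumed to be a dict keyed by lowercase header name, and the checks on method, status code, and freshness that a real cache needs are left out.

```python
def shared_cache_may_store(request_headers, response_cache_control):
    """Decide whether a shared cache may store a response (simplified)."""
    # Reduce the Cache-Control value to a set of lowercase directive names.
    directives = {
        part.split("=", 1)[0].strip().lower()
        for part in (response_cache_control or "").split(",")
        if part.strip()
    }
    # no-store forbids storing at all; private forbids shared storage,
    # whether or not the request was authenticated.
    if "no-store" in directives or "private" in directives:
        return False
    # With Authorization on the request, the response is treated as private
    # by default unless the origin explicitly opts in to shared caching.
    if "authorization" in request_headers:
        return bool(directives & {"public", "s-maxage", "must-revalidate"})
    return True
```

Under this sketch, an unauthenticated response marked private is still kept out of the shared cache, while an authenticated one needs no private directive to get the same treatment.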

> The numbers above are only approximate; if anyone is interested, I can post better / more rigorous results after our next test.
> It does seem to show on the face of it that
> 
> a) Cache-Control isn't well supported in the wild

No, that is not what it means at all.

> b) There's a lot of confusion about Cache-Control directives (based on the combinations people choose).

I have no cure for that.  No additional specification will
help those people.  Splitting caching into a separate part might.

....Roy
Received on Wednesday, 25 November 2009 21:52:49 GMT
