- From: Adrien de Croy <adrien@qbik.com>
- Date: Thu, 26 Nov 2009 10:27:54 +1300
- To: HTTP Working Group <ietf-http-wg@w3.org>
Hi All,

This is just for interest's sake.

As part of our load testing, we hammer our proxy with a whole bunch of crawlers sent out onto the 'net. In the last run we were testing our new cache.

After about a million hits crawling sites, I was wondering why we only had about 200,000 files in cache. We cache anything with a cache validator (ETag, Last-Modified), freshness info (Expires), or appropriate Cache-Control response directives (max-age, s-maxage, public, must-revalidate etc).

It seemed to me that the cacheability of the net was not great, which limits cache effectiveness. So I turned on counting of each different Cache-Control header combination we received. The results were quite interesting.

* About 70% of responses didn't include a Cache-Control header at all.
* Of the remaining 30%, about 80% used the Cache-Control header to prevent caching (no-store, private).

So only about 7% of responses seem to be using Cache-Control to actually specify how to cache something (e.g. to specify freshness and revalidation information). This is quite disappointing.

There were quite a few sites that sent conflicting directives. The private directive is odd, since there was no authentication going on.

The numbers above are only approximate. If anyone is interested, I can post better / more rigorous results after our next test.

On the face of it, this does seem to show that:

a) Cache-Control isn't well supported in the wild
b) There's a lot of confusion about Cache-Control directives (based on the combinations people choose).

Cheers

Adrien

--
Adrien de Croy - WinGate Proxy Server - http://www.wingate.com
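For anyone wanting to repeat this kind of count, the tally of Cache-Control directive combinations described above could be sketched roughly as follows. This is an illustration only, not the WinGate code: the function names (`directive_combo`, `tally`, `summarise`) are hypothetical, directive values are ignored, and the naive split on commas does not handle quoted values that contain commas. Treating no-store and private as "preventing caching" follows the grouping used in the message.

```python
from collections import Counter

# Directives counted as "preventing caching" in this sketch (assumption:
# the same grouping as in the message, i.e. no-store and private).
PREVENTS_CACHING = {"no-store", "private"}


def directive_combo(header_value):
    """Reduce a Cache-Control value to a sorted tuple of directive names.

    Returns None when the header was absent. Directive values such as
    the 3600 in max-age=3600 are dropped; only the names are kept.
    """
    if header_value is None:
        return None
    names = set()
    for part in header_value.split(","):
        name = part.split("=", 1)[0].strip().lower()
        if name:
            names.add(name)
    return tuple(sorted(names))


def tally(header_values):
    """Count how often each distinct directive combination occurs."""
    return Counter(directive_combo(v) for v in header_values)


def summarise(counts):
    """Split the tally into absent / preventing / other responses."""
    total = sum(counts.values())
    absent = counts.get(None, 0)
    present = total - absent
    preventing = sum(
        n for combo, n in counts.items()
        if combo is not None and PREVENTS_CACHING & set(combo)
    )
    return {
        "total": total,
        "absent": absent,
        "present": present,
        "preventing": preventing,
        "other": present - preventing,
    }
```

Each response contributes one entry to `header_values`: the raw Cache-Control string, or `None` if the header was missing. Sorting the directive names means that `public, max-age=60` and `max-age=3600, public` are counted as the same combination.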
Received on Wednesday, 25 November 2009 21:24:25 UTC