- From: Yngve N. Pettersen (Developer Opera Software ASA) <yngve@opera.com>
- Date: Mon, 08 Jun 2009 03:08:59 +0200
- To: "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
Hello all,

When reading the p6-06 draft it seemed to me that the new phrasing seems to forbid a client's own cache from using the received response again under any circumstance, which I think is slightly different from my interpretation of what RFC 2616 says.

RFC 2616 says:

  14.9.2 no-store

  The purpose of the no-store directive is to prevent the inadvertent release or retention of sensitive information (for example, on backup tapes). The no-store directive applies to the entire message, and MAY be sent either in a response or in a request. If sent in a request, a cache MUST NOT store any part of either this request or any response to it. If sent in a response, a cache MUST NOT store any part of either this response or the request that elicited it. This directive applies to both non-shared and shared caches. "MUST NOT store" in this context means that the cache MUST NOT intentionally store the information in non-volatile storage, and MUST make a best-effort attempt to remove the information from volatile storage as promptly as possible after forwarding it.

  Even when this directive is associated with a response, users might explicitly store such a response outside of the caching system (e.g., with a "Save As" dialog). History buffers MAY store such responses as part of their normal operation.

  The purpose of this directive is to meet the stated requirements of certain users and service authors who are concerned about accidental releases of information via unanticipated accesses to cache data structures. While the use of this directive might improve privacy in some cases, we caution that it is NOT in any way a reliable or sufficient mechanism for ensuring privacy. In particular, malicious or compromised caches might not recognize or obey this directive, and communications networks might be vulnerable to eavesdropping.

p6-cache says:

  3.2.2 no-store

  The no-store response directive indicates that a cache MUST NOT store any part of either the immediate request or response.
  This directive applies to both non-shared and shared caches. "MUST NOT store" in this context means that the cache MUST NOT intentionally store the information in non-volatile storage, and MUST make a best-effort attempt to remove the information from volatile storage as promptly as possible after forwarding it.

  This directive is NOT a reliable or sufficient mechanism for ensuring privacy. In particular, malicious or compromised caches might not recognize or obey this directive, and communications networks may be vulnerable to eavesdropping.

To me it seems that the new phrasing forbids the client's own cache from using the received response even when the resource is referenced multiple times from the same document, which is common for some sites using small spacer images or other small icons, or by multiple documents, like style sheets and images. (The text also seems to have lost the history reference, though Sec. 4 may make up for that.)

I agree that for proxies the requirement to discard immediately makes sense. But for clients it is IMO not just a waste of bandwidth (particularly on performance-restricted devices) to reload such resources multiple times, even for the same document; it would probably also require significant changes in how clients handle resources. It also essentially duplicates the no-cache directive in some respects regarding reuse, although it does go a little further ("must not reuse").

I'll remind you of sec 1.1, "Caching would be useless if it did not significantly improve performance". The above text will significantly reduce performance in clients if implemented according to my current understanding of it, and IMO such a reduction is unnecessary even from a security perspective.

Opera's implementation of this directive since we implemented it has been "Do not store to filesystem, keep in RAM, discard quickly when it is no longer in use".
Such resources are re-used just like any other resource in the cache that is not specially treated (POST form results being an example of specially treated ones), and are re-validated if necessary when expired. The only difference is that they are not written to the disk cache part of our caching system. (This does not prevent virtual memory swapping from writing them to disk; other measures are being considered for that, but that problem applies to all use of these data, including display.)

Another aspect of this is that, in my experience, quite a lot of sites, as well as the default configuration of several Wiki packages, automatically send the no-store directive, along with must-revalidate, even when there is no need for it.

A while back MAMA, our structural web search engine (see http://dev.opera.com/articles/view/mama/ ), did a crawl of the Alexa top million sites and other sites, and while the crawl was still underway I asked for a list of sites using the no-store directive. The resulting list contained ~300000 unique sites out of over 4 million URLs scanned (total), of which ~50000 (5%) were on the Alexa list, some of them quite high on the list. As the scan was not complete, the actual numbers are probably higher.

Examples included these URLs (checked early April):

  http://www.mediabox.fr/
  http://wiki.mediabox.fr/
  http://www.tayloryourevent.com/
  http://joomla-wiki.de/doku.php
  http://sourceforge.net/
  http://technorati.com/
  http://secondlife.com/
  http://www.alltheweb.com/
  http://babynames.com/
  http://broadwayworld.com/article/Photo_Coverage_reasons_to_be_pretty_Opening_Night_Celebration_20000101

As you will see, many of these are well-known sites, and almost all of them are front pages, which are unlikely to be sensitive or to change very frequently (as in: every few seconds or minutes, and that could be handled using no-cache).
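
To make the client behaviour described above concrete (keep a no-store response in RAM, reuse it like any other entry, never write it to disk, discard it once it is no longer in use), here is a minimal Python sketch. It is only an illustration under stated assumptions and not Opera's actual code: `ClientCache`, `parse_cache_control`, `release` and the example URL are hypothetical names, the "disk" tier is a plain dictionary standing in for persistent storage, and expiry and re-validation are left out.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into lower-cased directives.

    Simplification: splits on every comma, so quoted arguments that
    themselves contain commas are not handled.
    """
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives


@dataclass
class ClientCache:
    """Hypothetical two-tier client cache: volatile memory plus a 'disk' tier."""
    memory: Dict[str, bytes] = field(default_factory=dict)
    disk: Dict[str, bytes] = field(default_factory=dict)  # stand-in for files
    no_store: Set[str] = field(default_factory=set)

    def store(self, url: str, body: bytes, cache_control: str) -> None:
        self.memory[url] = body
        if "no-store" in parse_cache_control(cache_control):
            # Keep in RAM only; never touches the persistent tier.
            self.no_store.add(url)
        else:
            self.disk[url] = body

    def lookup(self, url: str) -> Optional[bytes]:
        # no-store entries are reused like any other entry while in memory.
        if url in self.memory:
            return self.memory[url]
        return self.disk.get(url)

    def release(self, url: str) -> None:
        """Called when no open document references the resource any more."""
        if url in self.no_store:
            self.memory.pop(url, None)
            self.no_store.discard(url)


cache = ClientCache()
cache.store("http://example.com/spacer.gif", b"GIF89a", "no-store, must-revalidate")
# Second reference from the same document is served from memory, not the network.
reused = cache.lookup("http://example.com/spacer.gif")
```

Under this sketch a spacer image referenced many times in one document is fetched once, which is the reuse the p6-06 wording appears to rule out, while nothing marked no-store reaches the persistent tier.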

A point about broadwayworld.com: their *articles* are using

  Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0

while the *front* pages (which are the dynamic ones) don't.

Additionally, the default in PHP (at least my copy of v5.0.4) seems to be "Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0", and I seem to recall that the MoinMoin wiki had that, too, at least last year (but in that case I may be misremembering).

Given the extensive use of no-store in situations where it does not seem necessary, I have started wondering if Opera needs to start ignoring the no-store header in non-HTTPS responses, just like we currently accept must-revalidate (interpreted as re-validate on history navigation) only for HTTPS responses. No decision has been reached yet.

My recommendation is that the text describing the no-store response directive be phrased so that all caches are forbidden from storing the response to non-volatile media and must clear it away ASAP after use (as it is currently phrased), and that caches that are not part of the client MUST NOT use the response when responding to another request, while allowing *clients* to use their locally stored copy for as long as other cache policies permit.

Looking forward, past http-bis: given the apparent amount of misunderstanding about the current cache directives (I regularly receive questions from customers and bug reports claiming that no-cache means "do not use again", while it only means "revalidate each time you load the document"), I am starting to reach the conclusion that no-cache, no-store and must-revalidate should be discarded and replaced with more descriptive names (which should include the context of when they are to be used), for example on-load-revalidate, sensitive-content-storage and on-navigate-revalidate, respectively, or words to that effect. If a must-not-reuse indication is needed, then it should also directly say so, e.g.
single-use-response or unique-response.

Also, while only distantly related, as I've pointed out earlier, HTTP is currently missing a mechanism to let servers invalidate a group of cache entries, for example during logout. I have suggested such a cache context mechanism in a draft (the most recent version is currently expired, but I am planning to refresh it; the most recent version is available at http://my.opera.com/yngve/blog/2008/11/06/refreshed-internet-drafts ).

-- 
Sincerely,
Yngve N. Pettersen
********************************************************************
Senior Developer                Email: yngve@opera.com
Opera Software ASA              http://www.opera.com/
Phone: +47 24 16 42 60          Fax: +47 24 16 40 01
********************************************************************
Received on Monday, 8 June 2009 01:09:42 UTC