- From: Yngve N. Pettersen (Developer Opera Software ASA) <yngve@opera.com>
- Date: Mon, 08 Jun 2009 03:08:59 +0200
- To: "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
Hello all,
When reading the p6-06 draft, it seemed to me that the new phrasing
forbids a client's own cache from using the received response again under
any circumstance, which I think differs slightly from my interpretation
of what RFC 2616 says.
RFC 2616 says:
14.9.2
no-store
The purpose of the no-store directive is to prevent the
inadvertent release or retention of sensitive information (for
example, on backup tapes). The no-store directive applies to the
entire message, and MAY be sent either in a response or in a
request. If sent in a request, a cache MUST NOT store any part of
either this request or any response to it. If sent in a response,
a cache MUST NOT store any part of either this response or the
request that elicited it. This directive applies to both non-
shared and shared caches. "MUST NOT store" in this context means
that the cache MUST NOT intentionally store the information in
non-volatile storage, and MUST make a best-effort attempt to
remove the information from volatile storage as promptly as
possible after forwarding it.
Even when this directive is associated with a response, users
might explicitly store such a response outside of the caching
system (e.g., with a "Save As" dialog). History buffers MAY store
such responses as part of their normal operation.
The purpose of this directive is to meet the stated requirements
of certain users and service authors who are concerned about
accidental releases of information via unanticipated accesses to
cache data structures. While the use of this directive might
improve privacy in some cases, we caution that it is NOT in any
way a reliable or sufficient mechanism for ensuring privacy. In
particular, malicious or compromised caches might not recognize or
obey this directive, and communications networks might be
vulnerable to eavesdropping.
p6-cache says:
3.2.2
no-store
The no-store response directive indicates that a cache MUST NOT
store any part of either the immediate request or response. This
directive applies to both non-shared and shared caches. "MUST NOT
store" in this context means that the cache MUST NOT intentionally
store the information in non-volatile storage, and MUST make a
best-effort attempt to remove the information from volatile
storage as promptly as possible after forwarding it.
This directive is NOT a reliable or sufficient mechanism for
ensuring privacy. In particular, malicious or compromised caches
might not recognize or obey this directive, and communications
networks may be vulnerable to eavesdropping.
To me, the new phrasing seems to forbid the client's own cache from using
the received response even when the resource is referenced multiple times
from the same document (common for sites using small spacer images or
other small icons) or by multiple documents (as with style sheets and
images). (The text also seems to have lost the history reference, though
Sec. 4 may make up for that.)
I agree that for proxies the requirement to discard immediately makes
sense. But for clients, reloading such resources multiple times, even for
the same document, is IMO not just a waste of bandwidth (particularly on
performance-restricted devices); it would probably also require
significant changes in how clients handle resources. It also essentially
duplicates the no-cache directive in some respects about reuse, although
it does go a little further ("must not reuse"). I'll remind you of sec 1.1
"Caching would be useless if it did not significantly improve
performance". The above text will significantly reduce performance in
clients if implemented according to my current understanding of it, and
IMO such a reduction is unnecessary even from a security perspective.
Opera's implementation of this directive has always been "Do not store to
filesystem, keep in RAM, discard quickly when it is no longer in use".
Such resources are re-used just like any other resource in the cache that
is not specially treated (POST form results, for example, are), and are
re-validated if necessary when expired. The only difference is that they
are not written to the disk cache part of our caching system. (This does
not prevent virtual memory swapping from writing them to disk; other
measures are being considered for that, but that problem applies to all
use of this data, including display.)
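
The behaviour described above can be sketched roughly as follows. This is
an illustrative Python model, not Opera's actual code: the class
TwoTierCache, its method names, and the use of dictionaries to stand in
for RAM and the disk cache are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TwoTierCache:
    """Hypothetical client cache with a RAM tier and a simulated disk tier."""

    ram: dict = field(default_factory=dict)
    disk: dict = field(default_factory=dict)

    def store(self, url: str, body: bytes, no_store: bool) -> None:
        # Every response is kept in RAM so the current document(s) can reuse it.
        self.ram[url] = body
        if no_store:
            # no-store: never write to disk, and drop any older disk copy.
            self.disk.pop(url, None)
        else:
            self.disk[url] = body

    def lookup(self, url: str) -> Optional[bytes]:
        # Reuse works the same way for no-store and ordinary entries.
        if url in self.ram:
            return self.ram[url]
        return self.disk.get(url)

    def release(self, url: str) -> None:
        # Called when no open document uses the resource any longer.
        # Entries without a disk copy (the no-store ones) are discarded.
        if url not in self.disk:
            self.ram.pop(url, None)


cache = TwoTierCache()
cache.store("http://example.com/spacer.gif", b"GIF89a", no_store=True)
print(cache.lookup("http://example.com/spacer.gif") is not None)  # reusable
cache.release("http://example.com/spacer.gif")
print(cache.lookup("http://example.com/spacer.gif"))  # gone after release
```

Freshness checks and re-validation are left out to keep the sketch short;
a real cache would apply them to no-store entries exactly as to any other.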
Another aspect of this is that quite a lot of sites, as well as the
default configuration of several Wiki packages, in my experience,
automatically send the no-store directive, along with must-revalidate,
even when there is no need for it.
A while back MAMA, our structural web search engine, see
http://dev.opera.com/articles/view/mama/ , did a crawl of the Alexa top
million sites and other sites, and while the crawl was still underway I
asked for a list of sites using the no-store directive.
The resulting list contained ~300000 unique sites out of over 4 million
URLs scanned in total. Of these, ~50000 were on the Alexa list (5% of that
list), some of them quite high on the list. As the scan was not complete,
the actual numbers are probably higher.
Examples included these URLs (checked early April) :
http://www.mediabox.fr/
http://wiki.mediabox.fr/
http://www.tayloryourevent.com/
http://joomla-wiki.de/doku.php
http://sourceforge.net/
http://technorati.com/
http://secondlife.com/
http://www.alltheweb.com/
http://babynames.com/
http://broadwayworld.com/article/Photo_Coverage_reasons_to_be_pretty_Opening_Night_Celebration_20000101
As you will see, many of these are well-known sites, and almost all of
them are front pages, which are unlikely to be sensitive or to change very
frequently (as in every few seconds or minutes; even that could be handled
using no-cache).
A point about broadwayworld.com : Their *articles* are using
Cache-Control: no-store, no-cache, must-revalidate, post-check=0,
pre-check=0
while the *front* pages (which are the dynamic ones) don't.
Additionally, the default in PHP (at least my copy of v5.0.4) seems to be
"Cache-Control: no-store, no-cache, must-revalidate, post-check=0,
pre-check=0", and I seem to recall that the MoinMoin wiki had that, too,
at least last year (but in that case I may be misremembering).
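
For readers who want to check a header for this pattern, here is a minimal
Python sketch that splits a Cache-Control value into its directives. The
function name parse_cache_control is made up for the example, and the
parser is deliberately simplistic: it splits on commas and so does not
handle quoted arguments that themselves contain commas.

```python
def parse_cache_control(value: str) -> dict:
    """Map each directive name (lower-cased) to its argument, or None."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        # Directives without "=" (such as no-store) get None as argument.
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives


# The header value quoted above.
header = "no-store, no-cache, must-revalidate, post-check=0, pre-check=0"
parsed = parse_cache_control(header)
print("no-store" in parsed)
print(sorted(parsed))
```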
Given the extensive use of no-store in situations where it does not seem
necessary, I have started wondering whether Opera needs to start ignoring
the no-store directive in non-HTTPS responses, just as we currently accept
must-revalidate (interpreted as re-validate on history navigation) only
for HTTPS responses. No decision has been reached yet.
My recommendation is that the text describing the no-store response
directive be phrased so that all caches are forbidden from storing the
response on non-volatile media and must clear it away ASAP after use (as
it is currently phrased), and so that caches that are not part of the
client MUST NOT use the response when responding to another request,
while *clients* are allowed to use their locally stored copy for as long
as other cache policies permit.
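
The proposed split between client caches and other caches could be
expressed as two small predicates. This is only a sketch of the suggestion
in this message, not text from either specification; the names CacheKind,
may_write_to_disk and may_reuse are invented for the example.

```python
from enum import Enum


class CacheKind(Enum):
    CLIENT = "client"              # cache built into the user agent
    INTERMEDIARY = "intermediary"  # proxy or any other cache outside the client


def may_write_to_disk(no_store: bool) -> bool:
    # Applies to every kind of cache: no-store forbids non-volatile storage.
    return not no_store


def may_reuse(kind: CacheKind, no_store: bool, allowed_by_other_policy: bool) -> bool:
    # A cache outside the client must not answer another request with a
    # no-store response.
    if no_store and kind is CacheKind.INTERMEDIARY:
        return False
    # A client may reuse its in-memory copy for as long as the other cache
    # rules (freshness, no-cache re-validation, and so on) permit.
    return allowed_by_other_policy


print(may_reuse(CacheKind.CLIENT, no_store=True, allowed_by_other_policy=True))
print(may_reuse(CacheKind.INTERMEDIARY, no_store=True, allowed_by_other_policy=True))
```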
Looking forward, past http-bis, given the apparent amount of
misunderstanding about the current cache directives (I receive regular
questions from customers and bug reports claiming that no-cache means "do
not use again", while it only means "revalidate each time you load the
document") I am starting to reach the conclusion that no-cache, no-store
and must-revalidate should be discarded and replaced with more descriptive
names (which should include the context of when they are to be used), for
example, on-load-revalidate, sensitive-content-storage,
on-navigate-revalidate, respectively, or words to that effect. If a
must-not-reuse indication is needed, then it should also directly say so,
e.g. single-use-response or unique-response.
Also, while only distantly related, as I've pointed out earlier, HTTP is
currently missing a mechanism to let servers invalidate a group of cache
entries, for example during logout. I have suggested such a cache context
mechanism in a draft (the most recent version is currently expired, but I
am planning to refresh it; the most recent version is available at
http://my.opera.com/yngve/blog/2008/11/06/refreshed-internet-drafts) .
--
Sincerely,
Yngve N. Pettersen
********************************************************************
Senior Developer Email: yngve@opera.com
Opera Software ASA http://www.opera.com/
Phone: +47 24 16 42 60 Fax: +47 24 16 40 01
********************************************************************
Received on Monday, 8 June 2009 01:09:42 UTC