Re: httpbis-p6-cache-06 and no-store response directive

From: Adrien de Croy <adrien@qbik.com>
Date: Mon, 08 Jun 2009 15:01:55 +1200
Message-ID: <4A2C7F23.3050409@qbik.com>
To: Mark Nottingham <mnot@mnot.net>
CC: "Yngve N. Pettersen (Developer Opera Software ASA)" <yngve@opera.com>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>

There is also the question of what to do with seemingly conflicting 
cache directives.

Combining no-store with any other directive that implies the result can 
be stored is a contradiction.

If you can't store something, how can you revalidate it?  So any 
revalidation directive implies that the result may be stored.

The safe option is to honour no-store; however, that dishonours the 
other directives.
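
As a rough illustration of the "honour no-store" option, here is a 
minimal Python sketch. The function names, and the set of directives 
treated as implying storage, are my own assumptions for the example 
and are not taken from RFC 2616 or the p6-cache draft.

```python
# Hypothetical sketch: no-store wins over any directive that implies storage.
# The membership of this set is an assumption made for illustration only.
STORAGE_IMPLYING = {
    "no-cache", "must-revalidate", "proxy-revalidate",
    "max-age", "s-maxage", "public",
}


def parse_cache_control(value):
    """Split a Cache-Control value into {directive-name: argument-or-None}.

    Naive parsing: it does not handle commas inside quoted arguments.
    """
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def resolve(value):
    """Return (may_store, overridden).

    may_store is False whenever no-store is present. overridden lists the
    directives that implied storage and are therefore dishonoured.
    """
    directives = parse_cache_control(value)
    may_store = "no-store" not in directives
    overridden = []
    if not may_store:
        overridden = sorted(n for n in directives if n in STORAGE_IMPLYING)
    return may_store, overridden


if __name__ == "__main__":
    header = "no-store, no-cache, must-revalidate, post-check=0, pre-check=0"
    print(resolve(header))
```

Unknown extensions such as pre-check and post-check are parsed but 
otherwise ignored, so they neither permit nor forbid storage here.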

A quick search throws up a few sites expounding the virtues of setting 
no-store, no-cache, pre-check=0 etc. on all responses, even though 
pre-check seems to be a non-standard IE extension.

Perhaps some wording about which ones should be ignored when they 
conflict could be useful in the spec.

Regards

Adrien



Mark Nottingham wrote:
> Hi Yngve,
>
> I think your question should be posed as: what comprises the 
> cache subsystem?
>
> In a browser, the HTML parser component would dispatch requests to a 
> cache, which then either satisfies the requests or forwards them. If 
> an HTML page has three references to <http://example.com/foo.gif>, for 
> example, there are two ways of approaching this problem:
>
> 1) asking the cache for foo.gif once (perhaps by piggybacking the 
> callbacks for subsequent images into the first instance), or
>
> 2) asking the cache for foo.gif three times.
>
> In my reading, #1 is conformant even if the response contains 
> no-store; technically (and probably mostly theoretically), #2 is not.
>
> However, I don't think that matters much.
>
> A much greater concern here is that HTTP authors need stable semantics 
> for cache directives. By "creatively" interpreting them, or trying to 
> guess what the authors intend based upon a survey of Web sites, a 
> great disservice is done; authors will need to begin to second guess 
> what cache vendors do when they encounter different directives. This 
> already happens to a degree (although it's not as bad as it used to 
> be, I think, and hopefully HTTPbis will improve the situation a bit 
> more), but no-store has always been one of the more unambiguous 
> directives.
>
> Please don't dilute it. If sites choose to set it, the only rational 
> choice is to assume that they want you to honour it; indeed, there may 
> even be legal implications here (although IANAL). If it makes their 
> site appear slower, well, they're reaping the benefits -- or 
> detriments -- of what they've done.
>
> As far as establishing new directives goes, you're absolutely right to 
> characterise this as something "beyond HTTPbis"; assuming you want to 
> replace (rather than augment) the current directives, the only effort 
> that could do this would be HTTP/2.0.
>
> Cheers,
>
>
> On 08/06/2009, at 11:08 AM, Yngve N. Pettersen (Developer Opera 
> Software ASA) wrote:
>
>> Hello all,
>>
>> When reading the p6-06 draft it seemed to me that the new phrasing 
>> forbids a client's own cache from using the received response 
>> again under any circumstance, which I think is slightly different 
>> from my interpretation of what RFC 2616 says.
>>
>>
>> RFC 2616 says:
>>
>>  14.9.2
>>
>>   no-store
>>      The purpose of the no-store directive is to prevent the
>>      inadvertent release or retention of sensitive information (for
>>      example, on backup tapes). The no-store directive applies to the
>>      entire message, and MAY be sent either in a response or in a
>>      request. If sent in a request, a cache MUST NOT store any part of
>>      either this request or any response to it. If sent in a response,
>>      a cache MUST NOT store any part of either this response or the
>>      request that elicited it. This directive applies to both non-
>>      shared and shared caches. "MUST NOT store" in this context means
>>      that the cache MUST NOT intentionally store the information in
>>      non-volatile storage, and MUST make a best-effort attempt to
>>      remove the information from volatile storage as promptly as
>>      possible after forwarding it.
>>
>>      Even when this directive is associated with a response, users
>>      might explicitly store such a response outside of the caching
>>      system (e.g., with a "Save As" dialog). History buffers MAY store
>>      such responses as part of their normal operation.
>>
>>      The purpose of this directive is to meet the stated requirements
>>      of certain users and service authors who are concerned about
>>      accidental releases of information via unanticipated accesses to
>>      cache data structures. While the use of this directive might
>>      improve privacy in some cases, we caution that it is NOT in any
>>      way a reliable or sufficient mechanism for ensuring privacy. In
>>      particular, malicious or compromised caches might not recognize or
>>      obey this directive, and communications networks might be
>>      vulnerable to eavesdropping.
>>
>> p6-cache says:
>>
>>  3.2.2
>>
>>   no-store
>>
>>      The no-store response directive indicates that a cache MUST NOT
>>      store any part of either the immediate request or response.  This
>>      directive applies to both non-shared and shared caches.  "MUST NOT
>>      store" in this context means that the cache MUST NOT intentionally
>>      store the information in non-volatile storage, and MUST make a
>>      best-effort attempt to remove the information from volatile
>>      storage as promptly as possible after forwarding it.
>>
>>      This directive is NOT a reliable or sufficient mechanism for
>>      ensuring privacy.  In particular, malicious or compromised caches
>>      might not recognize or obey this directive, and communications
>>      networks may be vulnerable to eavesdropping.
>>
>> To me it seems that the new phrasing forbids the client's own 
>> cache from using the received response even when the resource is 
>> referenced multiple times from the same document, which is common for 
>> some sites using small spacer images or other small icons, or by 
>> multiple documents, like style sheets and images. (The text also 
>> seems to have lost the history reference, though Sec. 4 may make up 
>> for that.)
>>
>> I agree that for proxies the requirement to discard immediately makes 
>> sense.
>>
>> But for clients it is IMO not just a waste of bandwidth (particularly 
>> on performance restricted devices) to reload such resources multiple 
>> times, even for the same document, but it would probably require 
>> significant changes in how clients handle resources. It also 
>> essentially duplicates the no-cache directive in some respects about 
>> reuse, although it does go a little further ("must not reuse"). I'll 
>> remind you of sec 1.1 "Caching would be useless if it did not 
>> significantly improve performance", and the above text will 
>> significantly reduce performance in clients if implemented according 
>> to my current understanding of it, and IMO such a reduction is 
>> unnecessary even from a security perspective.
>>
>> Opera's implementation of this directive since we implemented it has 
>> been "Do not store to filesystem, keep in RAM, discard quickly when 
>> it is no longer in use". Such resources are re-used just like any 
>> other resource in the cache that is not specially treated (as POST 
>> form results are), and if necessary re-validated when expired. The only 
>> difference is that they are not written to the disk cache part of our 
>> caching system (this does not prevent virtual memory swapping from 
>> writing them to disk; other measures are being considered for that, 
>> but that problem applies to all uses of this data, including display).
>>
>> Another aspect of this is that quite a lot of sites, as well as the 
>> default configuration of several Wiki packages, in my experience, 
>> automatically send the no-store directive, along with 
>> must-revalidate, even when there is no need for it.
>>
>> A while back MAMA, our structural web search engine, see 
>> http://dev.opera.com/articles/view/mama/ , did a crawl of the Alexa 
>> top million sites and other sites, and while the crawl was still 
>> underway I asked for a list of sites using the no-store directive.
>>
>> The resulting list contained ~300000 unique sites of over 4 million 
>> URLs scanned (total), of which ~50000 (5%) were on the Alexa list, 
>> some of them quite high on the list. As the scan was not complete, 
>> the actual numbers are probably higher.
>>
>> Examples included these URLs (checked early April) :
>>
>>   http://www.mediabox.fr/
>>   http://wiki.mediabox.fr/
>>   http://www.tayloryourevent.com/
>>   http://joomla-wiki.de/doku.php
>>   http://sourceforge.net/
>>   http://technorati.com/
>>   http://secondlife.com/
>>   http://www.alltheweb.com/
>>   http://babynames.com/
>>   http://broadwayworld.com/article/Photo_Coverage_reasons_to_be_pretty_Opening_Night_Celebration_20000101
>>
>>
>> As you will see, many of these are well known sites, and almost all 
>> of them are front pages, which are unlikely to be sensitive, or 
>> changing very frequently (as in: every few seconds or minutes, and 
>> that could be handled using no-cache).
>>
>> A point about broadwayworld.com : Their *articles* are using
>>
>>    Cache-Control: no-store, no-cache, must-revalidate, post-check=0, 
>> pre-check=0
>>
>> while the *front* pages (which are the dynamic ones) don't.
>>
>> Additionally, the default in PHP (at least my copy of v5.0.4) seems 
>> to be "Cache-Control: no-store, no-cache, must-revalidate, 
>> post-check=0, pre-check=0", and I seem to recall that the MoinMoin 
>> wiki had that, too, at least last year (but in that case I may be 
>> misremembering).
>>
>> Given the extensive use of no-store in situations where it does not 
>> seem necessary, I have started wondering if Opera needs to start 
>> ignoring the no-store directive in non-HTTPS responses, just as we 
>> currently accept must-revalidate (interpreted as re-validate on 
>> history navigation) only for HTTPS responses. No decision has been 
>> reached yet.
>>
>>
>> My recommendation is that the text describing the no-store response 
>> directive be phrased so that all caches are forbidden from storing 
>> the response to non-volatile media and must clear it away ASAP after 
>> use (as it is currently phrased), and so that caches that are not 
>> part of the client MUST NOT use the response when responding to 
>> another request, while *clients* are allowed to use their locally 
>> stored copy for as long as other cache policies permit.
>>
>>
>> Looking forward, past http-bis, given the apparent amount of 
>> misunderstanding about the current cache directives (I receive 
>> regular questions from customers and bug reports claiming that 
>> no-cache means "do not use again", while it only means "revalidate 
>> each time you load the document") I am starting to reach the 
>> conclusion that no-cache, no-store and must-revalidate should be 
>> discarded and replaced with more descriptive names (which should 
>> include the context of when they are to be used), for example, 
>> on-load-revalidate, sensitive-content-storage, 
>> on-navigate-revalidate, respectively, or words to that effect. If a 
>> must-not-reuse indication is needed, then it should also directly say 
>> so, e.g. single-use-response or unique-response.
>>
>> Also, while only distantly related, as I've pointed out earlier, HTTP 
>> is currently missing a mechanism to let servers invalidate a group of 
>> cache entries, for example during logout. I have suggested such a 
>> cache context mechanism in a draft (the most recent version is 
>> currently expired, but I am planning to refresh it; the most recent 
>> version is available at 
>> http://my.opera.com/yngve/blog/2008/11/06/refreshed-internet-drafts) .
>>
>> -- 
>> Sincerely,
>> Yngve N. Pettersen
>>
>> ********************************************************************
>> Senior Developer                     Email: yngve@opera.com
>> Opera Software ASA                   http://www.opera.com/
>> Phone:  +47 24 16 42 60              Fax:    +47 24 16 40 01
>> ********************************************************************
>>
>
>
> -- 
> Mark Nottingham     http://www.mnot.net/
>
>

-- 
Adrien de Croy - WinGate Proxy Server - http://www.wingate.com
Received on Monday, 8 June 2009 02:59:21 GMT
