RE: Query string cacheability from Eric Lawrence on 2010-05-19 (ietf-http-wg@w3.org from April to June 2010)

From: Eric Lawrence <ericlaw@exchange.microsoft.com>
Date: Wed, 19 May 2010 19:10:03 +0000
To: Julian Reschke <julian.reschke@gmx.de>, Mark Nottingham <mnot@mnot.net>
CC: HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <479CAD406474484E8FA0E39E694732C0083298@DF-M14-03.exchange.corp.microsoft.com>

FWIW, my research shows that the current versions of most browsers do not meet this requirement. For a resource that is served with no explicit lifetime information but whose URL contains a query string:

- Firefox will conditionally request/revalidate for LINK HREF (e.g. CSS) and SCRIPT SRC tags. For IMG tags, Firefox appears to revalidate the resource only once per browser session.

- Internet Explorer 8 and below revalidate such resources once per browser session, regardless of the context in which the resource is used.

- Chrome and Opera appear to ignore the query string and reuse the cached resource without validation, both during navigation and across browser restarts.

- Safari for Windows does not revalidate a page's resources during hyperlink navigation. However, it does not appear to cache heuristically cacheable content across multiple browser sessions. Safari 4.0.5 appears to always unconditionally re-request the direct target of a navigation, regardless of whether the resource was delivered with headers indicating it was still fresh.
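
For reference, the check the proposed text asks a cache to make before reusing such a response can be sketched roughly as below. This is only an illustration of the rule as worded in the quoted proposal, not how any of the browsers above implement it. The function names and the dict-of-headers representation are my own assumptions, and the test for explicit lifetime information is deliberately simplified (Expires, max-age and s-maxage only).

```python
# Status codes for which the proposed text permits heuristic freshness.
HEURISTIC_STATUS_CODES = {200, 203, 206, 300, 301, 410}


def has_explicit_expiration(headers):
    """Return True if the response carries explicit lifetime information.

    Simplified assumption: only Expires, max-age and s-maxage are considered.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    if "expires" in lowered:
        return True
    directives = [d.strip().lower() for d in lowered.get("cache-control", "").split(",")]
    return any(d.startswith(("max-age=", "s-maxage=")) for d in directives)


def has_query_component(url):
    """Return True if the URL has a "?" before any fragment."""
    return "?" in url.split("#", 1)[0]


def may_use_heuristic_freshness(url, status, headers):
    """Decide whether a heuristic expiration time may be calculated."""
    if has_explicit_expiration(headers):
        return False  # the explicit lifetime applies; no heuristic is needed
    if status not in HEURISTIC_STATUS_CODES:
        return False  # heuristics MUST NOT be used for other status codes
    if has_query_component(url):
        return False  # proposed rule: no heuristic freshness for query URLs
    return True
```

Under this reading, a stored 200 response for "style.css?v=2" with no Expires or Cache-Control header would have to be revalidated before reuse, which is the behavior most of the browsers listed above do not show.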

Eric Lawrence
IE Program Management

-----Original Message-----
From: ietf-http-wg-request@w3.org [mailto:ietf-http-wg-request@w3.org] On Behalf Of Julian Reschke
Sent: Wednesday, May 19, 2010 6:49 AM
To: Mark Nottingham
Cc: HTTP Working Group
Subject: Re: Query string cacheability

On 19.05.2010 14:31, Mark Nottingham wrote:
> One of the things that I did in the big caching rewrite was to remove the text about the effect of query strings on cacheability:
>
>> Section 13.9
> [...]
>>
>>     We note one exception to this rule: since some applications have
>>     traditionally used GETs and HEADs with query URLs (those containing a
>>     "?" in the rel_path part) to perform operations with significant side
>>     effects, caches MUST NOT treat responses to such URIs as fresh unless
>>     the server provides an explicit expiration time. This specifically
>>     means that responses from HTTP/1.0 servers for such URIs SHOULD NOT
>>     be taken from a cache.
>
> replacing it with, in p6 2.3.1.1:
>
>>     [[REVIEW-query-string-heuristics: took away HTTP/1.0 query string
>>     heuristic uncacheability.]]
>
> Looking at this with somewhat fresh (but also a bit sleepy) eyes, I think we can re-introduce this text, but wonder if we need the last sentence; it's somewhat of a non sequitur, AFAICT, since RFC 1945 had Expires to determine an explicit expiration time, and anyway it should probably say "origin server," which as discussed before is sometimes difficult to tell, given the lack of Via support in many intermediaries.
>
> I propose we address this by changing the beginning of 2.3.1.1 to:
>
> """
>     If no explicit expiration time is present in a stored response that
>     has a status code of 200, 203, 206, 300, 301 or 410, a heuristic
>     expiration time can be calculated.  Heuristics MUST NOT be used for
>     other response status codes.
>
>     Also, heuristic freshness MUST NOT be used for responses
>     to requests with a query component, because
>     some applications have traditionally used queries on URLs to
>     perform operations with significant side effects.
>
>     [ remaining paragraphs as in -09]
> """
>
> Thoughts?

Sounds good to me.

Maybe replace

"some applications have traditionally used queries on URLs to perform operations with significant side effects"

with

"some historic, non-compliant applications have implemented non-safe operations in this case"

(points being: what's in error is the server, and it always has been a compliance issue, no?)

Best regards, Julian

Received on Wednesday, 19 May 2010 19:10:38 UTC