Re: Server Push and Caching from Roy T. Fielding on 2016-09-11 (ietf-http-wg@w3.org from July to September 2016)

From: Roy T. Fielding <fielding@gbiv.com>
Date: Sun, 11 Sep 2016 06:12:14 -0700
To: Mark Nottingham <mnot@mnot.net>
Cc: Tom Bergan <tombergan@chromium.org>, HTTP Working Group <ietf-http-wg@w3.org>
Message-Id: <130AA43A-9C83-41DE-9D84-D82F74EF81EC@gbiv.com>
> On Sep 7, 2016, at 5:06 PM, Mark Nottingham <mnot@mnot.net> wrote:
> 
>> On 8 Sep 2016, at 3:22 AM, Roy T. Fielding <fielding@gbiv.com> wrote:
> 
>>>>> Note that HTTP does not put constraints on _how_ the application uses that response after it comes through the API or the cache; it might use it multiple times (e.g., an image might occur more than once on a page, or more than one downstream client might have made the request). It's just that this reuse isn't in the context of an HTTP cache's operation.
>>> 
>>> You're correct that an HTTP *client* isn't required to revalidate a response, but a cache is.
>> 
>> A cache isn't required to revalidate.  Only a client revalidates, and only
>> when it wants to do so.  A cache never makes requests.  A cache is only required
>> to mark the response as stale.
> 
> From previous discussions, I know that's your view, and I think it's internally consistent. I'm less convinced that view is shared by implementations, or even the specs.
> 
> RFC 7234, Section 4: "A cache that does not have a clock available MUST NOT use stored responses without revalidating them upon every use."

Yes, that spec plays loose with the terminology.  Clients make requests.
In some cases, a cache contains a client. In other cases, a cache just
tells the calling client what it currently contains, and is updated as a
side-effect of whatever responses are received.
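
As an illustration of the second arrangement (a cache that only reports
what it holds and is updated as a side effect), here is a minimal Python
sketch. The names PassiveCache, Entry, and Lookup are hypothetical and the
freshness check is deliberately simplified; nothing here comes from RFC 7234
itself.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Entry:
    body: bytes
    stored_at: float           # reading of the cache's clock when stored
    freshness_lifetime: float  # seconds, e.g. derived from max-age


@dataclass
class Lookup:
    body: bytes
    stale: bool


class PassiveCache:
    """A store that never makes requests; it only reports what it holds."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: Dict[str, Entry] = {}

    def update(self, url: str, body: bytes, freshness_lifetime: float) -> None:
        # Called by the client as a side effect of a response it received.
        self._entries[url] = Entry(body, self._clock(), freshness_lifetime)

    def lookup(self, url: str) -> Optional[Lookup]:
        entry = self._entries.get(url)
        if entry is None:
            return None
        age = self._clock() - entry.stored_at
        # The cache only marks the entry stale; the caller decides whether
        # to revalidate, reuse, or discard it.
        return Lookup(entry.body, stale=age >= entry.freshness_lifetime)
```

In this sketch the decision to revalidate lives entirely in the calling
client, which is the division of labor described above.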

BTW, I haven't seen any evidence of that requirement being enforced, at least
for caches that have an interval timer instead of a wall clock.  They just
count the age.
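
A rough sketch of what "just count the age" could look like with only an
interval timer, in Python. AgeCounter is a hypothetical name, and this
simplifies the age calculation in RFC 7234 Section 4.2.3 (it ignores the
Date header and response delay); it assumes the only inputs are a received
Age value and a monotonic timer.

```python
import time
from typing import Callable


class AgeCounter:
    """Tracks a stored response's age without consulting a wall clock."""

    def __init__(self, initial_age: float = 0.0,
                 timer: Callable[[], float] = time.monotonic) -> None:
        self._timer = timer
        self._initial_age = initial_age  # e.g. the received Age header value
        self._started = timer()          # timer reading when stored

    def current_age(self) -> float:
        # Age on arrival plus the time the response has been resident.
        return self._initial_age + (self._timer() - self._started)
```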

> Section 4.2.4: "A cache MUST NOT generate a stale response if it is prohibited by an explicit in-protocol directive (e.g., by a "no-store" or "no-cache" cache directive, a "must-revalidate" cache-response-directive, or an applicable "s-maxage" or "proxy-revalidate" cache-response-directive; see Section 5.2.2)."

That's inconsistent -- it should be "MUST NOT use a stale response".  The reason
the spec has this requirement is that a cache MAY use a stale response
if it is not prohibited by those explicit directives.
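
The rule quoted from Section 4.2.4 can be sketched as a predicate over the
response's Cache-Control value. This is a hedged Python illustration:
may_use_stale and directive_names are hypothetical helpers, the parser splits
naively on commas (so quoted directive arguments containing commas are not
handled), and request directives are ignored.

```python
from typing import Set

# Directives named in the Section 4.2.4 text quoted above.
ALWAYS_PROHIBIT = {"no-store", "no-cache", "must-revalidate"}
SHARED_ONLY_PROHIBIT = {"s-maxage", "proxy-revalidate"}


def directive_names(cache_control: str) -> Set[str]:
    """Return the lowercased directive names in a Cache-Control value."""
    names = set()
    for part in cache_control.split(","):
        name = part.split("=", 1)[0].strip().lower()
        if name:
            names.add(name)
    return names


def may_use_stale(cache_control: str, shared_cache: bool) -> bool:
    """True unless an explicit directive prohibits using a stale response."""
    names = directive_names(cache_control)
    if names & ALWAYS_PROHIBIT:
        return False
    # s-maxage and proxy-revalidate only apply to shared caches.
    if shared_cache and names & SHARED_ONLY_PROHIBIT:
        return False
    return True
```

Whether a cache actually uses a stale response when this returns True is
left to its context and configuration.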

> Section 4.3.2: "When a cache decides to revalidate its own stored responses for a request..."

Should be "When a cache revalidates a stored response ..."

> Section 5.2.2.1: "The "must-revalidate" response directive indicates that once it has become stale, a cache MUST NOT use the response to satisfy subsequent requests without successful validation on the origin server."

That's fine.

> Section 5.5.2: "A cache SHOULD generate this when sending a stale response because an attempt to validate the response failed, due to an inability to reach the server."

Which wrongly assumes that a cache is a server (as does 4.2.4).

> 2616 contains much the same language.

Let's just agree not to go there.

> 
> Cheers,
> 
> --
> Mark Nottingham   https://www.mnot.net/

So, hold for document update?

I don't understand why this is even an argument.  RFC 7234 claims that
it is specifying HTTP caching.  We know that caches appear inside all
forms of HTTP components (user agents, proxies, gateways, origin servers)
and in a variety of non-HTTP components (ISPs, captive portals, etc.).
We know that caches are often configured to be less than semantically
transparent.  We cannot seriously define "HTTP cache" to be only those
caches that limit stale reuse to those in section 4.2.4, as if the
components people call a "cache" in real life magically rename themselves
whenever they don't adhere to those requirements.

I stick to my internally consistent views, instead of relying on the RFC,
because the RFC is not currently capable of describing how a user agent
works with its own cache.  For example, see

  https://developer.mozilla.org/en-US/docs/Web/HTTP/Caching

Likewise, semantic transparency isn't sufficient to encompass the configurations
of any performance-enhancing cache, such as Squid and Apache Traffic Server.

I think we should just fix section 4.2.4 to not specify things in client/server
terms (rather, a cache adds to or removes from a stored response) and to allow
for cache behavior regarding stale responses to be based on context and
configuration.

Cheers,

....Roy
Received on Sunday, 11 September 2016 13:12:39 UTC