SHOULD-level requirements in p6-caching from Mark Nottingham on 2011-04-07 (ietf-http-wg@w3.org from April to June 2011)

From: Mark Nottingham <mnot@mnot.net>
Date: Thu, 7 Apr 2011 18:19:51 +1000
To: HTTP Working Group <ietf-http-wg@w3.org>
Message-Id: <90400372-C89F-4E9C-92F6-D8F1A6AAD631@mnot.net>

I'm starting to survey the SHOULD-level requirements, as per <http://trac.tools.ietf.org/wg/httpbis/trac/ticket/271>, with an initial focus on p6.

I've put some specific comments below, but underlying them are a few premises which I'd like to hear people's opinions on.

* SHOULD-level requirements are to be avoided where possible; MUST is better for interoperability.

* Where we have SHOULD-level requirements, the exceptional circumstances (as per the RFC 2119 definition of SHOULD) should usually be explicitly enumerated, unless there is an obvious interpretation.

* SHOULD is sometimes used as a "get-out clause" for cases where there is an expected behaviour, but where other circumstances may require a different behaviour. In principle I'd like to avoid this, as it sends a mixed message, though I'm not sure that's possible in practice.

* The added meaning of SHOULD as a metric for "conditional conformance" isn't that useful, isn't defined in RFC 2119, and we should consider dropping it.


If we can get agreement on these, a few proposals for p6 follow.

In 2.2:

>    A cache, especially a shared cache, SHOULD use a mechanism, such as NTP
>    [RFC1305], to synchronize its clock with a reliable external
>    standard.

Similarly, we later see:

>    A cache SHOULD use NTP ([RFC1305]) or some similar protocol to synchronize its clocks to a globally accurate time standard.

I think these are fine uses of RFC2119 SHOULD language; no need for a change. That is, an administrator who carefully considers the interoperability issues can still run a cache without NTP.


In 2.3.1.1,

>    Also, if the response has a Last-Modified header field (Section 6.6
>    of [Part4]), a cache SHOULD NOT use a heuristic expiration value that
>    is more than some fraction of the interval since that time.  A
>    typical setting of this fraction might be 10%.

This is a pretty meaningless requirement, as heuristic freshness is already a huge get-out clause in HTTP caching; you can just say that you don't base your heuristic on Last-Modified, and choose any value. 

I propose reducing this from a requirement to a prose advisory, e.g., "caches are expected to..."

There may be a separate issue on how wide-open heuristic expiry is here.
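
A minimal sketch of the Last-Modified heuristic quoted above, assuming the 10% fraction the draft mentions as a typical setting. The function name and the use of the Date header as the reference point are assumptions for illustration. As noted above, nothing obliges a cache to base its heuristic on Last-Modified at all.

```python
from datetime import datetime, timedelta, timezone


def heuristic_freshness_lifetime(date, last_modified, fraction=0.1):
    """Return a heuristic freshness lifetime in seconds.

    The lifetime is `fraction` of the interval between Last-Modified
    and Date. The name and signature are hypothetical.
    """
    interval = (date - last_modified).total_seconds()
    if interval <= 0:
        # Last-Modified is not earlier than Date: no usable interval.
        return 0.0
    return interval * fraction


# A response last modified 10 days before it was generated.
date = datetime(2011, 4, 7, 8, 0, 0, tzinfo=timezone.utc)
last_modified = date - timedelta(days=10)
lifetime = heuristic_freshness_lifetime(date, last_modified)
print(f"heuristic lifetime: about {lifetime / 3600:.1f} hours")
```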


In 2.3.3,

>    A cache SHOULD NOT return stale responses unless it is disconnected
>    (i.e., it cannot contact the origin server or otherwise find a
>    forward path) or otherwise explicitly allowed (e.g., the max-stale
>    request directive; see Section 3.2.1).

I think this should be a MUST NOT for clarity. Thoughts?
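
For concreteness, the rule quoted above can be sketched as a small predicate. This is a simplification under stated assumptions: the function and parameter names are hypothetical, staleness is given in seconds past the freshness lifetime, and other directives that affect stale serving (such as must-revalidate) are not modelled.

```python
def may_serve_stale(staleness, disconnected=False, max_stale=None):
    """Decide whether a stored response may be returned.

    staleness:    seconds past the freshness lifetime (<= 0 means fresh).
    disconnected: True if the cache cannot reach the origin server.
    max_stale:    None if the request carried no max-stale directive,
                  float("inf") if max-stale had no value, otherwise the
                  number of seconds of staleness the client accepts.
    """
    if staleness <= 0:
        return True  # still fresh, so the rule does not apply
    if disconnected:
        return True  # no forward path to the origin server
    if max_stale is not None and staleness <= max_stale:
        return True  # explicitly allowed by the request
    return False


print(may_serve_stale(30))                # stale, connected, not allowed
print(may_serve_stale(30, max_stale=60))  # stale, but within max-stale
```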


Also in 2.3.3,

>    If a cache receives a first-hand response (either an entire response,
>    or a 304 (Not Modified) response) that it would normally forward to
>    the requesting client, and the received response is no longer fresh,
>    the cache SHOULD forward it to the requesting client without adding a
>    new Warning (but without removing any existing Warning header
>    fields).


This seems like it should be non-normative prose, not a requirement.


Right below that,

>    A cache SHOULD NOT attempt to validate a response simply
>    because that response became stale in transit.


Prose.


In 2.4,

>    When sending such a conditional request, a cache SHOULD add an If-
>    Modified-Since header field whose value is that of the Last-Modified
>    header field from the selected (see Section 2.7) stored response, if
>    available.

Prose.


Below that,

>    Additionally, a cache SHOULD add an If-None-Match header field whose
>    value is that of the ETag header field(s) from all responses stored
>    for the requested URI, if present.  However, if any of the stored
>    responses contains only partial content, the cache SHOULD NOT include
>    its entity-tag in the If-None-Match header field unless the request
>    is for a range that would be fully satisfied by that stored response.

The first SHOULD would work better as prose, IMO. The second one is more debatable, but if it's a requirement, I'd think it'd be clearer as a MUST.
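
A rough sketch of how a cache might assemble the validators described in the two quoted passages from 2.4. The StoredResponse type, its fields, and both function names are hypothetical, and ranges are modelled as simple (first, last) byte pairs.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class StoredResponse:
    etag: Optional[str] = None           # entity-tag as received, e.g. '"abc"'
    last_modified: Optional[str] = None  # HTTP-date string as received
    # (first, last) byte positions if only partial content is stored,
    # None if the whole representation is stored.
    stored_range: Optional[Tuple[int, int]] = None


def covers(stored, requested_range):
    """True if `stored` could fully satisfy the request."""
    if stored.stored_range is None:
        return True   # complete response
    if requested_range is None:
        return False  # partial content cannot satisfy a full request
    first, last = stored.stored_range
    return first <= requested_range[0] and requested_range[1] <= last


def conditional_headers(selected, all_stored, requested_range=None):
    """Build the validator header fields for a conditional request."""
    headers = {}
    if selected.last_modified:
        headers["If-Modified-Since"] = selected.last_modified
    # Skip entity-tags of partial responses that would not fully
    # satisfy the requested range.
    etags = [r.etag for r in all_stored
             if r.etag and covers(r, requested_range)]
    if etags:
        headers["If-None-Match"] = ", ".join(etags)
    return headers


full = StoredResponse(etag='"a"',
                      last_modified="Wed, 06 Apr 2011 10:00:00 GMT")
partial = StoredResponse(etag='"b"', stored_range=(0, 99))
print(conditional_headers(full, [full, partial]))
```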


Further down,

>    A full response (i.e., one with a response body) indicates that none
>    of the stored responses nominated in the conditional request is
>    suitable.  Instead, a cache SHOULD use the full response to satisfy
>    the request and MAY replace the stored response.


I think this SHOULD is more appropriate as non-normative prose. 


In 2.5, 

>    A cache that passes through requests with methods it does not
>    understand SHOULD invalidate the effective request URI (Section 4.3
>    of [Part1]).

I'm not sure why this is a SHOULD when all of the other invalidation side effects are MUST-level requirements. Can we raise this to a MUST as well?
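
To illustrate the behaviour in question, here is a minimal sketch assuming a cache modelled as a plain dict keyed by effective request URI. The set of understood methods and the function name are assumptions, and "invalidate" is simplified to removing the stored entry.

```python
# Methods this hypothetical cache understands (an assumption).
KNOWN_METHODS = {"GET", "HEAD", "POST", "PUT", "DELETE",
                 "OPTIONS", "TRACE", "CONNECT"}


def invalidate_on_unknown_method(cache, method, effective_request_uri):
    """Drop the stored entry for the URI when an unknown method passes through.

    Returns True if the method was unknown (and invalidation was attempted).
    """
    if method in KNOWN_METHODS:  # method names are case-sensitive
        return False
    cache.pop(effective_request_uri, None)
    return True


cache = {"http://example.com/a": "stored response"}
invalidate_on_unknown_method(cache, "FROB", "http://example.com/a")
print(cache)  # the entry for /a has been dropped
```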


In 3.1, 

>    Recipients parsing the Age header field-value SHOULD use an
>    arithmetic type of at least 31 bits of range.

It seems to me that interop requires a MUST here.
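
A small sketch of what the arithmetic requirement implies for a parser. Python integers are unbounded, so the 31-bit range is modelled by saturating at 2**31, the value RFC 2616 (section 14.6) says to transmit on overflow. Treating malformed values as absent is an assumption, as is the function name.

```python
AGE_OVERFLOW = 2**31  # 2147483648, the overflow value named in RFC 2616


def parse_age(field_value):
    """Parse an Age field-value (delta-seconds) with saturation.

    Returns an int in the range 0..2**31, or None if the value is not
    a non-negative decimal integer (a hypothetical handling choice).
    """
    text = field_value.strip()
    if not (text.isascii() and text.isdigit()):
        return None
    return min(int(text), AGE_OVERFLOW)


print(parse_age("120"))
print(parse_age("99999999999"))  # saturates instead of overflowing
```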


In 3.2.1 (only-if-cached),

>       If it receives this
>       directive, a cache SHOULD either respond using a stored response
>       that is consistent with the other constraints of the request, or
>       respond with a 504 (Gateway Timeout) status code.

MUST?
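
The two permitted outcomes for only-if-cached can be sketched as follows. The dict-based response shape, the function name, and the `satisfies_request` callback (standing in for "consistent with the other constraints of the request") are all hypothetical.

```python
def handle_only_if_cached(stored_response, satisfies_request):
    """Answer an only-if-cached request without contacting the origin.

    Either return a suitable stored response or a 504 (Gateway Timeout).
    """
    if stored_response is not None and satisfies_request(stored_response):
        return stored_response
    return {"status": 504, "reason": "Gateway Timeout", "body": b""}


stored = {"status": 200, "reason": "OK", "body": b"hello", "fresh": True}
print(handle_only_if_cached(stored, lambda r: r["fresh"])["status"])
print(handle_only_if_cached(None, lambda r: True)["status"])
```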


In 3.2.2 (must-revalidate),

>       A server SHOULD send the must-revalidate directive if and only if
>       failure to validate a request on the representation could result
>       in incorrect operation, such as a silently unexecuted financial
>       transaction.

Prose.


In 3.3,

>    A server SHOULD NOT send Expires dates more than one year in the
>    future.

Prose.
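
As an illustration of the advice on Expires, a server could clamp the date it sends. This sketch approximates "one year" as 365 days, which is an assumption, and the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

ONE_YEAR = timedelta(days=365)  # assumption: "one year" taken as 365 days


def clamp_expires(now, desired_expires):
    """Return an Expires date no more than one year after `now`."""
    return min(desired_expires, now + ONE_YEAR)


now = datetime(2011, 4, 7, 8, 0, 0, tzinfo=timezone.utc)
print(clamp_expires(now, now + timedelta(days=800)))
```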


In 3.4,

>    When the no-cache directive is present in a request message, a cache
>    SHOULD forward the request toward the origin server even if it has a
>    stored copy of what is being requested.

Prose.


>    A client SHOULD include both header fields when a no-cache
>    request is sent to a server not known to be HTTP/1.1 compliant.

Prose.


In 3.5,

>    A server SHOULD include a Vary header field with any cacheable
>    response that is subject to server-driven negotiation.

I can't decide if this needs to be a requirement; if it does, I think it should be a MUST; if not, it should be prose. Thoughts?

I've omitted a number of Warning-related SHOULDs, as I think they need to be examined separately. 


--
Mark Nottingham   http://www.mnot.net/
Received on Thursday, 7 April 2011 08:20:19 UTC