Re: SHOULD-level requirements in p6-caching

On 07.04.2011 10:19, Mark Nottingham wrote:
>
> I'm starting to survey the SHOULD-level requirements, as per <http://trac.tools.ietf.org/wg/httpbis/trac/ticket/271>, with an initial focus on p6.
>
> I've put some specific comments below, but underlying them are a few premises which I'd like to hear people's opinions on.
>
> * SHOULD-level requirements are to be avoided where possible; MUST is better for interoperability, and
>
> * Where we have SHOULD-level requirements, the exceptional circumstances (as per the 2119 definition of SHOULD) should usually be explicitly enumerated, unless there is an obvious interpretation, and
>
> * SHOULD is sometimes used as a "get-out clause" for cases where there is an expected behaviour, but where other circumstances may require a different behaviour. In principle I'd like to avoid this, as it sends a mixed message. Not sure if it's possible in practice, however. And,

...like when we know that it should be MUST NOT, but implementations are 
known not to do this?

> * The added meaning of SHOULD as a metric for "conditional conformance" isn't that useful, isn't defined in 2119, and we should consider dropping it.
>
>
> If we can get agreement to these, a few proposals for p6 follow.
>
> In 2.2:
>
>>     A cache, especially a shared cache, SHOULD use a mechanism, such as NTP
>>     [RFC1305], to synchronize its clock with a reliable external
>>     standard.
>
> Similarly, we later see:
>
>>     A cache SHOULD use NTP ([RFC1305]) or some similar protocol to synchronize its clocks to a globally accurate time standard.
>
> I think these are fine uses of RFC2119 SHOULD language; no need for a change. I.e., an administrator who carefully considers the interoperability issues
> can still run a cache without NTP.

Yes.

> In 2.3.1.1,
>
>>     Also, if the response has a Last-Modified header field (Section 6.6
>>     of [Part4]), a cache SHOULD NOT use a heuristic expiration value that
>>     is more than some fraction of the interval since that time.  A
>>     typical setting of this fraction might be 10%.
>
> This is a pretty meaningless requirement, as heuristic freshness is already a huge get-out clause in HTTP caching; you can just say that you don't base your heuristic on Last-Modified, and choose any value.
>
> I propose reducing this from a requirement to a prose advisory, e.g., "caches are expected to..."
>
> There may be a separate issue on how wide-open heuristic expiry is here.

Generally, +1 on avoiding RFC2119 keywords unless we really need them 
("In particular, they MUST only be used where it is actually required 
for interoperation or to limit behavior which has potential for causing 
harm (e.g., limiting retransmisssions)").
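
For reference, the heuristic quoted from 2.3.1.1 amounts to something
like the following. This is only a rough sketch: the function name and
its parameters are made up for illustration, and the 10% default is
simply the "typical setting" mentioned in the quoted text.

```python
from datetime import datetime, timedelta, timezone


def heuristic_freshness_lifetime(date, last_modified, fraction=0.10):
    """Cap a heuristic freshness lifetime at a fraction of the
    interval between Last-Modified and the response's Date.

    Hypothetical helper; 'fraction' defaults to the 10% quoted above.
    """
    interval = date - last_modified
    if interval < timedelta(0):
        # Last-Modified lies in the future: do not guess a lifetime.
        return timedelta(0)
    return interval * fraction


# A response last modified ten days before it was generated.
date = datetime(2011, 4, 8, 8, 0, 0, tzinfo=timezone.utc)
last_modified = date - timedelta(days=10)
lifetime = heuristic_freshness_lifetime(date, last_modified)
```

A cache that bases its heuristic on something other than Last-Modified
never calls such a function at all, which is the get-out clause Mark
describes.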

> In 2.3.3,
>
>>     A cache SHOULD NOT return stale responses unless it is disconnected
>>     (i.e., it cannot contact the origin server or otherwise find a
>>     forward path) or otherwise explicitly allowed (e.g., the max-stale
>>     request directive; see Section 3.2.1).
>
> I think this should be a MUST NOT for clarity. Thoughts?
>
>
> Also in 2.3.3,
>
>>     If a cache receives a first-hand response (either an entire response,
>>     or a 304 (Not Modified) response) that it would normally forward to
>>     the requesting client, and the received response is no longer fresh,
>>     the cache SHOULD forward it to the requesting client without adding a
>>     new Warning (but without removing any existing Warning header
>>     fields).
>
>
> This seems like it should be non-normative prose, not a requirement.
>
>
> Right below that,
>
>> A cache SHOULD NOT attempt to validate a response simply
>>     because that response became stale in transit.
>
>
> Prose.

As in "rewrite without RFC2119"? Yes.

> In 2.4,
>
>>     When sending such a conditional request, a cache SHOULD add an If-
>>     Modified-Since header field whose value is that of the Last-Modified
>>     header field from the selected (see Section 2.7) stored response, if
>>     available.
>
> Prose.
>
>
> Below that,
>
>>     Additionally, a cache SHOULD add an If-None-Match header field whose
>>     value is that of the ETag header field(s) from all responses stored
>>     for the requested URI, if present.  However, if any of the stored
>>     responses contains only partial content, the cache SHOULD NOT include
>>     its entity-tag in the If-None-Match header field unless the request
>>     is for a range that would be fully satisfied by that stored response.
>
> The first SHOULD would work better as prose, IMO. The second one is more debatable, but if it's a requirement, I'd think it'd be more clear as a MUST.

I think this just explains how a cache can implement something. It
would be totally sufficient to have prose explaining the best way
to do this.
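
To make the quoted 2.4 text concrete, here is a minimal sketch of how a
cache might assemble the validators. The dict layout, the "partial"
flag, and the function and parameter names are all assumptions made for
illustration, not anything defined by p6.

```python
def build_validators(selected, stored, range_satisfied_by=None):
    """Build conditional request header fields from stored responses.

    selected: the stored response chosen for the request (a dict).
    stored: all stored responses for the requested URI (dicts).
    range_satisfied_by: the partial stored response, if any, that
        would fully satisfy the requested range (hypothetical input).
    """
    headers = {}
    last_modified = selected.get("Last-Modified")
    if last_modified:
        headers["If-Modified-Since"] = last_modified

    etags = []
    for response in stored:
        etag = response.get("ETag")
        if not etag:
            continue
        # Leave out entity-tags of partial content unless the request
        # is for a range that this stored response fully satisfies.
        if response.get("partial") and response is not range_satisfied_by:
            continue
        etags.append(etag)
    if etags:
        headers["If-None-Match"] = ", ".join(etags)
    return headers


full = {"ETag": '"v1"', "Last-Modified": "Thu, 07 Apr 2011 10:19:00 GMT"}
partial = {"ETag": '"v2"', "partial": True}
validators = build_validators(full, [full, partial])
```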


> Further down,
>
>>     A full response (i.e., one with a response body) indicates that none
>>     of the stored responses nominated in the conditional request is
>>     suitable.  Instead, a cache SHOULD use the full response to satisfy
>>     the request and MAY replace the stored response.
>
>
> I think this SHOULD is more appropriate as non-normative prose.

We should think about the MAYs as well :-)

> In 2.5,
>
>>     A cache that passes through requests with methods it does not
>>     understand SHOULD invalidate the effective request URI (Section 4.3
>>     of [Part1]).
>
> I'm not sure why this is a SHOULD when all of the other invalidation side effects are MUST-level requirements. Can we raise this to a MUST as well?

If this doesn't result in anything a client can reliably observe then 
I'm not even sure it should be a requirement. Requests can go through 
different paths, so just because one cache invalidates something, it 
doesn't mean all of them do, no...?

> In 3.1,
>
>>     Recipients parsing the Age header field-value SHOULD use an
>>     arithmetic type of at least 31 bits of range.
>
> It seems to me that interop requires a MUST here.

+1.
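
As an illustration of why the range matters, a recipient might parse
Age roughly as below. This is a sketch under assumptions: the function
name is made up, and clamping to 2^31 is borrowed from the overflow
rule RFC 2616 gives for transmitting Age, which may not be how p6 ends
up wording it.

```python
MAX_AGE = 2**31  # 2147483648, the overflow value RFC 2616 uses for Age


def parse_age(field_value):
    """Parse an Age field-value (delta-seconds) into an int.

    Returns None for anything that is not a plain run of ASCII digits,
    and clamps oversized values to MAX_AGE (assumed behaviour).
    """
    text = field_value.strip()
    if not (text.isascii() and text.isdigit()):
        return None
    return min(int(text), MAX_AGE)
```

An implementation using a 16-bit type would wrap around after about 18
hours, which is the kind of interop problem a MUST would rule out.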

> In 3.2.1 (only-if-cached),
>
>>        If it receives this
>>        directive, a cache SHOULD either respond using a stored response
>>        that is consistent with the other constraints of the request, or
>>        respond with a 504 (Gateway Timeout) status code.
>
> MUST?

Borderline. It just affects performance, right?
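
For clarity, the behaviour in question is roughly the following. This
is a sketch only; the function name, the tuple used for a response, and
the simple comma split of Cache-Control are all simplifying assumptions.

```python
def handle_only_if_cached(request_cache_control, stored_response):
    """Decide how to answer a request carrying only-if-cached.

    Returns None when the directive is absent, so the caller carries
    on with normal processing (hypothetical calling convention).
    """
    directives = {d.strip().lower() for d in request_cache_control.split(",")}
    if "only-if-cached" not in directives:
        return None
    if stored_response is not None:
        # Assumed to already satisfy the request's other constraints.
        return stored_response
    # Nothing suitable stored: answer 504 rather than go upstream.
    return (504, "Gateway Timeout")
```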

> In 3.2.2 (must-revalidate),
>
>>        A server SHOULD send the must-revalidate directive if and only if
>>        failure to validate a request on the representation could result
>>        in incorrect operation, such as a silently unexecuted financial
>>        transaction.
>
> Prose.

Yes.

> In 3.3,
>
>>     A server SHOULD NOT send Expires dates more than one year in the
>>     future.
>
> Prose.

Yes. BTW: it would be good to explain why this is a problem.

>
> In 3.4,
>
>>     When the no-cache directive is present in a request message, a cache
>>     SHOULD forward the request toward the origin server even if it has a
>>     stored copy of what is being requested.
>
> Prose.

Isn't that a case for MUST?

>> A client SHOULD include both header fields when a no-cache
>>     request is sent to a server not known to be HTTP/1.1 compliant.
>
> Prose.

Yes.

> In 3.5,
>
>>     A server SHOULD include a Vary header field with any cacheable
>>     response that is subject to server-driven negotiation.
>
> I can't decide if this needs to be a requirement; if it does, I think it should be a MUST; if not, it should be prose. Thoughts?
> ...

This is a case where it's the right thing to do but it's not done 
because of the negative effects on some UAs (like some versions of IE 
not even being able to forward to a helper application). Maybe this is a 
case where we need to add a lot more text to explain the problem.

BR, Julian

Received on Friday, 8 April 2011 08:49:59 UTC