RE: NEW ISSUE: 13.1.2's Definition of 1xx Warn-Codes from Travis Snoozy (Volt) on 2006-12-27 (ietf-http-wg@w3.org from October to December 2006)

From: Travis Snoozy (Volt) <a-travis@microsoft.com>
Date: Wed, 27 Dec 2006 13:46:26 -0800
To: Joris Dobbelsteen <Joris@familiedobbelsteen.nl>
CC: "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
Message-ID: <86EDC3963F04D546BED8996F77D290F6049D117BA1@NA-EXMSG-C138.redmond.corp.microsoft>

I apologize in advance for the long-windedness of this letter; you might
want to get a cup of tea before you start reading.

Joris Dobbelsteen said:
> Assumption: RFC2616...

Correct. Sorry for not being explicit.

> >(Supporting references at the end of the message)
> >
> >The offending part of section 13.1.2 (Warnings, page 77) reads:
> >
> >   [...]
> >
> >   1xx  Warnings that describe the freshness or revalidation status of
> >     the response, and so MUST be deleted after a successful
> >     revalidation. 1XX [sic] warn-codes MAY be generated by a
> >     cache only when validating a cached entry. It MUST NOT be generated
> >     by clients.
> >
> >   [...]
> >
> >
> >The problems, in order from simplest to the most complex:
> >

<snip>

> >2. The use of MAY in the offending part of 13.1.2 conflicts
> >with the MUST requirements in section 14.46 and the definition of MAY in
> >BCP14.
> >
>
> Invalid.
> Your interpretation is clearly wrong, read it in another way. It is
> consistent. I will comment on the fix.

Then my point is proven: if I can interpret it wrong, it needs to be
rewritten ;) (see also Murphy's Law).

But seriously, according to BCP14: MAY "mean[s] that an item is TRULY
OPTIONAL" (emphasis mine). Clearly, it is not truly optional for 1xx
warn-codes to "be generated by a cache only when validating a cached entry."
Certain 1xx warn-codes MUST be generated by any conformant/conditionally
conformant cache (section 14.46, page 149). "May" is being used correctly in
the English sense, but not in the BCP14 MAY sense.

Now, as to whether my interpretation is right or wrong, *that* is arguable.
I have some examples (later on) that show the completely literal
interpretations of the spec. I did _not_ use the literal interpretation of
the original text, since it seemed very unlikely that that was the actual
intent. However, the literal interpretation *does not* have any conflict
with section 14.46. In any case, I ask that you tell me how -you- interpret
this section; telling me to "read it another way" is totally unhelpful.

> >Proposed fix: "1xx warn-codes MUST NOT be added to any
> >messages except cache entries
>
> Invalid:
> Warnings, including 1xx warn-codes, have multiple purposes and are not
> limited to HTTP caches only (see section 14.46).
> Furthermore, it's perfectly legal for a server to include a 1xx warn-code
> on, perhaps my database refuses and I used the cached information
> instead. A warning could be appropriate in some cases.

1xx codes are almost exclusively cache-related, so I'd say it probably won't
matter in practice (only 110 and 199 make sense for anything but a cache to
generate, and even 110 is iffy, and the warn-code could easily be generated
by the origin server's "caching" subsystem). My new proposal (later) should
sidestep this issue, so it shouldn't really matter either way -- but it's
good that you thought about it (I don't think I did).

> >, and MUST NOT be added to cache entries except in response to a
> >validation attempt." (As a side note, a definition of a cache entry would
> > be nice.)
>
> Valid, but the spec already says this (in the positive sense).

13.1.2 does not actually say this in a positive sense (though -other-
portions of the spec might). What it says is something completely different,
and (I'm guessing) unintentional.

The technical reason that MUST NOT (or a well-crafted MUST) needs to be used
has to do with "may" in English, versus MAY in BCP14. BCP14 effectively
requires authors to emulate the English meaning of "may" in the form of a
MUST or MUST NOT. Following are several examples, starting from the original
wording and moving to my proposed wording.


BAD (original)

"1xx warn-codes MAY be generated by a cache only when validating a cached
entry."

Stated positively, but English and BCP14 conspire to contort the meaning.
This literally means: "A cache can choose to perform 1xx warning generation
for any message [though the cache entry being validated is implied] when
validating a cache entry, and at no other time. A cache can also choose to
generate 1xx warnings for any entry at any other time -- it's totally up to
the implementer how they want to do it." I think that it is *highly
unlikely* that the spec actually means this (but please, speak up if
you think the spec means what it says, here).


BAD

"1xx warn-codes may be generated by a cache only when validating a cached
entry."

Stated positively, but with no requirements (which may, in fact, be all
right). This literally means: "If a cache is validating a cache entry, then
the cache can generate 1xx warnings for any message [but the cache entry
being validated is implied]. This is informative, and any normative sections
that conflict with this statement override."


BAD

"1xx warn-codes MUST be generated by a cache only when validating a cached
entry."

Stated positively, but English and BCP14 are still messing things up.
Literally means: "When validating a cached entry, the cache has to generate
1xx warn-codes for some message or another [though the entry being validated
is implied] -- if it doesn't, it's in violation of the spec."


BAD

"1xx warn-codes MUST only be generated by a cache when validating a cached
entry."

Stated positively, but still screwy. Literally: "Nothing but caches are
allowed to generate 1xx warn-codes, and the caches are allowed to generate
1xx warn-codes for any message if and only if the cache is validating a
cached entry [but the cache entry being validated is the implied target for
the generated warn-code]. Breaking these rules means you're in violation of
the spec."


BETTER

"Caches MUST ensure that they generate 1xx warn-codes for cached entries
only, and that 1xx warn-codes are generated for a cached entry during
validation of that entry only."

Stated positively, with the implicit target of the warn-code made explicit.
However, I prefer to use "MUST NOT ... except" instead of "MUST ... only",
because even (native) English-speakers have difficulty using and
understanding the word "only" in a correct, unambiguous way. Since being
correct and unambiguous is especially important in a spec, avoiding "only"
is probably a Good Thing.


BETTER (new proposed)

"A cache MUST NOT generate 1xx warn-codes for any messages except cache
entries, and MUST NOT add 1xx warn-codes to cache entries except in response
to a validation attempt."

Stated negatively, with the warn-code target made explicit. Correct,
unambiguous, and easy to understand. This is my suggestion, as revised to
allow servers to generate warn-codes (by virtue of specifying only cache
behavior).
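
As a rough illustration (not part of the spec or of the proposed text), the
new wording can be read as a simple predicate. The sketch below is in Python;
the function and parameter names are hypothetical, and it assumes that "1xx"
means warn-codes 100 through 199.

```python
def is_1xx(warn_code: int) -> bool:
    """True for warn-codes in the 1xx class (assumed to be 100..199)."""
    return 100 <= warn_code <= 199


def cache_may_add_warning(warn_code: int,
                          target_is_cache_entry: bool,
                          in_response_to_validation: bool) -> bool:
    """Return False if the proposed wording forbids a cache from adding
    this warn-code; True if the proposed wording does not forbid it."""
    if not is_1xx(warn_code):
        # The proposed sentence constrains 1xx warn-codes only.
        return True
    # MUST NOT generate 1xx for anything but cache entries, and
    # MUST NOT add 1xx to cache entries except in response to validation.
    return target_is_cache_entry and in_response_to_validation
```

Note that the predicate only says what is not forbidden; whether a given
warn-code is required in a given situation is left to section 14.46.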


> But your proposed fix forgets that 1xx warn-codes must be deleted on
> successful validation.

Sorry, I wasn't totally clear about my intent: I'm not replacing the
*entire* paragraph, just the last two sentences. There are no issues with
the first sentence. My mistake.
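
For reference, the requirement in that first sentence (1xx warnings MUST be
deleted after a successful revalidation) might be sketched as follows. This
is a minimal Python sketch under stated assumptions: the warning values are
assumed to be already split into individual strings that each begin with the
three-digit warn-code, the function name is hypothetical, and the host name
is a placeholder.

```python
def drop_1xx_warnings(warning_values):
    """Return the warning values that survive a successful revalidation,
    i.e. everything except those whose warn-code is in the 1xx class."""
    kept = []
    for value in warning_values:
        # Assumption: each value starts with its warn-code, then a space.
        warn_code = int(value.split()[0])
        if not 100 <= warn_code <= 199:
            kept.append(value)
    return kept


# Hypothetical stored warnings on a cache entry.
stored = [
    '110 cache.example "Response is stale"',
    '214 cache.example "Transformation applied"',
]
remaining = drop_1xx_warnings(stored)
```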

> (Side note: Tricky is that section 14.46 does mention that warnings from
> the validated response must be used instead. So remove the old 1xx
> warnings, as these do not apply any more and add the new warnings
> instead. Of course these new warnings may include 1xx warn-codes again,
> but for this case.)

I don't think I understand what your point is, here.

> I do not suggest any corrections to this paragraph for this reason.

Well, I still do; hopefully I've convinced you that what this section is
actually saying is probably not what is really meant.

> >3. One would think that proxies could include caches (though I
> >have yet to find where this is permitted with a true BCP14 MAY).
>
> Read the definitions:

Yes, let's!

> Client is a program.
Informative.

> Server is a program.
Informative.

> Proxy is a program.
Informative.

> Cache is a program's local store and ....
Informative.

> Thus a cache may be a proxy's local store and ....
Informative.

You left out the *only* normative bit in the definitions (which,
incidentally, is the one that makes it so that proxies can't have caches):

* Proxies MUST conform to both server and client requirements
Normative (!)

> Funny you say it correctly below.

It's not funny, it's intentional ;). Proxies MUST conform to both server and
client requirements. Clients MUST NOT generate 1xx warn-codes. Caches MUST
generate 110, 111, and 113 warn-codes in certain situations. Thus, a client
MUST NOT have a cache (because if it did, the client would generate
warn-codes in the cache, violating the spec), and by extension a proxy MUST
NOT have a cache (because it would be violating a client rule if it did).
This is clearly not what the spec writers intended, but it's what they wrote
nonetheless.
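
The chain of reasoning above can be sketched mechanically. The following
Python sketch is illustrative only: the requirement sets are a deliberately
simplified, hypothetical model (one action, two requirement levels), not a
transcription of RFC 2616, and the names are invented for the example.

```python
ACTION = "generate 1xx warn-code"

# Simplified requirement sets, each a set of (level, action) pairs.
CLIENT = {("MUST NOT", ACTION)}   # 13.1.2: MUST NOT be generated by clients
SERVER = set()                    # no relevant requirement modelled here
CACHE = {("MUST", ACTION)}        # 14.46: 110, 111, 113 in certain situations
PROXY = CLIENT | SERVER           # proxies MUST conform to both


def conflicts(*requirement_sets):
    """Return the actions that are both required and forbidden once all
    the given requirement sets apply to the same program."""
    combined = set().union(*requirement_sets)
    musts = {action for level, action in combined if level == "MUST"}
    must_nots = {action for level, action in combined if level == "MUST NOT"}
    return musts & must_nots


client_with_cache = conflicts(CLIENT, CACHE)
proxy_with_cache = conflicts(PROXY, CACHE)
server_with_cache = conflicts(SERVER, CACHE)
```

Under this model, only the server-with-cache combination is free of
conflicts, which matches the absurd conclusion drawn in the original
message.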

> >However, the wording in the offending part of 13.1.2 makes it impossible
> >to satisfy the requirements of a cache and the requirements of a client
> >(and, by extension, proxy) simultaneously. A cache is not an independent
> >program; it is part of a program (as per the 1.3 definition). A client is
> >a program, and it can contain a cache (again, from 1.3), but this limits
> >the cache's behavior to the set intersection of allowed behaviors for
> >caches and clients (due to how "client" is defined). This leads to a
> >conflict where the cache MUST generate a 1xx code, but a
>
> ...cache "MAY" generate..., not MUST.
> You mentioned that in the above items.

Please re-read the references I included in my original message. Section
14.46 (page 149), in its definitions of warn-codes 110, 111, and 113, clearly
states that a cache MUST generate these warn-codes under certain conditions.
That's the whole point, here; the MAY in section 13.1.2 cannot be correct,
and needs to be removed. The cache MUST generate specific 1xx warn-codes, but
only under certain conditions; English "may" means this, but BCP14 MAY does
NOT.

> >client MUST NOT generate a 1xx code. Thus, we're left having to conclude
> >that caches can exist only as part of independent servers (which have
> >their content pushed to them, or delivered through some out-of-band
> >method).
>
> Thus, invalid.

That, too, is the point. The conclusion drawn from the conformance points in
the spec leads to utter absurdity. Thus, one of the following must be true:

1. I botched my logic somewhere.
2. The spec really intends this silly conclusion.
3. There's an error in the spec.

I argue that (3) is the case. I think that we can both agree that (2) is
very, very unlikely. I also think that (1) will be very hard to show,
because the spec is quite clearly contradicting itself -- but if you can
point me to a clause that breaks my logic (or show that I've made an error
in my logic, or used a false premise), I'll be happy to re-evaluate my
position.

> Your claim fails on wrong copying of the MUST and MAY clauses in the
> spec.

Are you asking me to copy, verbatim, the entire sections I'm referring to? I
have not taken anything out of context that I am aware of, and you are free
to double-check with the official RFC if you wish (I have included sections
and page numbers for just that purpose). If you think I have taken something
out of context ("wrong copying"), please be more explicit as to the
sections, pages, and clauses that you feel I have wrongfully omitted.
However, I am confident that even if I were to use the complete text of the
sections I referred to, my argument would still hold (since I'm working on a
heavily-annotated hard-copy of the spec, it's hard to take things out of
context).

> Only a slight typo correction might be in place.

You mean MAY -> may, right? ;)


Thanks for your analysis,

-- Travis
Received on Wednesday, 27 December 2006 21:46:39 UTC