- From: Julian Reschke <julian.reschke@gmx.de>
- Date: Tue, 01 Nov 2011 20:59:33 +0100
- To: Yutaka OIWA <yutaka@oiwa.jp>
- CC: HTTP Working Group <ietf-http-wg@w3.org>
On 2011-11-01 00:25, Yutaka OIWA wrote:
> Dear Julian,
>
> thank you for detailed and prompt response.
> Please allow me to answer in a random order.
>
> 2011/11/1 Julian Reschke<julian.reschke@gmx.de>:
>
>>> No, in XML cases, single quotes and double quotes has been clearly
>>> defined to be two representations of the same thing.
>>> There is a clear semantic definitions, a definition how to compare these
>>> twos,
>>> and a definition that anyone should not distinguish between these.
>>
>> And I believe that we need to make this statement for parameters in HTTP
>> header fields as well.
>
> Hmm, this makes sense. Current "equivalence" rule seems too implicit for me.
>
>> In P2, we say:
>>
>> "Many header fields use a format including (case-insensitively) named
>> parameters (for instance, Content-Type, defined in Section 6.8 of [Part3]).
>> Allowing both unquoted (token) and quoted (quoted-string) syntax for the
>> parameter value enables recipients to use existing parser components. When
>> allowing both forms, the meaning of a parameter value ought to be
>> independent of the syntax used for it (for an example, see the notes on
>> parameter handling for media types in Section 2.3 of [Part3])."
>
> How about rephrasing this to something about:
> Many header fields use a format including (case-insensitively) named
> parameters (for instance, Content-Type, defined in Section 6.8 of [Part3]).
> It should be aware that many existing parser components
> do not distinguish unquoted (token) and quoted (quoted-string) syntax
> for parameter values. Therefore,
> whenever defining a new parameter, the meaning of a parameter value
> SHOULD NOT be dependent of the syntax used for it
> (for an example, see the notes on parameter handling for media types
> in Section 2.3 of [Part3]).
> << Receivers are RECOMMENDED to tolerate both forms of parameters
> interchangeably.>> (this<< >> may or may not be included)

Could you elaborate why you think it needs rephrasing?

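The P2 text quoted above asks that the meaning of a parameter value not depend on whether it was sent as a token or as a quoted-string. A minimal Python sketch of a recipient that behaves this way might look like the following. The function names `parse_param_value` and `quote_param_value` are hypothetical and not taken from any specification or library; the token character set is the one given by the HTTP/1.1 grammar, and the quoted-string handling is simplified (it does not restrict which octets may appear inside the quotes).

```python
import re

# token = 1*tchar, using the tchar set from the HTTP/1.1 grammar.
TOKEN_RE = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")

# quoted-string, simplified: a double-quoted run in which a backslash
# escapes the character that follows it (quoted-pair).
QUOTED_RE = re.compile(r'"((?:[^"\\]|\\.)*)"')


def parse_param_value(raw: str) -> str:
    """Return the parameter value regardless of the syntax used to send it.

    Hypothetical helper: both `1` and `"1"` yield the string "1", so later
    code cannot accidentally attach meaning to the quoting.
    """
    match = QUOTED_RE.fullmatch(raw)
    if match:
        # Remove the backslash from each quoted-pair.
        return re.sub(r"\\(.)", r"\1", match.group(1))
    if TOKEN_RE.fullmatch(raw):
        return raw
    raise ValueError(f"not a token or quoted-string: {raw!r}")


def quote_param_value(value: str) -> str:
    """Serialize a value as a quoted-string, escaping backslash and quote.

    Hypothetical sender-side helper: always quoting avoids having to decide
    whether a given value happens to be a valid token.
    """
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'
```

With this sketch, `parse_param_value('1')` and `parse_param_value('"1"')` return the same value, and anything produced by `quote_param_value` parses back to the original string.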
> Technically speaking,
> To enable "recipients to use existing parser components",
> the most important thing is the third sentence of the above paragraph.
> If we used 1 and "1" in different and defined meanings,
> it will break such a parser.
> I agree this direction is correct, under the "Postel's principle".

Indeed.

> # I assume that a token in realm is currently an "undefined behavior",
> # allowing receivers to treat it as if it were a string.

Undefined means undefined, so recipients essentially can do whatever they want.

> Oppositely, Sending side does not need to send out both forms randomly,
> and specifications do not need to strongly certify both.
> Allowing sender-side to sending string unquoted is a bad idea,
> as it always gets complex and is bug-prone when special characters are found
> to be sent. If any implementations have a feature correctly quoting any
> non-token strings, they can just send a quoted string anytime. So,
> My opinion is still to require "normative forms" for the sender's side,
> also as par the Postel's principle.
> # So, if we had the above<< >> sentence, the "realm "=" quoted-string"
> # rule is not a bad thing, I think (now read as the sender's principle).

If I understand you correctly, you are arguing that we should leave the ABNF alone and just put additional requirements on the recipient.

That's one way to do it, but *if* we require recipients to accept tokens as well it seems to be pointless not to say that in the ABNF.

>>> Hmm, I don't agree with this idea, actually.
>>> Tokens (meanings usually defined for each token except general integers
>>> etc.)
>>> has a distinct semantics than strings in general.
>>
>> I disagree. It's just a syntactical difference.
>
> Personally still disagreeing, but it is just the matter of definition.
> If we explicitly make it clear for HTTP/1.1 BIS and future,
> I will follow it for future (including current drafts).
>
> But one thing to be considered: case insensitivity.
> Tokens in RHS are often (not always?) case insensitive, and
> strings are mostly (always?) case sensitive.
> If we say, for example, "strings to be accepted whenever tokens
> are requested", it is OK for me. But I am a bit still hesitate
> about saying "tokens are just special cases of unquoted strings."

No, tokens aren't always case-insensitive. Method names are one example.

> e.g. in RFC2617 Digest, the parameter "stale" is explicitly case insensitive.
> "qop" and others are implicitly insensitive (by a text in RFC2616, Sec 2.1),
> but the naming of LHEX (lower hex?) rule confuses me for nc-value.
>
>> Do you have an example where this interpretation breaks current
>> implementations?
>
> Personally YES, but because mine is not in the wild, I can change it now :-)
> I briefly checked sources of a few Digest implementations,
> and all of them works with this change.
> (But we may need more interop check for Digest implementations...)
>

Best regards, Julian
Received on Tuesday, 1 November 2011 20:06:53 UTC