Re: #320: add advice on defining auth scheme parameters from Julian Reschke on 2011-11-01 (ietf-http-wg@w3.org from October to December 2011)

From: Julian Reschke <julian.reschke@gmx.de>
Date: Tue, 01 Nov 2011 20:59:33 +0100
To: Yutaka OIWA <yutaka@oiwa.jp>
CC: HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <4EB04FA5.9030600@gmx.de>
On 2011-11-01 00:25, Yutaka OIWA wrote:
> Dear Julian,
> thank you for the detailed and prompt response.
> Please allow me to answer in a random order.
>
> 2011/11/1 Julian Reschke <julian.reschke@gmx.de>:
>
>>> No, in the XML case, single quotes and double quotes have been clearly
>>> defined to be two representations of the same thing.
>>> There is a clear semantic definition, a definition of how to compare
>>> the two, and a definition that no one should distinguish between them.
>>
>> And I believe that we need to make this statement for parameters in HTTP
>> header fields as well.
>
> Hmm, this makes sense.  The current "equivalence" rule seems too implicit to me.
>
>> In P2, we say:
>>
>> "Many header fields use a format including (case-insensitively) named
>> parameters (for instance, Content-Type, defined in Section 6.8 of [Part3]).
>> Allowing both unquoted (token) and quoted (quoted-string) syntax for the
>> parameter value enables recipients to use existing parser components. When
>> allowing both forms, the meaning of a parameter value ought to be
>> independent of the syntax used for it (for an example, see the notes on
>> parameter handling for media types in Section 2.3 of [Part3])."
>
> How about rephrasing this to something like:
>    Many header fields use a format including (case-insensitively) named
>    parameters (for instance, Content-Type, defined in Section 6.8 of [Part3]).
>    Be aware that many existing parser components
>    do not distinguish between unquoted (token) and quoted (quoted-string)
>    syntax for parameter values. Therefore,
>    whenever defining a new parameter, the meaning of a parameter value
>    SHOULD NOT be dependent on the syntax used for it
>    (for an example, see the notes on parameter handling for media types
>    in Section 2.3 of [Part3]).
>    << Receivers are RECOMMENDED to tolerate both forms of parameters
>    interchangeably. >>  (the << >> sentence may or may not be included)

Could you elaborate why you think it needs rephrasing?

> Technically speaking,
> to enable "recipients to use existing parser components",
> the most important thing is the third sentence of the above paragraph.
> If we used 1 and "1" with different defined meanings,
> it would break such a parser.
> I agree this direction is correct, under "Postel's principle".

Indeed.
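
To illustrate why 1 and "1" must mean the same thing, here is a minimal sketch in Python of a recipient-side helper that accepts either syntax and returns the same value. The name parse_param_value is hypothetical, and the token character set is an assumption based on the RFC 2616 Section 2.2 definition (any CHAR except CTLs and separators); it is not taken from any draft text.

```python
import re

# Assumed token characters, following RFC 2616 Section 2.2.
TOKEN_RE = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")
# quoted-string: DQUOTE, then ordinary characters or backslash-escaped pairs, DQUOTE.
QUOTED_RE = re.compile(r'"((?:[^"\\]|\\.)*)"')


def parse_param_value(raw: str) -> str:
    """Return the parameter value, independent of the syntax used for it."""
    raw = raw.strip()
    quoted = QUOTED_RE.fullmatch(raw)
    if quoted:
        # Undo quoted-pair escaping: \x becomes x.
        return re.sub(r"\\(.)", r"\1", quoted.group(1))
    if TOKEN_RE.fullmatch(raw):
        return raw
    raise ValueError(f"neither token nor quoted-string: {raw!r}")
```

With this helper, parse_param_value('1') and parse_param_value('"1"') yield the same string, so a specification that gave the two forms different meanings could not be implemented on top of it.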

> # I assume that a token in realm is currently an "undefined behavior",
> # allowing receivers to treat it as if it were a string.

Undefined means undefined, so recipients essentially can do whatever 
they want.

> Conversely, the sending side does not need to send out both forms randomly,
> and specifications do not need to strongly certify both.
> Allowing the sender side to send strings unquoted is a bad idea,
> as it always gets complex and bug-prone when special characters need
> to be sent.  If an implementation has a feature for correctly quoting any
> non-token string, it can just send a quoted string every time.  So
> my opinion is still to require "normative forms" on the sender's side,
> also as per Postel's principle.
> # So, if we had the above << >> sentence, the "realm "=" quoted-string"
> # rule is not a bad thing, I think (now read as the sender's principle).

If I understand you correctly, you are arguing that we should leave the
ABNF alone and just put additional requirements on the recipient. That's
one way to do it, but *if* we require recipients to accept tokens as
well, it seems pointless not to say so in the ABNF.
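
For the sender-side "normative form" discussed above, a minimal sketch in Python of a sender that always emits a quoted-string, so it never has to decide whether a value happens to be a valid token. The names quote_param_value and format_challenge are hypothetical, and the escaping rule (backslash before backslash and double quote) is an assumption based on the quoted-pair rule.

```python
def quote_param_value(value: str) -> str:
    # Escape backslashes first, then double quotes, then wrap in DQUOTEs.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def format_challenge(scheme: str, params: dict) -> str:
    # Serialize every parameter value in quoted-string form.
    rendered = ", ".join(
        f"{name}={quote_param_value(value)}" for name, value in params.items()
    )
    return f"{scheme} {rendered}"
```

For example, format_challenge("Basic", {"realm": "example"}) produces Basic realm="example". A recipient that also tolerates realm=example loses nothing by the sender being strict.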

>>> Hmm, I don't agree with this idea, actually.
>>> Tokens (whose meanings are usually defined per token, except for general
>>> integers etc.)
>>> have semantics distinct from strings in general.
>>
>> I disagree. It's just a syntactical difference.
>
> Personally I still disagree, but it is just a matter of definition.
> If we make it explicitly clear for HTTP/1.1 BIS and the future,
> I will follow it from now on (including in current drafts).
>
> But one thing to consider: case insensitivity.
> Tokens on the RHS are often (not always?) case insensitive, and
> strings are mostly (always?) case sensitive.
> If we say, for example, "strings are to be accepted whenever tokens
> are requested", that is OK with me.  But I am still a bit hesitant
> about saying "tokens are just special cases of unquoted strings."

No, tokens aren't always case-insensitive. Method names are one example.
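
Put differently, case sensitivity can be treated as a property of the individual parameter's definition rather than of the syntax used to send it. A minimal sketch in Python; the table and the name values_equal are hypothetical, and the two entries are assumptions drawn from RFC 2617 ("stale" compared case-insensitively, "realm" case-sensitively).

```python
# Hypothetical per-parameter rule: True means the value is compared
# case-insensitively. Parameter names themselves are case-insensitive.
CASE_INSENSITIVE_VALUES = {"stale": True, "realm": False}


def values_equal(name: str, a: str, b: str) -> bool:
    # The comparison rule depends on the parameter, never on whether the
    # value arrived as a token or as a quoted-string.
    if CASE_INSENSITIVE_VALUES.get(name.lower(), False):
        return a.lower() == b.lower()
    return a == b
```

Here values_equal("Stale", "TRUE", "true") holds while values_equal("realm", "Example", "example") does not, whichever syntax the sender used.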

> e.g. in RFC2617 Digest, the parameter "stale" is explicitly case insensitive.
> "qop" and others are implicitly insensitive (by a text in RFC2616, Sec 2.1),
> but the naming of LHEX (lower hex?) rule confuses me for nc-value.
>
>> Do you have an example where this interpretation breaks current
>> implementations?
>
> Personally YES, but because mine is not in the wild, I can change it now :-)
> I briefly checked the sources of a few Digest implementations,
> and all of them work with this change.
> (But we may need more interop check for Digest implementations...)
>

Best regards, Julian
Received on Tuesday, 1 November 2011 20:06:53 UTC