Re: #231: Considerations for new headers from Julian Reschke on 2011-10-14 (ietf-http-wg@w3.org from October to December 2011)

From: Julian Reschke <julian.reschke@gmx.de>
Date: Fri, 14 Oct 2011 08:56:59 +0200
To: "Manger, James H" <James.H.Manger@team.telstra.com>
CC: httpbis Group <ietf-http-wg@w3.org>
Message-ID: <4E97DD3B.9010300@gmx.de>

On 2011-10-14 03:12, Manger, James H wrote:
>> Proposal:
>>
>> "Many header fields use a format including named parameters (for
>> instance, Content-Type). Parameter values should always allow both
>> unquoted (token) and quoted (quoted-string) values:"
>
>
> What is the benefit of requiring that both syntaxes are supported?
>
> I cannot see how a sender needs the flexibility to send a given parameter in either form. Do apps often pass string values to a generic library or framework that then decides between the token or quoted-string syntax? That seems quite unlikely.
>
> A receiver might prefer to use a generic parser (that accepts token or quoted-string for any parameter value) instead of one that knows the extra restrictions on particular parameters. That parser will still work: it will accept all valid values. The only issue is that the parser can accept invalid values as well (e.g., accept a token value when a spec says a quoted-string must be sent). That is, the only issue is that the parser might be too lenient. The receiving app could enforce the extra parameter-specific restrictions after the generic parser has passed up the result, though many receivers might not bother (or the generic parser might have discarded the info).

A receiver using a lenient parser will accept all values, and this
creates an interop issue. If the majority of recipients accept both
syntaxes, then supporting both becomes a de facto requirement.

It's very unlikely that recipients using a generic parser will reject a 
certain syntax *after* parsing. The simple reason for that is that 
generic parsers usually do not retain the information whether the value 
was quoted or not.
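
To illustrate the point about generic parsers, here is a minimal sketch in
Python. It is not taken from any real library: the names TOKEN, QUOTED,
PARAM and parse_params are made up for this example, and the token
character set is only an approximation of the HTTP grammar. It shows how
such a parser typically returns the same result for both syntaxes, so the
caller cannot tell afterwards which one was sent.

```python
import re

# Assumption: approximate token and quoted-string grammars, good enough
# for illustration but not a complete HTTP field-value parser.
TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
QUOTED = r'"(?:[^"\\]|\\.)*"'
PARAM = re.compile(r";\s*(" + TOKEN + r")\s*=\s*(" + TOKEN + r"|" + QUOTED + r")")


def parse_params(header_value):
    """Return {name: value} for the parameters in a header field value."""
    params = {}
    for name, raw in PARAM.findall(header_value):
        if raw.startswith('"'):
            # Strip the surrounding quotes and undo backslash escapes.
            # From here on, the fact that the value was quoted is lost.
            raw = re.sub(r"\\(.)", r"\1", raw[1:-1])
        params[name.lower()] = raw
    return params


print(parse_params('text/html; charset=utf-8'))    # {'charset': 'utf-8'}
print(parse_params('text/html; charset="utf-8"'))  # {'charset': 'utf-8'}
```

A recipient that wanted to reject the quoted form for a particular
parameter would need the parser to return an extra "was quoted" flag,
which is exactly the information that generic parsers usually drop.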

> Is the motivation for this proposal that:
> * Some receivers will accept a value as either a token or quoted-string regardless of the parameter's definition.
> * Being lenient (despite the IETF mantra) can be a security risk.
> * By allowing token and quoted-string in a parameter's definition, such receivers are not being lenient so there is less security risk.
>
>
> Or is the motivation that:
> * Some receivers will accept a value as either a token or quoted-string regardless of the parameter's definition.
> * Some senders will send a value as a token even if the definition says quoted-string (and vice versa) because it works with some lenient receivers.
> * Those senders will not interoperate with other (non-lenient) receivers.

Mainly the second, informed by the problems we have seen with existing 
parameters (think Content-Type/charset or WWW-Authenticate/Realm).

> Perhaps a better rule would be that a receiver of named parameters in a header field should be allowed to accept token or quoted-string (or RFC5987) values. Hence, specs should not assign different semantics to these different forms even if the spec requires senders to use one particular syntax.

I don't see the advantage. Special rules per header field, or even per
parameter, make it extremely unlikely that recipients will actually
behave differently for each of them. If they try (by writing custom
parsers), the risk of getting things wrong will actually be much higher.

Best regards, Julian

Received on Friday, 14 October 2011 06:57:42 UTC