Re: p1: whitespace in request-target from Roy T. Fielding on 2013-05-19 (ietf-http-wg@w3.org from April to June 2013)

From: Roy T. Fielding <fielding@gbiv.com>
Date: Sun, 19 May 2013 12:58:47 -0700
To: Willy Tarreau <w@1wt.eu>
Cc: Mark Nottingham <mnot@mnot.net>, Amos Jeffries <squid3@treenet.co.nz>, "ietf-http-wg@w3.org Group" <ietf-http-wg@w3.org>
Message-Id: <04F00D4A-09EE-4729-AB44-EAD26522192E@gbiv.com>

On Apr 29, 2013, at 10:52 PM, Willy Tarreau wrote:

> On Tue, Apr 30, 2013 at 01:18:43PM +1000, Mark Nottingham wrote:
>> So, I'm not hearing you say "don't make this a MUST" -- just noting that some
>> broken software is out there; correct?
> 
> Amos' last sentence makes me understand "please don't make this a MUST" :
> 
>> The actual security worst-case risk of this is undeterminable, but it's
>> not going to be good for the transaction at the best of times
> 
> And I also agree with him that the wording currently is ambiguous because
> it can either be understood as "servers and intermediaries should do this
> since they're accepting user-typed URIs" or as "clients should fix user-
> typed requests before sending".
> 
> Thus in order to avoid any ambiguity, I would propose two sentences instead
> of one :
> 
>>  For robustness, software that accepts user-typed URI should attempt
>>  to recognize and strip both delimiters and embedded whitespace.

That text is in RFC 3986, not the HTTP specs, and it clearly refers
to user agent software, not Squid.

> Would become :
> 
>    Clients MUST NOT send user-typed delimiters and embedded whitespace
>    as-is in URIs, and SHOULD either encode them or strip them. Alternatively
>    they MAY simply refuse to perform the request.
> 
>    Servers and intermediaries MUST NOT try to fix embedded spaces and
>    delimiters in URIs, as doing so could lead to interoperability issues
>    and make several components in the chain understand different things.
>    When a request does not parse exactly as defined in the ABNF, an error
>    400 (Bad Request) MUST be returned to the client.
> 
> Comments ?

The text currently in p1 (latest) is

   The request-target identifies the target resource upon which to apply
   the request, as defined in Section 5.3.

   Recipients typically parse the request-line into its component parts
   by splitting on whitespace (see Section 3.5), since no whitespace is
   allowed in the three components.  Unfortunately, some user agents
   fail to properly encode or exclude whitespace found in hypertext
   references, resulting in those disallowed characters being sent in a
   request-target.

   Recipients of an invalid request-line SHOULD respond with either a
   400 (Bad Request) error or a 301 (Moved Permanently) redirect with
   the request-target properly encoded.  Recipients SHOULD NOT attempt
   to autocorrect and then process the request without a redirect, since
   the invalid request-line might be deliberately crafted to bypass
   security filters along the request chain.

I think the requirement is adequately explained.  It is a SHOULD instead
of a MUST because there was no specific error handling defined for this
case in RFC 2616.
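
For illustration only, here is a minimal Python sketch of how a recipient
might apply the p1 text quoted above: split the request-line, and when the
request-target turns out to contain whitespace, answer with a 400 or with a
301 carrying the encoded target instead of fixing it up and processing it.
The function name, the allow_redirect flag, the returned action strings,
and the choice to redirect only origin-form targets are all assumptions
made for this example; none of them come from the draft.

```python
from urllib.parse import quote

# Characters left untouched when re-encoding: URI reserved characters plus
# "%", so percent-escapes that are already present are not double-encoded.
SAFE = "/?#[]@!$&'()*+,;=:%"


def classify_request_line(line, allow_redirect=True):
    """Return (action, location) for a request-line given without its CRLF.

    action is "process", "301", or "400"; location is set only for "301".
    (Hypothetical helper, not an API from any HTTP library.)
    """
    # The method never contains a space and the HTTP-version is the last
    # token, so whatever lies between them is the request-target as sent.
    method, _, rest = line.partition(" ")
    target, _, version = rest.rpartition(" ")

    if not method or not target or not version.startswith("HTTP/"):
        return "400", None
    if any(ch.isspace() for ch in method + version):
        return "400", None
    if not any(ch.isspace() for ch in target):
        return "process", None  # splits cleanly into three components

    # Invalid request-line: do not autocorrect and process it in place.
    # Assumption: only origin-form targets (starting with "/") are redirected.
    if allow_redirect and target.startswith("/"):
        return "301", quote(target, safe=SAFE)
    return "400", None
```

With this sketch, "GET /my file.html HTTP/1.1" yields a 301 whose location
is "/my%20file.html", and the same line with allow_redirect=False yields a
400.  It checks only whitespace; a real parser would also validate the
method, the target, and the version against the ABNF.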

....Roy

Received on Sunday, 19 May 2013 19:59:00 UTC