Re: rewritten section on message body and length from Roy T. Fielding on 2010-08-10 (ietf-http-wg@w3.org from July to September 2010)

From: Roy T. Fielding <fielding@gbiv.com>
Date: Tue, 10 Aug 2010 16:29:31 -0700
To: Willy Tarreau <w@1wt.eu>
Cc: HTTP Working Group <ietf-http-wg@w3.org>
Message-Id: <6B804D13-15BC-4671-A914-630D0A65B7C1@gbiv.com>

On Jul 27, 2010, at 10:50 PM, Willy Tarreau wrote:

> On Tue, Jul 27, 2010 at 08:52:39PM -0700, Roy T. Fielding wrote:
>> As part of the changes for draft 11, I merged the misnamed section on
>> message length into the message body section and then rewrote the
>> steps for determining the message body length to remove the
>> ambiguities noted previously in tickets #28, #90, and #95.
> 
> This rewording is really good. It clarifies the fact that content-length
> or T-E are recommended even in case of "connection: close" (some servers
> still don't emit them in such a case, Tomcat was fixed just a few weeks
> ago). It clarifies the handling of body requests which could have been
> complex for gateways because it was not perfectly clear what to do with
> the body for unknown methods. Now it's clear. The expected behaviour on
> the multiple content-length issue is clearly stated, so that looks fine to
> me. It is good that you have put a sentence about smuggling attacks, it
> will make implementors be more careful as it's not just a matter of
> compatibility with broken implementations. This section has always been
> one of the most sensitive ones of the spec for me and now it appears
> very clear.
> 
> I just have one comment on the block below:
> 
>>   3.  If a message is received without Transfer-Encoding and with
>>       either multiple Content-Length header fields or a single Content-
>>       Length header field with an invalid value, then the message
>>       framing is invalid and MUST be treated as an error to prevent
>>       request or response smuggling.  If this is a request message, the
>>       server MUST respond with a 400 (Bad Request) status code and then
>>       close the connection.  If this is a response message received by
>>       a proxy or gateway, the proxy or gateway MUST discard the
>>       received response, send a 502 (Bad Gateway) status code as its
>>       downstream response, and then close the connection.  If this is a
>>       response message received by a user-agent, the message-body
>>       length is determined by reading the connection until it is
>>       closed; an error SHOULD be indicated to the user.
> 
> On rare occasions, I have observed duplicated content-length headers
> in tcpdump captures. For this reason, I made the choice in haproxy to
> only accept them if *all* values are equal. I seem to remember that
> Squid considers the max of them (which covers any possible smuggling)
> and forwards only this one.
> 
> Do you think that such behaviours should now be changed (due to the
> "MUST" above), at the risk of occasionally blocking responses? I'm
> asking because the user-agent does not have as strict a requirement,
> and when you deploy a proxy or gateway somewhere and some 502s
> start to appear, users yell that the implementation is buggy because
> it worked without it due to the fact that the U-A was more tolerant
> (very common). So in my opinion, either we should make the U-A reject
> the incorrect response (in order to get buggy apps fixed), or we may
> relax the check for proxies and gateways in a responsible way (both
> behaviours described above seem efficient).

I think that multiple values of the same length would be okay,
though it will make the section a bit more complex.

I don't believe that choosing the max value is sufficient.
The message may have already passed through an intermediary
that looked only at the smaller value (e.g., limiting a virus
scan to that length) and so it must be exposed as an error.
Choosing the max would hide that error.
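
To make the distinction concrete, here is a minimal sketch of the
check being discussed: duplicate Content-Length values are accepted
only when they all agree, and differing values are reported as a
framing error instead of being collapsed to the max. The names
resolve_content_length and FramingError are hypothetical, and the
sketch assumes the caller has already collected one string per
Content-Length header field and has established that no
Transfer-Encoding is present. It is not taken from haproxy, Squid,
or the draft text.

```python
from typing import List, Optional


class FramingError(ValueError):
    """Raised when Content-Length values make the message framing invalid."""


def resolve_content_length(values: List[str]) -> Optional[int]:
    """Return the agreed body length, or None if no Content-Length was sent.

    `values` holds the raw value of each Content-Length header field
    (hypothetical input shape, one entry per field).
    """
    if not values:
        return None
    lengths = set()
    for raw in values:
        text = raw.strip()
        # Only plain ASCII digits are accepted; this rejects "", "-1", "+5", "1e3".
        if not (text.isascii() and text.isdigit()):
            raise FramingError(f"invalid Content-Length: {raw!r}")
        lengths.add(int(text))
    if len(lengths) > 1:
        # Differing values: an earlier hop may have trusted the smaller one,
        # so picking the max would hide the problem instead of exposing it.
        raise FramingError(f"conflicting Content-Length values: {sorted(lengths)}")
    return lengths.pop()
```

A server that catches FramingError on a request would answer 400 and
close the connection; a proxy or gateway that catches it on a
response would discard the response, send 502 downstream, and close,
as in the quoted step 3.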

....Roy
Received on Tuesday, 10 August 2010 23:30:02 UTC