Re: HTTPbis -10 drafts published from Willy Tarreau on 2010-07-14 (ietf-http-wg@w3.org from July to September 2010)

From: Willy Tarreau <w@1wt.eu>
Date: Wed, 14 Jul 2010 10:16:25 +0200
To: Julian Reschke <julian.reschke@gmx.de>
Cc: HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <20100714081625.GD2717@1wt.eu>
On Wed, Jul 14, 2010 at 09:42:11AM +0200, Julian Reschke wrote:
> On 14.07.2010 09:28, Willy Tarreau wrote:
> >Hi,
> >
> >On Tue, Jul 13, 2010 at 08:18:10AM +0200, Julian Reschke wrote:
> >><http://tools.ietf.org/html/draft-ietf-httpbis-p1-messaging-10#appendix-D.11>
> >
> >In this one we would need to add a bit of clarifications about the
> >value domains that may be used for integers. I have identified 2
> >situations where we can encounter integer overflows, with either
> >just impossibility to transfer data, or more important, security
> >issues :
> >
> >- 6.2.1. Chunked Transfer Coding
> >   chunk-size     = 1*HEXDIG
> >
> >- 9.2. Content-Length
> >   Content-Length-v = 1*DIGIT
> >
> >In both cases, we don't indicate any minimal size requirement.
> >It looks like both values are generally assumed to be 31 or 32
> >bits depending on signedness. BTW, I've once been unable to
> 
> How did you come to that conclusion?

Observations, tests and implementations. When testing some products,
it's often very instructive to try to send them too large content-length
values or chunked encodings. And implementations have to choose a type
for their value anyway. It's common to find "int" or "unsigned int"
there. Even in haproxy, I chose "unsigned int" (32 bit) for the chunk
size, and positive 63-bit value for the content length. At least I've
been very careful to detect all overflows everywhere, but this is also
because I'm sensibilized to the security implications. Other implementers
may simply consider that "that large is enough and if it fails above that
it does not matter".

> >download a DVD image from a site because the announced content
> >length was wrong due to integer overflow. I don't remember what
> >server it was though :-/
> 
> That's a server bug. If the content length exceeds what the server can 
> express in Content-Length, it should switch to chunked encoding.

Agreed, and that might also be why I've only seen that once ;-)

> >I think it is difficult to ask every implementation to support
> >at least XXX bits, but we should at least add as a strict
> >requirement that any agent along the chain must detect integer
> >overflows when parsing or printing these values, and must abort
> >processing if this happens.
> 
> I think that requirement is already implicit; if you can't parse a 
> length, don't pretend you can.

I agree with that, the issue is being aware that you can't parse.
I've seen HTTP parser code in enterprises... Believe me, sometimes
you're thankful to the application server below which already does
some cleaning. Most people are not aware of integer overflows and
will only think with "large enough" quantities. Above that is
infinite. The fact that an apparently large number decodes as a
small one is non-sense for some. Basically, you'd find things such
as :

   l = atol(get_header("content-length"));

And once you tell the guy that there is no error check and he even
supports negative values, he would simply switch to :

   l = strtoul(get_header("content-length"), &err, 0);

(which now supports octal decoding).

Since there are already some precisions in the draft, such as "any
value >= 0 is valid", I think it makes sense to add a few words at
some places to enforce certain things. Something like saying that
Content-length is a DECIMAL representation might ring a bell to the
implementer.

> >must be taken to ensure that the decoding is forced to base 10,
> >otherwise a number such as 0123 may be automatically interpreted
> >as the octal value for the decimal 83.
> 
> That's true, but again, that's a quality of implementation issue.

I completely agree. But if we're purist, it's noted nowhere that
the values are expressed in decimal form and must be decoded as
such ;-) The only thing that is indicated is that a DIGIT represents
a decimal 0-9.

> We've had similar discussions before: it's not clear that considerations 
> like these belong into the actual HTTP spec. On the other hand, a 
> separate document discussing issues like these could certainly be 
> written (IETF Informational, or even somewhere else?).

I think we could have something generic somewhere. But there are a
few points in HTTP which, if poorly implemented, may have important
side effects. A single sentence such as "Implementations MUST detect
integer overflows and integer parsing errors" for the Content-Length,
Bytes-range and chunk size is not too much and can help getting more
reliable and interoperable implementations.

Best regards,
Willy
Received on Wednesday, 14 July 2010 08:16:55 UTC