Re: [#95] Multiple Content-Lengths from Willy Tarreau on 2011-03-10 (ietf-http-wg@w3.org from January to March 2011)

From: Willy Tarreau <w@1wt.eu>
Date: Thu, 10 Mar 2011 02:17:50 +0100
To: Adrien de Croy <adrien@qbik.com>
Cc: HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <20110310011750.GB30468@1wt.eu>
On Thu, Mar 10, 2011 at 01:40:28PM +1300, Adrien de Croy wrote:
> There's another way to look at it.  If all software bounced the invalid 
> response, the problems would soon be fixed.

In theory yes, but in practice that's not what we observe. Incompetent
developers write whatever code makes it into production, and nowadays
with N-tier applications they write poor code on both sides and call
their implementation "HTTP" because one of the functions they use at
some point surely has it in its name.

When any compliant device enters the arena, things break, more or less
seriously and more or less quickly. I spend my time arguing with
those people that respecting standards is extremely important because
the cost of staying away from them increases as time passes. Still,
it's not easy because they just read what they want to understand. So
in the end they believe strongly they're right and the rest of the
world is wrong. Since they take a lot of time to discover the rest of
the world, once in a while they fix a bug or two in their apps until
they start to work even with similar products. It's just a matter of
randomness in those environments, and that's why flexibility is very
important there.

> So, there are some vendors that would choose to bounce, and some would not.
> 
> Those that do not may be opening themselves up to security 
> vulnerabilities or other problems.

Yes, indeed. That's why I'm saying that when fallbacks are known to be
possible without taking risks, we should document them: it keeps the
worst things from being done.

> As a vendor of a product that bounces it, you could then argue that your 
> software was more secure.

Oh they know that, but when their apps are broken, being more secure
is the least of their concerns.

> In my experience, bank IT managers are among the more security-conscious
> of customers.

That's true on the design side, not on the operations side. The cost of
breakage is so high that any solution will do. The first example is the
common "I want to be able to replace this firewall with a cable without
delay in case of trouble with it".

> If all the C-L values are the same, it's perhaps not so dangerous to 
> strip duplicates, but otherwise choosing the smallest value may result 
> in truncation of the entity, and choosing the biggest may result in 
> response smuggling.

Exactly! That's why I was advocating for accepting them only when they're
exact duplicates. All other cases are very dangerous, and only exact
duplicates can happen by accident and/or through poor implementations
somewhere.
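
To illustrate why differing values are dangerous, here is a minimal
Python sketch. Everything in it is hypothetical: the split_body helper,
the byte stream and the two lengths are made up for the example and do
not come from any real product. It frames the same bytes twice, once
with the smaller Content-Length and once with the larger one.

```python
from typing import Tuple


def split_body(stream: bytes, content_length: int) -> Tuple[bytes, bytes]:
    """Return (body, leftover) when the body is content_length bytes long."""
    return stream[:content_length], stream[content_length:]


# Bytes that follow the header block of a response which (hypothetically)
# carried two Content-Length fields: 5 and len(stream).
embedded = b"HTTP/1.1 200 OK\r\nContent-Length: 4\r\n\r\nevil"
stream = b"hello" + embedded

# A party trusting the smaller value sees a 5-byte body, and then parses
# the leftover bytes as a second, attacker-chosen response.
body_small, rest_small = split_body(stream, 5)

# A party trusting the larger value sees one long body and nothing after it.
body_large, rest_large = split_body(stream, len(stream))
```

Two parties on the same connection that pick different values disagree
on where the message ends, which is the smuggling case; if the larger
value is the correct one, the party that picks the smaller value simply
truncates the entity.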

> There's no way of knowing at the time you need to 
> formulate a response for a waiting client.  If it's an attempted 
> smuggle, it could possibly be detected by looking for something like a 
> response at the offset specified by the smallest C-L value.  But if a 
> proxy has already sent the larger C-L value on to the client it's 
> hosed.  Likewise if the proxy sends the smaller C-L value on to the 
> client and the resource keeps coming.

I agree we should not try to guess anything. When I was confronted with
this question for haproxy, it was a tough one. I first chose not to
accept duplicates, but I had already encountered some of them a few
times. Finally, I realized that exact duplicates were acceptable by
their nature, and I reject anything else. I've never had any issue in
production with that solution.
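
The policy described above (accept exact duplicates, reject anything
else) could be sketched as follows in Python. This is not haproxy's
actual code: the function name resolve_content_length is hypothetical,
and comparing the values after parsing them as integers (so "42" and
"042" count as duplicates) is an assumption of this sketch.

```python
from typing import List, Optional


def resolve_content_length(values: List[str]) -> Optional[int]:
    """Return the agreed length, or None if no Content-Length was sent.

    Raises ValueError for a malformed value or for conflicting values.
    """
    if not values:
        return None
    parsed = set()
    for raw in values:
        text = raw.strip()
        # Only plain ASCII digits are accepted: no sign, no blanks inside.
        if not (text.isascii() and text.isdigit()):
            raise ValueError(f"invalid Content-Length: {raw!r}")
        parsed.add(int(text))
    if len(parsed) > 1:
        # Different values: do not guess, reject the message.
        raise ValueError(f"conflicting Content-Length values: {sorted(parsed)}")
    return parsed.pop()
```

With this sketch, resolve_content_length(["42", "42"]) yields 42, while
["42", "43"] raises an error instead of picking one of the two values.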

> So maybe we need to take a tougher stance.
> 
> In the end, we're talking about the spec here.  People sending multiple 
> C-Ls are already ignoring the spec.  People who want to not bounce 
> multiple C-Ls could likewise ignore the spec.  The spec however should 
> be clear on what should happen - what we want as a community the 
> direction to be. Otherwise it seems like a slippery slope.

As I said in another mail, it's not only C-L. There are some components
or frameworks where developers have little control over the headers
received and/or sent. Sometimes they blindly "set" a header, hoping it
was not already there and accepting the risk of a duplicate, and you get
two of them on the wire once in a while. That's pretty disgusting but
that's what many applications we use on a daily basis are doing,
unfortunately. If reverse proxies at boundaries did not perform that
much protocol normalization, we would see many more ugly things.
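
As a hypothetical illustration of how such duplicates appear, the Python
sketch below contrasts a blind append with a true replace. The helper
names append_header and replace_header and the header list are invented
for the example; they are not taken from any particular framework.

```python
from typing import List, Tuple

Headers = List[Tuple[str, str]]


def append_header(headers: Headers, name: str, value: str) -> Headers:
    """Blind "set": add the field without checking whether it exists."""
    return headers + [(name, value)]


def replace_header(headers: Headers, name: str, value: str) -> Headers:
    """Drop every existing field of that name (case-insensitive), then add."""
    kept = [(n, v) for n, v in headers if n.lower() != name.lower()]
    return kept + [(name, value)]


# Headers already produced by an earlier layer of the application.
upstream: Headers = [("Content-Type", "text/html"), ("content-length", "5")]

duplicated = append_header(upstream, "Content-Length", "5")
normalized = replace_header(upstream, "Content-Length", "5")

# Count how many Content-Length fields would end up on the wire.
count_dup = sum(1 for n, _ in duplicated if n.lower() == "content-length")
count_norm = sum(1 for n, _ in normalized if n.lower() == "content-length")
```

The blind append leaves two Content-Length fields in the message, while
the replace leaves exactly one.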

In my experience, duplicates do happen. People emitting them have no real
excuse for that: they should read specs and manuals before coding and file
bug reports if needed, but those things do exist. However, I've not seen
conflicting values of the same header (whether it was Content-Length or
Content-Type).

Regards,
Willy
Received on Thursday, 10 March 2011 01:18:22 UTC