Re: I-D Action:draft-nottingham-http-pipeline-00.txt from Willy Tarreau on 2010-08-11 (ietf-http-wg@w3.org from July to September 2010)

From: Willy Tarreau <w@1wt.eu>
Date: Wed, 11 Aug 2010 11:34:13 +0200
To: Mark Nottingham <mnot@mnot.net>
Cc: Adrien de Croy <adrien@qbik.com>, HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <20100811093413.GA894@1wt.eu>

On Wed, Aug 11, 2010 at 07:17:09PM +1000, Mark Nottingham wrote:
> 
> On 11/08/2010, at 6:53 PM, Willy Tarreau wrote:
> 
> > On Wed, Aug 11, 2010 at 05:29:32PM +1000, Mark Nottingham wrote:
> >> It's interesting, but it would require browsers to spew Req-MD5 headers into requests unconditionally... something that I can't imagine they're likely to want to do (especially at first, when adoption on the server side is low).
> > 
> > Well, it's cheap and scalable. Also, it's hard to imagine that browsers
> > will both want pipelining to work and be unwilling to do anything for it.
> 
> That's a false choice; they don't mind doing something, but they do mind putting extra bytes on the wire for every HTTP request on the planet if it can be avoided.

OK, but a "Req-MD5: <hash>" is not *that* long. And we can even think of
much smaller strings: we don't need a cryptographically signed header, we
just need a hint to detect that a response does not match the request.
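
To illustrate how small such a hint could be, here is a minimal sketch in
Python. It assumes the hint is a truncated MD5 of the method and URI and
that the server echoes it back unchanged; the function names, the input
format and the 8-character length are hypothetical and not taken from any
draft.

```python
import hashlib


def request_hint(method: str, uri: str, length: int = 8) -> str:
    """Return a short, non-cryptographic hint for a request (assumed format)."""
    digest = hashlib.md5(f"{method} {uri}".encode("utf-8")).hexdigest()
    return digest[:length]


def response_matches(method: str, uri: str, echoed_hint: str) -> bool:
    """Check an echoed hint against the request the client believes it sent."""
    if not echoed_hint:
        # No hint echoed back: nothing can be concluded, so do not report a match.
        return False
    return echoed_hint == request_hint(method, uri, len(echoed_hint))
```

A truncated hash like this only detects accidental mismatches; it offers
no protection against a party that deliberately forges the value.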

> > The
> > "connection: pipeline" method suggested by Martin looks perfect to me
> > in this regard, but it does not address the point you want to address
> > which is to try to detect bad intermediaries without having to upgrade
> > them.
> 
> I think we're going to have to disagree, but I'd encourage you to look at this from the perspective of a browser vendor.

Well, I don't see why. If I were a browser vendor, I would very much like
to be able to simply emit that header once with the first request on the
connection and be able to rely on the response instead of having to perform
complex heuristics. If the goal is to avoid sending it with every request,
then at least it would only be sent with the first request of each
connection on which we plan to send more than one request. This is the
cheapest thing we can do in terms of network traffic.
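
As a rough sketch of that "first request only" behaviour, the Python
below adds a probe header to the first request of a connection, and only
when further requests are expected on it. The class name, the method name
and the choice of "Connection: pipeline" as the probe header are
assumptions made for illustration.

```python
class PipelinedConnection:
    """Tracks whether the probe header was already sent on this connection."""

    # Assumption: the probe is the "Connection: pipeline" header discussed here.
    PROBE_HEADER = ("Connection", "pipeline")

    def __init__(self):
        self.requests_sent = 0

    def headers_for_next_request(self, base_headers, expect_more_requests):
        """Return the headers to send, adding the probe on the first request only."""
        headers = dict(base_headers)
        if self.requests_sent == 0 and expect_more_requests:
            name, value = self.PROBE_HEADER
            headers[name] = value
        self.requests_sent += 1
        return headers
```

Every later request on the same connection then goes out with its usual
headers, so the extra bytes are paid once per connection.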

> > That's why I really think that having the server report the information
> > it got is nice. That way we can ensure that even in case of multiple
> > actors, the reported values are the correct ones.
> 
> I'm not concerned about reverse proxy deployments because they're under the control of the server, and looking at implementations I'm familiar with (Squid, Traffic Server, various others), it's pretty trivial to add a configuration option to take charge of this header. 

But doing it conditionally in complex infrastructures is quite complicated,
and will often result in getting multiple responses depending on which path
the request takes, or no response at all. If server-side admins find it
complex to configure and expensive in terms of bandwidth, it will probably
not see much adoption. Also, the servers where it matters most are static
file servers, especially the ones that can handle tens of thousands of
requests per second and want to optimize every single bit of processing.
Having them memcpy() the request+uri into a response header will have a small
but noticeable performance impact. This is even more true for the cases where
only a 304 is returned. And if clients notice that they regularly get mixed
responses (e.g. a header which does not exactly match, or multiple headers
of which only some match), they will probably abandon it too.
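
The cases a client would have to tell apart can be sketched as follows.
This is only an illustration in Python: the function name, the result
labels and the idea of comparing plain strings are assumptions, not
something specified in the draft.

```python
def classify_echo(expected, echoed_values):
    """Classify the echoed header values found in one response.

    expected      -- the value the client expects the server to echo
    echoed_values -- every value of the echoed header found in the response
    """
    values = [value.strip() for value in echoed_values]
    if not values:
        return "absent"    # no response header at all
    matches = [value == expected for value in values]
    if all(matches):
        return "match"     # every reported value agrees with the request
    if any(matches):
        return "mixed"     # several headers, only some of which match
    return "mismatch"      # nothing matches: the response may belong elsewhere
```

Only "match" lets the client keep pipelining with confidence; "absent"
and "mixed" leave it back with heuristics.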

Don't get me wrong, I'm not saying it's a bad solution. I think it tries to
address a specific issue of servers or intermediaries, at a high cost for
legitimate servers which already handle the behaviour correctly, and at a
higher cost for admins of complex infrastructures, with ultimately not much
better reliability on the client side.

Regards,
Willy

Received on Wednesday, 11 August 2010 09:34:51 UTC