Re: Question on HTTP 408 from Willy Tarreau on 2014-06-04 (ietf-http-wg@w3.org from April to June 2014)

From: Willy Tarreau <w@1wt.eu>
Date: Wed, 4 Jun 2014 14:49:10 +0200
To: Michael Sweet <msweet@apple.com>
Cc: "William Chan (陈智昌)" <willchan@chromium.org>, HTTP Working Group <ietf-http-wg@w3.org>, Matt Menke <mmenke@chromium.org>
Message-ID: <20140604124910.GV3154@1wt.eu>

Hi Michael,

On Wed, Jun 04, 2014 at 07:52:27AM -0400, Michael Sweet wrote:
> Willy,
> 
> On Jun 4, 2014, at 12:42 AM, Willy Tarreau <w@1wt.eu> wrote:
> > Hi Michael,
> > 
> > On Tue, Jun 03, 2014 at 07:54:18AM -0400, Michael Sweet wrote:
> >> William,
> >> 
> >> This sounds like a pretty obvious bug - a HTTP server only responds when it
> >> has received something.  The usual keep-alive timeout of connections is
> >> silent (server just closes the connection, with a preceding TLS shutdown for
> >> HTTPS).
> > 
> > Please could you check my other response to Roy ? In short I do instead
> > think that responding 408 is the only way to gracefully shut down *and*
> > inform the client it can safely retry and I really think it was designed
> > for this exact purpose.
> 
> I don't think it was designed for being returned when no request had started
> coming in, but rather when a partial request was received and then timed out.

It's hard to navigate through the W3C archives to find the original discussion
for this status code introduced in HTTP/1.1. Some minutes talk about request
timeout :

  http://lists.w3.org/Archives/Public/ietf-http-wg-old/1994SepDec/0257.html

On the WG, 408 was already mentioned in a few exchanges here :

  http://lists.w3.org/Archives/Public/ietf-http-wg-old/1995MayAug/0340.html
  http://lists.w3.org/Archives/Public/ietf-http-wg-old/1995SepDec/0158.html


The first mention of it I could find in a published draft was in draft-00
dated 1995/11/22 :

    http://www.w3.org/Protocols/HTTP/1.1/draft-ietf-http-v11-spec-00.txt

And the text says :

  "
   408 Request Timeout

   The client did not produce a request within the time that the 
   server was prepared to wait. The client may repeat the request 
   without modifications at any later time.
  "

At least I'm not seeing any mention of starting to send anything: there
is a client and it did not produce a request. That's how I've always read
it. Also, please note this specific point in -p2 :

   http://tools.ietf.org/html/draft-ietf-httpbis-p2-semantics-26#section-6.5.7
   6.5.7. 408 Request Timeout
   The 408 (Request Timeout) status code indicates that the server did
   not receive a complete request message within the time that it was
   prepared to wait.  A server SHOULD send the close connection option
   (Section 6.1 of [Part1]) in the response, since 408 implies that the
   server has decided to close the connection rather than continue
   waiting.  If the client has an outstanding request in transit, the
   client MAY repeat that request on a new connection.

Note the last sentence : "*If* the client has an outstanding request ...".
For me that clearly means that the client might very well receive a 408
while it did not have an outstanding request, which implies that no byte
was sent over the wire yet.
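
To make the two readings of 408 concrete, here is a minimal sketch in
Python of what a server could do when its request timeout fires. It is
not taken from any real server: the names build_408, on_request_timeout
and reply_when_idle are made up for illustration, and the body text is
arbitrary. With reply_when_idle=True the server answers 408 even if no
byte was received; with False it only answers when a partial request
was seen and otherwise closes silently.

```python
from typing import Optional


def build_408() -> bytes:
    """Build a 408 response carrying the close connection option."""
    body = b"Request timed out.\n"  # arbitrary body, for illustration only
    head = (
        "HTTP/1.1 408 Request Timeout\r\n"
        "Connection: close\r\n"
        "Content-Type: text/plain\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    ).encode("ascii")
    return head + body


def on_request_timeout(bytes_received: int,
                       reply_when_idle: bool = True) -> Optional[bytes]:
    """Return the bytes to send before closing, or None for a silent close.

    bytes_received is how much of a request arrived before the timeout.
    reply_when_idle is a hypothetical switch between the two behaviours.
    """
    if bytes_received > 0 or reply_when_idle:
        return build_408()
    return None  # silent close: the client learns nothing about its request


if __name__ == "__main__":
    print(on_request_timeout(0))                         # 408 on an idle connection
    print(on_request_timeout(0, reply_when_idle=False))  # None, i.e. silent close
```

In both modes the connection is closed afterwards; the only difference
is whether the client is told that nothing was processed.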

> There is nothing in any of the HTTP RFCs that talks about returning a
> response line, headers, and message body to the client without first
> receiving something from the client.

Of course, but the same is true for the reverse proposition: nothing
says it's not possible. And in fact, 400 Bad Request is a perfect
example. If the client sends unparsable crap (so nothing matching
even the beginning of a valid request), a 400 Bad Request will be
returned. We're talking about protocol errors here that lead to the
connection being closed, and I'm not seeing anything wrong with that.

> What CUPS does (and this was largely based on asking the question, "What does
> Apache HTTP Server do?" many years ago) is to silently close the connection
> after the configured timeout, which is large enough (300 seconds for CUPS and
> Apache) to account for any reasonable traffic delays (on the Earth, at
> least).

But as I said in the other thread, the spec prevents the client from retrying
a non-idempotent request if the server does this, while 408 explicitly allows
it to. So I think that the intent behind the 408 was pretty clearly to inform
the client that no damage was done and that it had a chance to retry. On silent
close, you must not resend a POST for example, since it might very well have
already been processed :

   http://tools.ietf.org/html/draft-ietf-httpbis-p1-messaging-26#section-6.3.1
   6.3.1. Retrying Requests
   Connections can be closed at any time, with or without intention.
   Implementations ought to anticipate the need to recover from
   asynchronous close events.

   When an inbound connection is closed prematurely, a client MAY open a
   new connection and automatically retransmit an aborted sequence of
   requests if all of those requests have idempotent methods (Section
   4.2.2 of [Part2]).  A proxy MUST NOT automatically retry non-
   idempotent requests.

   A user agent MUST NOT automatically retry a request with a non-
   idempotent method unless it has some means to know that the request
   semantics are actually idempotent, regardless of the method, or some
   means to detect that the original request was never applied.
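
The difference between the two outcomes can be sketched as a client-side
retry decision. This is only an illustration of the reading argued in
this mail, not the behaviour of any existing client: the function name,
the outcome strings and the known_not_applied flag are all hypothetical.
The set of idempotent methods follows Section 4.2.2 of Part2.

```python
# Methods defined as idempotent in Part2, Section 4.2.2.
IDEMPOTENT_METHODS = {"GET", "HEAD", "OPTIONS", "TRACE", "PUT", "DELETE"}


def may_retry_automatically(method: str, outcome: str,
                            known_not_applied: bool = False) -> bool:
    """Decide whether a user agent may resend a request on a new connection.

    outcome is "status_408" when the server answered 408 before closing,
    or "silent_close" when the connection was closed with no response.
    known_not_applied stands for "some means to detect that the original
    request was never applied" (a hypothetical input in this sketch).
    """
    if outcome == "status_408":
        # The server states that it did not receive a complete request,
        # so the request cannot have been processed.
        return True
    if outcome == "silent_close":
        # The request may already have been processed: only idempotent
        # methods are safe unless the client knows better.
        return method.upper() in IDEMPOTENT_METHODS or known_not_applied
    raise ValueError(f"unknown outcome: {outcome!r}")


if __name__ == "__main__":
    for method in ("GET", "POST"):
        for outcome in ("status_408", "silent_close"):
            print(method, outcome, may_retry_automatically(method, outcome))
```

A proxy would be stricter than this sketch on silent close, since it
MUST NOT automatically retry non-idempotent requests at all.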

> I can understand if you don't want to have as large of a timeout for a proxy
> that is shared by thousands of clients, but you can definitely make this
> configurable and default to something reasonable (30-60 seconds).  If the
> network connectivity is so bad that the request line and headers don't start
> flowing within that timeframe, then there is probably no point in retrying
> automatically anyways.

I'm seeing some sites running with smaller timeouts to limit the risk of DoS.
When you know that you can accept 300k connections per second, with a 5s
timeout you already have 1.5 million concurrent connections. It's not
unreasonable to use aggressive timeouts in such conditions (especially for
sites which run older pre-forked servers which cannot accept such large numbers).
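
The arithmetic behind these numbers is simply accept rate times timeout,
assuming the worst case where every accepted connection stays idle until
the timeout fires. A small Python sketch (the function name is made up;
the timeouts other than 5s are the ones mentioned elsewhere in this
thread):

```python
def idle_connections(accept_rate_per_s: int, timeout_s: int) -> int:
    """Worst-case number of idle connections held at any moment.

    Assumes every connection sends nothing and is only released when
    the request timeout expires.
    """
    return accept_rate_per_s * timeout_s


if __name__ == "__main__":
    rate = 300_000  # connections accepted per second, as in the text
    for timeout in (5, 30, 60, 300):
        print(f"{timeout:>3}s timeout -> {idle_connections(rate, timeout):,} connections")
```

With a 5s timeout this gives the 1.5 million figure above; with a 300s
timeout the same rate would mean 90 million connections.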

And you can clearly have a pause of a few seconds on 3G during a handover. I've
observed multi-second delays which add up with TCP retransmits to result
in something in the 10s range. The handover is an interesting case because
the network condition is temporarily bad and is good again once you get the
response, so there's a real incentive to retry instead of telling the user
"I'm sorry, I tried to validate your order but something has cut in the
middle so I don't know if you'll get it or not, please check your cart".

In the case of haproxy, sending the 408 only if something was started on
the connection is a trivial change, and I even delayed its final release
by a few days so that we have the time to discuss this use case and decide
what the proper cut-off mode is. But I think it would be a wrong change
that degrades the client's ability to seamlessly recover. For me the most
likely explanation for a close is "the server crashed after receiving the
request" (FIN) or "before receiving the request" (RST). 408 makes it pretty
clear that it's neither.

What would be interesting in fact is to know how various clients act on
silent close and on 408, with idempotent and non-idempotent requests. And
based on that we should likely clarify the text on what should be done.
Pat said that Firefox handles 408 fairly well (I found a bug report for
it being fixed in 2004). I don't know however how a POST over a silently
closed connection is handled there. Maybe non-browser clients like Curl
would help us determine the best behaviour as well. Daniel ?

Regards,
Willy
Received on Wednesday, 4 June 2014 12:49:46 UTC