Re: Apache and HTTP/1.1

On Wed, 19 Feb 1997, Roy T. Fielding wrote:

> Basically, what we are seeing is a lot of connections being left in
> FIN_WAIT_2 state after the server has closed the connection (for whatever
> reason, but usually due to a pre-request timeout).  The one known
> source of the FIN_WAIT_2 state (wherein the server's TCB is waiting for
> the client to send a FIN or RST) is due to current clients with keep-alive
> enabled -- they open multiple connections to the server and then leave
> them open, never checking to see that the connection has been closed,
> and thus never closing their side of the connection.  That is a generic
> problem with client/server applications, so the only real solution is
> to either require all clients to be good (not likely) or include a
> FIN_WAIT_2 timeout within the TCP implementation of the OS.

One other known cause is the fact that many clients (e.g. Navigator on
several platforms and MSIE) send a RST to terminate a connection instead
of a FIN.  They do this because it makes things nicer in some ways when
the client is aborting a connection.  The problem is that during a normal
close, if the RST gets lost on the way to the server then it can NOT be
retransmitted like a FIN would because the protocol doesn't allow for
that; this isn't a problem when the client aborts.  With current versions
of Navigator and MSIE on most OSes, when the server is closing a
connection it sends its FIN and gets back an ACK; the client then needs to
close the other half; if the RST it sends to do so gets lost, the server's
socket will remain in FIN_WAIT_2 until it times out or, if the server
doesn't have a timeout for that state, until the next reboot.  Needless to
say, this is bad.  When too many FIN_WAIT_2s build up on a server it tends
to crash. 
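
To make the difference between the two kinds of close concrete, here is a
rough Python sketch on the loopback interface. It is only an illustration,
not code from any browser or from Apache: the function name
close_and_observe is made up, and the struct layout "ii" for SO_LINGER
assumes a Unix-like system. Setting SO_LINGER with a zero timeout is one
common way to make close() send a RST instead of a FIN. On loopback the RST
cannot get lost, so this shows only what the server sees when the RST does
arrive, not the stuck FIN_WAIT_2 case.

```python
import socket
import struct


def close_and_observe(abortive):
    """Server half-closes first; report how the client's close looks to it."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()
    server_side.settimeout(5)
    try:
        # Server-initiated close: send FIN, keep the receive half open.
        server_side.shutdown(socket.SHUT_WR)
        client.recv(1)  # client sees end-of-stream from the server's FIN
        if abortive:
            # l_onoff=1, l_linger=0 (assumed Unix layout): close() sends RST.
            client.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                              struct.pack("ii", 1, 0))
        client.close()
        try:
            data = server_side.recv(1)
            return "FIN" if data == b"" else "data"
        except ConnectionResetError:
            return "RST"
    finally:
        server_side.close()
        listener.close()
```

Calling close_and_observe(False) exercises the orderly path, where a lost
FIN would be retransmitted by the client's TCP; close_and_observe(True)
exercises the abortive path, where a lost RST is never resent.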

This is really an implementation problem with the client.  Whatever
arguments there are for sending a RST when the client wants to abort a
connection (and the reasons appear valid), I do not see any reason to send
it for a normal (err... normal as in a normal transaction, since a large
percentage of HTTP connections are terminated by being aborted by the
client) server-initiated close.  Someone at Netscape has said they will
probably be changing this so that Navigator only sends a RST when the
client is aborting the connection, not during a server-initiated close.
I am unsure what Microsoft plans.

> 
> The other source of FIN_WAIT_2 states is still unknown, but is somehow
> connected to the way we are lingering on a half-closed connection
> in order to ensure the client has time to ACK the last response sent
> to the server, and the server has enough time to receive the ACK before
> it fully-closes the connection.  This lingering behavior exists to avoid
> the TCP buffer reset problem discussed in several notes within the
> HTTP/1.1 specification.  However, I believe that the cause of this
> problem is either in our implementation or that of the OS implementation
> of the shutdown() system call, and therefore not a problem with HTTP/1.1
> per se.

I agree, and it should be noted that this last unknown reason appears to
be the main problem, since the problem can be drastically reduced by not
using the lingering close that Roy refers to.  I can find nothing wrong
with the Apache implementation, but that doesn't mean it isn't there.  My
suspicions currently point at something in the server TCP stack which,
while it may not be a bug, is acting in a less than desirable manner. 
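
For readers who have not seen a lingering close, the general idea can be
sketched as follows. This is not the Apache code (which is written in C);
the function name lingering_close and the max_wait and bufsize parameters
are hypothetical, chosen only for illustration. The server half-closes with
shutdown() so its FIN goes out, keeps reading and discarding whatever the
client still sends, and fully closes only once the client closes its side
or a time limit expires.

```python
import socket
import time


def lingering_close(sock, max_wait=2.0, bufsize=512):
    """Half-close, drain until the peer closes or max_wait expires, then close.

    Returns the number of bytes that were read and discarded.
    """
    discarded = 0
    deadline = time.monotonic() + max_wait
    try:
        sock.shutdown(socket.SHUT_WR)  # send our FIN but keep reading
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break  # waited long enough; give up on the peer
            sock.settimeout(remaining)
            chunk = sock.recv(bufsize)
            if not chunk:
                break  # peer closed its half of the connection
            discarded += len(chunk)
    except OSError:
        pass  # timeout or reset: nothing more to wait for
    finally:
        sock.close()
    return discarded
```

While the loop is waiting, the server's socket sits in FIN_WAIT_1 or
FIN_WAIT_2, so a sketch like this only bounds how long the application
waits; once close() is called, any remaining FIN_WAIT_2 time is up to the
TCP implementation.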

Roy is correct that this is not an HTTP spec problem, but a TCP and/or
socket implementation problem.  Probably just something to keep in mind,
but I don't think anything can be done in the HTTP spec about it. 

Received on Wednesday, 19 February 1997 10:49:24 UTC