Found a bug?

There seems to be a bug in the TCP state engine.  Correct me if I am wrong.
I do not have a very good fix, because my fix still leaks a counted socket
until shutdown.

The scenario is:
0)	Try to connect to a port which is not being listened to.
1)	An error occurs and you get to TCP_ERROR.  There is a section which
checks for "if(HTHost_isPersistent(host))".  If the host is persistent, the
HTHost is then set to non-persistent.
2)	The state is placed into TCP_NEED_SOCKET.
3)	TCP_NEED_SOCKET passes control to TCP_NEED_CONNECT.
4)	TCP_NEED_CONNECT fails(?) with HT_WOULD_BLOCK.
5)	Control is eventually passed back to the HTTP state engine, which
returns HT_OK.  (This more or less ignores the error.  I don't think that
is very important, but it could be.)
6)	Eventually control returns to the event loop, which then issues a
request to check all other sockets, apparently with no timeout.

When a host is switched from persistent to non-persistent,
HTHost_clearChannel is called.  This deregisters the sockets and deletes
the channel, but HTHost_unregister is never called with HTEvent_CONNECT.
As a result, both a persistent socket count and a socket count are leaked,
and the event loop blocks indefinitely when getting events from the
available sockets.
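
The accounting described above can be sketched with a toy model.  This is
not libwww code: every type and function below (ToyHost, ToyCounters,
toy_begin_connect and so on) is a hypothetical stand-in, and the model
assumes the behaviour is exactly as reported in this post.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical stand-ins, not the libwww implementation. */
typedef struct {
    bool persistent;          /* host currently flagged persistent */
    bool has_channel;         /* host still owns a channel */
    bool connect_registered;  /* a CONNECT event is still registered */
} ToyHost;

typedef struct {
    int sockets;              /* counted sockets in use */
    int persistent_sockets;   /* counted persistent sockets */
} ToyCounters;

/* Start a connect on a persistent host: both counters go up and a
   CONNECT event is registered. */
static void toy_begin_connect(ToyHost *h, ToyCounters *c)
{
    h->persistent = true;
    h->has_channel = true;
    h->connect_registered = true;
    c->sockets++;
    c->persistent_sockets++;
}

/* Models the reported behaviour when the host is made non-persistent:
   the channel goes away, but the CONNECT registration and both
   counters are left untouched. */
static void toy_clear_channel(ToyHost *h)
{
    h->persistent = false;
    h->has_channel = false;
}

/* Models the missing step: dropping the CONNECT registration
   (assumed here to release the counted socket). */
static void toy_unregister_connect(ToyHost *h, ToyCounters *c)
{
    assert(h->connect_registered);
    h->connect_registered = false;
    c->sockets--;
}

/* Models giving back the persistent socket count. */
static void toy_decrease_persistent(ToyCounters *c)
{
    assert(c->persistent_sockets > 0);
    c->persistent_sockets--;
}
```

In this model, calling toy_begin_connect followed by toy_clear_channel
leaves both counters at 1 with the CONNECT event still registered, which
is the leak.  Only the two extra calls bring the counters back to zero.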

To reproduce this, create a request to a host and port on which nothing is
listening (as in step 0), set the host to persistent, and then issue the
request.  It's pretty simple.

My very bad fix, made because I could not change the library at that time,
was to use the alert callback.  In my alert callback I check for three
things: the client is one I know to be persistent, the library says it is
no longer persistent, and the alert is HT_PROG_CONNECT.  This combination
happens in only one place: the TCP_NEED_SOCKET state of the HTTCP state
engine, after a failed TCP_NEED_CONNECT.  I then call HTHost_unregister
with HTEvent_CONNECT and HTNet_decreasePersistentSocket (so I don't leak
the number of persistent sockets).
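
The decision made inside that alert callback can be sketched as follows.
This is a minimal sketch, not the real callback: ToyAlertOp, ToyClient and
the toy_ functions are hypothetical stand-ins so that the logic can be
compiled on its own, and the "cleanup" is only counted here.  Clearing the
application's own flag after cleanup is an assumption added to avoid
running the cleanup twice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the library's alert opcode; only the one
   value the workaround cares about is modelled. */
typedef enum { TOY_PROG_OTHER, TOY_PROG_CONNECT } ToyAlertOp;

typedef struct {
    bool app_expects_persistent; /* what the application configured */
    bool lib_says_persistent;    /* what the library reports now */
    int  cleanups;               /* how many times cleanup has run */
} ToyClient;

/* True only for the combination described in the post: a connect
   progress alert for a client the application made persistent but the
   library no longer considers persistent. */
static bool toy_needs_cleanup(const ToyClient *c, ToyAlertOp op)
{
    assert(c != NULL);
    return op == TOY_PROG_CONNECT
        && c->app_expects_persistent
        && !c->lib_says_persistent;
}

/* Alert hook.  In real code the cleanup step would be the two library
   calls named in the post; here it is just counted.  Returns true if
   cleanup ran. */
static bool toy_on_alert(ToyClient *c, ToyAlertOp op)
{
    if (!toy_needs_cleanup(c, op))
        return false;
    c->cleanups++;
    /* Assumption: remember that cleanup ran so a later alert for the
       same client does not repeat it. */
    c->app_expects_persistent = false;
    return true;
}
```

A client that is still persistent according to the library, or an alert
other than the connect progress alert, leaves the counters alone.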

Does anyone else have a fix for this?  I would rather use a fix in the
library itself, because I don't think mine is entirely right.  I can look
at it further, but if anyone else has already fixed this then I could use
theirs.

Gary F. Desrochers

Received on Monday, 7 February 2000 09:07:56 UTC