Possible libwww-5.1m bug in HTHost_free or HTChannel_new

I will preface this message with the fact that I have only been using
this library for about 2 weeks, but I spent 3 days tracking down
this weird problem.

I have a program that periodically connects to a webserver using the
HTLoadToChunk function. The problem was that after 25 calls to this
function it would no longer return any content. It would return a chunk,
but the chunk would be empty. After spending a lot of time in the source
code I realized that it was hitting some sort of limit on the number of
open connections. The trace information said that it was closing the
sockets, but every time it tried a new request the file descriptor would
be one number higher. I then noticed that after 25 HTLoadToChunk calls the
requests were being put on some sort of pending queue.

The weird part was that this problem only occurred when I was trying to
connect to one specific webserver we have here. I made a modified version
of chunk.c that repeatedly made requests to a URL. For most websites it
seemed to work fine, but the webserver that I actually needed to connect
to didn't work right. I discovered that with other webservers the library
was establishing a persistent connection and reusing that connection for
repeated requests. The webserver that I needed to get information from was
apparently not allowing persistent connections.

This led me to a long hunt through the code to find the differences
between these 2 situations. What I found was a case in HTHost_free where
the channel is cleared, but the channel is not actually closed.
HTHost_free looks like this:

PUBLIC BOOL HTHost_free (HTHost * host, int status)
{
    if (host->channel) {

        /* Check if we should keep the socket open */
        if (HTHost_isPersistent(host)) {
               // Bunch of stuff for Persistent connections
        } else {
            if (CORE_TRACE) HTTrace("Host Object. closing socket %d\n",
                                    HTChannel_socket(host->channel));

            HTHost_clearChannel(host, status);
        }
    }
    return NO;
}

The problem that I think I found is that for a non-persistent channel
HTHost_free just calls HTHost_clearChannel. However, the semaphore member
of the channel struct is set to 1 when the channel is created, and
HTHost_clearChannel will only delete the channel if the semaphore is set
to 0.

Now here is where my understanding of the big picture of the library
gets a little fuzzy. Why is the semaphore set to 1 when the channel is
created? If you called HTChannel_new and then immediately called
HTChannel_delete, the channel would not get deleted. That seemed a little
weird to me.

This is basically what happens when a connection is not persistent. A
channel is created and then it is closed. The problem is that the channel
is not deleted, and therefore the library thinks that the channel is still
open. Since the connection was not persistent, the HTHost struct clears
its references to the channel, but the channel sticks around. To fix this
problem I changed HTHost_free to look like this:

PUBLIC BOOL HTHost_free (HTHost * host, int status)
{
    if (host->channel) {

        /* Check if we should keep the socket open */
        if (HTHost_isPersistent(host)) {
               // Bunch of stuff for Persistent connections
        } else {
            if (CORE_TRACE) HTTrace("Host Object. closing socket %d\n",
                                    HTChannel_socket(host->channel));

            /* This line forces the semaphore to be set to 0 */
            HTChannel_setSemaphore(host->channel, 0);

            HTHost_clearChannel(host, status);
        }
    }
    return NO;
}
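
To illustrate the leak described above, here is a toy model in plain C.
It is only a sketch and does not use libwww at all: ToyChannel,
toy_channel_new, toy_channel_delete and simulate_requests are made-up
names, and the "delete only when the semaphore is 0" rule is an assumption
taken from the description in this message, not from the library source.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a channel; NOT the libwww HTChannel struct. */
typedef struct {
    int sockfd;
    int semaphore;
} ToyChannel;

/* Number of toy channels allocated and not yet freed. */
static int live_channels = 0;

/* Assumed behaviour: a new channel starts with its semaphore at 1. */
static ToyChannel *toy_channel_new(int sockfd)
{
    ToyChannel *ch = malloc(sizeof *ch);
    assert(ch != NULL);
    ch->sockfd = sockfd;
    ch->semaphore = 1;
    live_channels++;
    return ch;
}

/* Assumed behaviour: free only when the semaphore is 0.
 * Returns 1 if the channel was freed, 0 if it was left alone. */
static int toy_channel_delete(ToyChannel *ch)
{
    if (ch->semaphore > 0)
        return 0;
    free(ch);
    live_channels--;
    return 1;
}

/* Simulate n non-persistent requests and return how many channels are
 * still alive afterwards. When force_semaphore_to_zero is 0 the channels
 * are deliberately leaked, which is the point of the demonstration. */
int simulate_requests(int n, int force_semaphore_to_zero)
{
    int i;
    live_channels = 0;
    for (i = 0; i < n; i++) {
        ToyChannel *ch = toy_channel_new(i);
        if (force_semaphore_to_zero)
            ch->semaphore = 0;      /* the workaround */
        toy_channel_delete(ch);     /* caller drops its reference either way */
    }
    return live_channels;
}
```

In this model simulate_requests(25, 0) leaves 25 channels alive, one per
request, while simulate_requests(25, 1) leaves none.
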

The added HTChannel_setSemaphore call forces the semaphore to 0 and allows
HTChannel_delete to actually delete the channel. I did it this way because
I wasn't sure if there was some deep dark secret reason for setting the
semaphore to 1 when the channel is created in HTChannel_new. I also did it
this way because the persistent part of HTHost_free contains these lines:
                /*
                **  By lowering the semaphore we make sure that the channel
                **  is gonna be deleted
                */
                HTChannel_setSemaphore(host->channel, 0);
                HTHost_clearChannel(host, status);

I figured that this was a good place to find a solution to my
problem, and it seems to do what I need. Once I made this change and
recompiled the whole library everything started working beautifully.
Since I don't have a completely informed view of how everything works
together, I don't know how this impacts other parts of the library, but it
fixed the problem that I was having. If this is wrong, please look into
the problem and send me the 'right' solution. I'm sorry about the
long-winded explanation, but I think it was necessary to describe my
thought process and reasons for doing what I did. Thanks for your help,
and I'd appreciate any comments.

Thanks again,
Aaron Colwell

Received on Thursday, 25 June 1998 12:18:45 UTC