libwww bug with "HTTP 1.0 keep-alive"

Greetings.  I think I found a bug in libwww that I do not know how to
solve cleanly (code attached).  I apologize for the references to statefarm's
website, but that's where I can reproduce this bug.  What
happens is that the first URL loads fine; then, while loading the 2nd URL,
_dispatchParsers ends up calling the following code in
HTMIME_connection:

                if (HTHost_version(host) < HTTP_11) {
                    HTNet_setPersistent(net, YES, HT_TP_SINGLE);
                    HTTRACE(STREAM_TRACE, "MIMEParser.. HTTP/1.0 Keep Alive\n");
                } else 

The call to HTNet_setPersistent ends up calling HTHost_clearChannel, which
deletes the HTStream object being used in _dispatchParsers.  Hence, when
libwww comes back from the call to HTMIME_connection, it's using pointers
to freed data.  From there it all goes wrong until the application crashes.
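
To make the hazard concrete, here is a minimal, self-contained sketch of one
common guard against this kind of re-entrant deletion: defer the free while
the object is still in use further up the call stack.  None of this is libwww
code.  Stream, stream_free, connection_callback and dispatch are hypothetical
stand-ins for HTStream, the free path, HTMIME_connection and _dispatchParsers,
and I am not claiming this is the right fix for libwww.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a stream object; not a libwww type. */
typedef struct Stream {
    int in_use;          /* > 0 while a function on the call stack uses it */
    int delete_pending;  /* set when a free was requested during use */
    int *freed_flag;     /* hook for the caller: set to 1 on the real free */
} Stream;

static Stream *stream_new(int *freed_flag) {
    Stream *s = calloc(1, sizeof *s);
    if (s) s->freed_flag = freed_flag;
    return s;
}

/* Request deletion; defer it if the stream is still in use. */
static void stream_free(Stream *s) {
    if (s->in_use > 0) {
        s->delete_pending = 1;
        return;
    }
    if (s->freed_flag) *s->freed_flag = 1;
    free(s);
}

/* Stand-in for the header callback that indirectly tears down the channel. */
static void connection_callback(Stream *s) {
    stream_free(s);
}

/* Stand-in for the dispatcher, which keeps using the stream after the
 * callback returns.  Returns 1 if the stream was released on the way out. */
static int dispatch(Stream *s) {
    int released = 0;
    s->in_use++;
    connection_callback(s);
    /* The stream is still valid here because the free was deferred. */
    assert(s->delete_pending == 1);
    s->in_use--;
    if (s->in_use == 0 && s->delete_pending) {
        stream_free(s);
        released = 1;
    }
    return released;
}
```

Without the in_use counter, connection_callback would release the stream
immediately and dispatch would touch freed memory, which is the shape of the
crash described above.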

The bug only reproduces if the first URL has been loaded first.  This seems
to be because HTHost_setMode has a check:

        if (mode == HT_TP_SINGLE && host->mode > mode) {

If the 2nd URL is loaded first, or the cached HTHost object table is cleared,
then the code does not enter this if, which is the branch that triggers the
deletion.
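
The effect of that check can be modelled in isolation.  The sketch below is
not libwww code: TransportMode, Host and takes_downgrade_branch are
hypothetical names, and the numeric ordering (single mode below pipeline mode)
is an assumption inferred from the comparison in the check.  It also assumes,
without having verified it, that a fresh host starts in single mode and that
the first load leaves the cached host in a higher mode.

```c
/* Assumed ordering, inferred from "host->mode > mode" in the check. */
typedef enum { TP_SINGLE = 0, TP_PIPELINE = 1 } TransportMode;

/* Hypothetical stand-in for the cached per-host state. */
typedef struct {
    TransportMode mode;
} Host;

/* Returns 1 when a request for `requested` mode would take the
 * downgrade branch, i.e. the one that leads to the channel being cleared. */
static int takes_downgrade_branch(const Host *host, TransportMode requested) {
    return requested == TP_SINGLE && host->mode > requested;
}
```

Under those assumptions, a host that is already in single mode never takes
the branch, while a cached host left in a higher mode by an earlier request
does, which would match the observation that the order of the two URLs
matters.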

Here's the stack trace of the malignant free, in case it's of use to anyone:

HTMemory_free() c:\libwww\w3c-libwww-5.2.8\library\src\htmemory.c, 132 
HTChunk_delete() c:\libwww\w3c-libwww-5.2.8\library\src\htchunk.c, 62 
HTMIME_free() c:\libwww\w3c-libwww-5.2.8\library\src\htmime.c, 484 
HTTPStatus_free() c:\libwww\w3c-libwww-5.2.8\library\src\http.c, 874 
HTReader_free() c:\libwww\w3c-libwww-5.2.8\library\src\htreader.c, 46 
HTChannel_delete() c:\libwww\w3c-libwww-5.2.8\library\src\htchannl.c, 237 
HTHost_free() c:\libwww\w3c-libwww-5.2.8\library\src\hthost.c, 1102 
HTHost_deleteNet() c:\libwww\w3c-libwww-5.2.8\library\src\hthost.c, 1133 
HTNet_delete() c:\libwww\w3c-libwww-5.2.8\library\src\htnet.c, 910 
HTTPCleanup() c:\libwww\w3c-libwww-5.2.8\library\src\http.c, 168 
HTTPEvent() c:\libwww\w3c-libwww-5.2.8\library\src\http.c, 1249 
HTNet_execute() c:\libwww\w3c-libwww-5.2.8\library\src\htnet.c, 582 
HTHost_launchPending() c:\libwww\w3c-libwww-5.2.8\library\src\hthost.c, 1227 
HTHost_clearChannel() c:\libwww\w3c-libwww-5.2.8\library\src\hthost.c, 797 
HTHost_setMode() c:\libwww\w3c-libwww-5.2.8\library\src\hthost.c, 900 
HTHost_setPersistent() c:\libwww\w3c-libwww-5.2.8\library\src\hthost.c, 705 
HTNet_setPersistent() c:\libwww\w3c-libwww-5.2.8\library\src\htnet.c, 1071 
HTMIME_connection() c:\libwww\w3c-libwww-5.2.8\library\src\htmimimp.c, 142 
HTMIMEParseSet_dispatch() c:\libwww\w3c-libwww-5.2.8\library\src\htmimprs.c, 179
_dispatchParsers() c:\libwww\w3c-libwww-5.2.8\library\src\htmime.c, 270 
HTMIME_put_block() c:\libwww\w3c-libwww-5.2.8\library\src\htmime.c, 341 
HTTPStatus_put_block() c:\libwww\w3c-libwww-5.2.8\library\src\http.c, 851 
HTReader_read() c:\libwww\w3c-libwww-5.2.8\library\src\htreader.c, 201 
HTHost_read() c:\libwww\w3c-libwww-5.2.8\library\src\hthost.c, 1598 
HTTPEvent() c:\libwww\w3c-libwww-5.2.8\library\src\http.c, 1228 
HTLoadHTTP() c:\libwww\w3c-libwww-5.2.8\library\src\http.c, 962 
HTNet_newClient() c:\libwww\w3c-libwww-5.2.8\library\src\htnet.c, 807 
HTLoad() c:\libwww\w3c-libwww-5.2.8\library\src\htreqman.c, 1643 
launch_request() c:\libwww\w3c-libwww-5.2.8\library\src\htaccess.c, 77 
HTLoadAbsolute() c:\libwww\w3c-libwww-5.2.8\library\src\htaccess.c, 90 
HTLoadToStream() c:\libwww\w3c-libwww-5.2.8\library\src\htaccess.c, 130 
load() c:\libwww\w3c-libwww-5.2.8\library\examples\chunk.cpp, 184 
main() c:\libwww\w3c-libwww-5.2.8\library\examples\chunk.cpp, 212

Another interesting note is that the HTNet_setPersistent call from inside
HTMIME_connection gets invoked recursively - the delete seems to happen
after the inner HTNet_setPersistent has completed successfully.  I don't know
if this is correct behaviour or not, but it seems odd.

Wendell

-----------
#include <stdarg.h>
#include <stdio.h>
#include "WWWLib.h"
#include "WWWApp.h"
#include "WWWInit.h"

const char *data[] = {
   "http://www.statefarm.com/sponsors/natgeo.htm", 
   "http://www.statefarm.com/sponsors/images/natgeo.gif"
};

int printer(const char * fmt, va_list pArgs) {
    return vfprintf(stderr, fmt, pArgs);
}

int tracer(const char * fmt, va_list pArgs) {
    return vfprintf(stderr, fmt, pArgs);
}

int main() {
    HTProfile_newNoCacheClient("Test", "1.0");
    HTPrint_setCallback(printer);
    HTTrace_setCallback(tracer);
    HTSetTraceMessageMask("sop");

    for (size_t i = 0; i < sizeof(data)/sizeof(data[0]); ++i) {
        HTRequest *req = HTRequest_new();
        HTRequest_setPreemptive(req, YES);
        HTRequest_setMethod(req, METHOD_GET);
        HTRequest_setOutputFormat(req, WWW_SOURCE);
        HTRequest_addConnection(req, "close", "");

        HTLoadToChunk(data[i], req);
        HTRequest_delete(req);
        // Adding "HTHost_deleteAll();" here causes the bug not to
        // occur, but that's not a good solution.
    }
    HTProfile_delete();
    return 0;
}

Received on Friday, 21 May 1999 19:52:51 UTC