Sporadic core dumps (dangling timer??)

Hi!

I'm experiencing a mysterious problem with a relatively simple application that uses libwww 5.4.0 under Digital Unix. The application sends HTTP POST requests sequentially to a fixed host (running an Apache server) on a LAN.

From time to time, I get core dumps that all show the same symptoms: the backtrace seems to indicate that the libwww code calls through a callback function pointer that actually points to data, which results in an illegal instruction error.

The error happens in HTEvtLst.c, when

PRIVATE int EventListTimerHandler (HTTimer * timer, void * param, HTEventType type)

is executed (with type = HTEvent_TIMEOUT) and calls a callback function in line 229:

    ...
    if (sockp->timeouts[HTEvent_INDEX(HTEvent_OOB)] == timer) {
        event = sockp->events[HTEvent_INDEX(HTEvent_OOB)];
        HTTRACE(THD_TRACE, "Event....... OOB timed out on %d.\n" _ sockp->s);
        // line 229:
        return (*event->cbf) (sockp->s, event->param, HTEvent_TIMEOUT);
        // =======
    }
    ...

However, the callback pointer (event->cbf) actually points to some data object, so the call ends in a core dump. Probably the event data (at least) had already been freed and its memory re-allocated to some other object.
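To make the suspected failure mode concrete, here is a minimal sketch of a dangling timer and one possible guard against it. This is not libwww code: the types Event, SockEvents and Timer and the functions timer_arm, event_unregister, timer_fire and count_timeouts are all hypothetical stand-ins that I made up for illustration, and I am only assuming that the real problem has this shape.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration only; NOT the real libwww types. */
typedef int (*event_cb)(int sock, void *param, int type);

typedef struct {
    event_cb cbf;    /* callback invoked when the event times out */
    void *param;
} Event;

typedef struct Timer Timer;

typedef struct {
    int s;           /* socket descriptor */
    Event *event;    /* event registered for this socket, or NULL */
    Timer *timeout;  /* timer armed for that event, or NULL */
} SockEvents;

struct Timer {
    SockEvents *owner;  /* NULL once the timer has been disarmed */
};

/* Arm a timer for the socket's event. The caller (standing in for the
 * timer queue) owns the returned Timer and frees it after it has fired. */
Timer *timer_arm(SockEvents *sockp) {
    assert(sockp != NULL);
    Timer *t = malloc(sizeof *t);
    if (t == NULL) return NULL;
    t->owner = sockp;
    sockp->timeout = t;
    return t;
}

/* Unregister the event. The timer is disarmed BEFORE the event is freed,
 * so a later expiry cannot reach the freed (and possibly reused) memory. */
void event_unregister(SockEvents *sockp) {
    if (sockp->timeout != NULL) {
        sockp->timeout->owner = NULL;
        sockp->timeout = NULL;
    }
    free(sockp->event);
    sockp->event = NULL;
}

/* Timer expiry handler. Returns -1 without calling anything if the timer
 * is stale; otherwise returns the callback's result. */
int timer_fire(Timer *t, int type) {
    SockEvents *sockp = t->owner;
    if (sockp == NULL || sockp->timeout != t || sockp->event == NULL)
        return -1;
    return sockp->event->cbf(sockp->s, sockp->event->param, type);
}

/* Example callback: counts timeouts in the int that param points to. */
int count_timeouts(int sock, void *param, int type) {
    (void)sock;
    (void)type;
    ++*(int *)param;
    return 0;
}
```

Without the disarm step in event_unregister, a timer that expires after the event has been freed would call through whatever now occupies that memory, which matches the symptoms I see.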

Also, when inspecting the "timer" argument to EventListTimerHandler, it turns out that it is in fact an HTHost object, not an HTTimer object.

This HTHost object might even offer some interesting clues: its timestamp suggests it was created right before the core dump, it is in state TCP_NEED_CONNECT, reqsMade = 1 and registeredFor = 16 (HTEvent_CONNECT).

The machine this is happening on is known to experience network problems from time to time and is operating under a heavy CPU load. This behavior could not be reproduced on other machines (with lower CPU usage and without network problems).

Can anybody offer some clues as to what condition may cause this behavior (and how to fix it)?

Thanks in advance,

Heinz-Bernd Eggenstein

 

Received on Monday, 3 February 2003 11:19:56 UTC