Re: Timeouts on Multiple Requests


I am forwarding this to the <www-lib@w3.org> mailing list as it is of general
interest.

Richard.Keeble@brunel.ac.uk writes:
> Hi there.  I am a PhD student at Brunel University, and I'm currently
> developing some software using the Library of Common Code to retrieve HTML
> documents.

> I would like my application to be able to have a number of HTTP GET requests
> operating concurrently, with timeouts set on each in order to keep track of
> any failures, etc.
> As you already know, the select()-based timeout mechanism does not fully
> support multiple timed-out requests.  The central issue is that the library
> provides only one timeout, not one per request as would be needed for my
> application.

Yes, the current event loop is designed for a typical GUI application, where
it is important to have a regular callback to the application in order to
update any progress information or to keep something spinning, jumping, etc.,
_and_ to have the application respond to user input as quickly as possible.
This is why select() is called very often in the current implementation.

Currently you get the notification but do not get any information about
which request object the timeout belongs to. In the case of a GUI
application, the user will often stop the request if it takes too long.

However, in your application (and in most other robots and servers) you must
know which request timed out so that you can kill it automatically. You are
also interested in minimizing the number of system calls, so you need to use
the results returned from select() in a better way. One way to do this is to
go through _all_ ready socket descriptors and handle them before select() is
called again. You also need an additional layer on top of the select() call
in order to keep track of which descriptors have timed out and should be
killed.
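
As a rough illustration of handling every ready descriptor per select() call,
here is a minimal sketch. It is not the library's event loop: the names
dispatch_all_ready, ready_handler and count_ready are hypothetical, and only
the read set is watched.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical handler type (not a libwww type): called once per ready fd. */
typedef void ready_handler(int fd, void *ctx);

/* Example handler: counts how many descriptors were dispatched. */
void count_ready(int fd, void *ctx)
{
    (void)fd;
    ++*(int *)ctx;
}

/* Call select() once and dispatch every readable descriptor in [0, maxfd]
   before returning. Returns the number of descriptors handled, 0 on
   timeout, or -1 on error. */
int dispatch_all_ready(const fd_set *watched, int maxfd,
                       struct timeval *timeout,
                       ready_handler *handler, void *ctx)
{
    fd_set readable = *watched;   /* select() overwrites the set it is given */
    int handled = 0;
    int n = select(maxfd + 1, &readable, NULL, NULL, timeout);

    if (n <= 0)
        return n;

    /* Walk the whole result set instead of stopping at the first hit. */
    for (int fd = 0; fd <= maxfd && handled < n; fd++) {
        if (FD_ISSET(fd, &readable)) {
            handler(fd, ctx);
            handled++;
        }
    }
    return handled;
}
```

The point of the design is that one system call serves all descriptors that
became ready in the meantime, rather than one select() per handled socket.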

I am actually implementing an alternative event loop that can be used in the
W3C mini robot and in the W3C mini server, as I need it to make them fully
functional applications.

> I would like to suggest a possible solution to this problem, and to go
> ahead and implement a patch for it (assuming nobody's done this yet).
> I'd appreciate some feedback on this, if possible.

> Proposed patch to allow timeout processing for multiple requests
> ================================================================
> - Maintain a list of timeout (seltime) structures, one per timed request.

Sockets are registered and unregistered on a regular basis when running the 
event loop. When a socket is not blocking it is not registered. Whether you 
want to maintain the seltime request between two registrations of a socket 
depends on whether you want to set a total max time for handling a request.

I am not so sure that this is a big problem as long as the connection is
active. However, if a socket is idle for a long time then it should be dropped
and the request should be killed. Therefore, what I suggest is to associate
either a time entry or a simple integer counter (in case you want to base it
on the timer already running in the select call) with the event structure
(called ACTION) binding the socket to the request object.

When walking through the total set of socket descriptors, if a socket is not
ready, then either increment its counter or compare its time entry with the
current time, and decide from that whether to kill the request.
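
A minimal sketch of the counter variant follows. The action structure here is
a hypothetical stand-in for the event structure, not its real layout, and
update_idle and max_idle are names made up for the example.

```c
#include <stddef.h>
#include <sys/select.h>

/* Hypothetical stand-in for the structure binding a socket to a request. */
typedef struct {
    int   fd;          /* registered socket descriptor */
    void *request;     /* request object bound to the socket */
    int   idle_ticks;  /* consecutive select() rounds without activity */
} action;

/* After select() returns, walk all registered actions: reset the counter
   of each ready socket, increment it for the others, and collect those
   that have been idle for more than max_idle rounds.
   Returns the number of entries written to expired[], which must have
   room for n pointers. */
size_t update_idle(action *acts, size_t n, fd_set *ready,
                   int max_idle, action **expired)
{
    size_t n_expired = 0;

    for (size_t i = 0; i < n; i++) {
        if (FD_ISSET(acts[i].fd, ready)) {
            acts[i].idle_ticks = 0;            /* activity: start over */
        } else if (++acts[i].idle_ticks > max_idle) {
            expired[n_expired++] = &acts[i];   /* candidate for killing */
        }
    }
    return n_expired;
}
```

The caller would then kill the request behind each expired entry. Because the
counter lives in the entry itself, no separate list of timeouts has to be
kept in step with the requests.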

> - Order these according to length of timeout.
> - Form associations between HTRequest objects and seltime structures.
> To register a timeout
> ---------------------
> Clear any pending timeout for the request.
> Create a seltime object with pointer to associated HTRequest object.
> Traverse the current timeout list, calculating time-to-wait incrementally,
> finally inserting in correct position.
> In the HTEvent_Loop select()
> ----------------------------
> Call select() using the timeval structure in the first seltime object (to
> keep track of elapsed time), if any.
> When a timeout occurs
> ---------------------
> Call the callback associated with the first seltime object.
> Remove and destroy the seltime object.
> When a request is deleted
> -------------------------
> Check the list of seltime objects for any that refer to the request to be
> deleted.  If found, adjust succeeding timeout value accordingly, and remove
> and delete the seltime object.
> Admission
> =========
> It would be more elegant to associate the seltime object with the
> HTRequest's HTNet object, but since the HTNet object is not created until
> the request is started, some pretty nasty callback trickery would be
> required to keep track of things.  I also want to try to keep any changes as
> localised as possible, so I don't have to think too hard!
> Modifications required (at a guess)
> ===================================
> Two modules would need to be modified - HTEventrg.c and HTReqMan.c.
> - To HTEventrg.c:
>  o Modify seltime-handling, using a linked list of requests.
>  o Modify HTEvent_Loop to use a timeval in the seltime list.
>  o Modify HTEvent_registerTimeout to perform allocation and chaining, etc.
>  o Add a function to delete any seltime objects referring to a
>    given request.
> - To HTReqMan.c:
>  o Modify HTRequest_delete to call the seltime cleanup function in
>    HTEventrg.c.  (This is the kludgey bit)

You don't have to do this if you have the time object as part of the internal 
event structure.

> Also, HTEventrg.h[tml] would need to have a prototype for the timeout cleanup
> function added.
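
For reference, the ordered timeout list proposed above could be sketched
roughly as follows. This is only an illustration under assumptions: the
layout of seltime, the millisecond unit, and the names timeout_register and
timeout_clear are all hypothetical, and the caller is assumed to subtract
elapsed time from the first entry before each call.

```c
#include <stdlib.h>

/* Hypothetical node: delta_ms is the wait relative to the previous node,
   so the first node holds the value to hand to select(). */
typedef struct seltime {
    long            delta_ms;
    void           *request;
    struct seltime *next;
} seltime;

/* Remove any timeout for `request`, adding its delta to the node behind
   it so that later timeouts still fire at the same moment. */
void timeout_clear(seltime **head, void *request)
{
    for (seltime **pp = head; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->request == request) {
            seltime *dead = *pp;
            if (dead->next != NULL)
                dead->next->delta_ms += dead->delta_ms;
            *pp = dead->next;
            free(dead);
            return;
        }
    }
}

/* Register a timeout `ms` milliseconds from now, replacing any pending
   one for the same request. Returns 0, or -1 if allocation fails. */
int timeout_register(seltime **head, void *request, long ms)
{
    seltime *node = malloc(sizeof *node);
    if (node == NULL)
        return -1;

    timeout_clear(head, request);

    /* Walk the list, consuming the deltas of all earlier timeouts. */
    seltime **pp = head;
    while (*pp != NULL && (*pp)->delta_ms <= ms) {
        ms -= (*pp)->delta_ms;
        pp = &(*pp)->next;
    }

    node->delta_ms = ms;
    node->request = request;
    node->next = *pp;
    if (node->next != NULL)
        node->next->delta_ms -= ms;   /* successor now waits relative to us */
    *pp = node;
    return 0;
}
```

Registering 100, 250 and then 120 ms leaves the deltas 100, 20 and 130 in
that order, so only the first entry ever needs to be passed to select().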

You have many good ideas about this - it looks like a good plan! Please keep
me informed on how it works out!


> PS : Should I send this to the mailing list?  I checked to see if there were
>      any relevant postings but didn't find any.

It's always a good idea to send general questions to the <www-lib@w3.org>
mailing list, simply because other people may have thought about the same
problem and because I may come up with some nonsense :-)


Henrik Frystyk Nielsen, <frystyk@w3.org>
World-Wide Web Consortium, MIT/LCS NE43-356
545 Technology Square, Cambridge MA 02139, USA