Re: Broken pipes & lost requests

> Azzurra Pantella wrote:
> 
> Hi all,
> I am working on a robot-like client application using libwww to submit
> large numbers of GET requests to the server side.
> Some time ago we came across a memory growth problem due to the lifetime
> of HTAnchor structures (once allocated they are NEVER freed until the end
> of the program). Trying to solve this problem with a periodic "garbage
> collection" of old HTAnchor objects every N submitted requests, we
> noticed that the library counter of active HTNet objects, "HTNetCount",
> could remain positive indefinitely, as if some requests had got lost.
> Consider that the HTNetCount is incremented every time an HTNet object
> is added to the NetTable (Hash table containing all the HTNet) and
> decremented when, after the reception of the response to the submitted
> request, the matching HTNet object is deleted and removed from the
> NetTable.

Then every time a request goes unanswered, the counter is never
decremented.

Register a user timer for each request you start (e.g. 30 sec.).
Remove the timer when you get the answer.
NetKill the request if the timer triggers.
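
A minimal sketch of that bookkeeping, in plain C with no libwww calls.
Every name below (Pending, request_start, request_finish, request_sweep)
is hypothetical, and active_count only stands in for HTNetCount. In a
real client the sweep would run from the timer callback and the kill
would go through the library.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PENDING 8
#define TIMEOUT_SEC 30          /* the "e.g. 30 sec." suggested above */

/* Hypothetical stand-in for a request in flight; not a libwww type. */
typedef struct {
    int  in_use;                /* slot holds an active request */
    long deadline;              /* time (seconds) at which the timer fires */
} Pending;

static Pending table[MAX_PENDING];
static int active_count = 0;    /* plays the role of HTNetCount */

/* Start a request: take a slot, arm its timer, bump the counter. */
int request_start(long now)
{
    for (int i = 0; i < MAX_PENDING; i++) {
        if (!table[i].in_use) {
            table[i].in_use = 1;
            table[i].deadline = now + TIMEOUT_SEC;
            active_count++;
            return i;
        }
    }
    return -1;                  /* table full */
}

/* Answer received or request killed: drop the timer, release the slot. */
void request_finish(int id)
{
    assert(id >= 0 && id < MAX_PENDING);
    if (table[id].in_use) {
        table[id].in_use = 0;
        active_count--;
    }
}

/* Timer dispatch: kill every request whose deadline has passed.
   Returns the number of requests killed. */
int request_sweep(long now)
{
    int killed = 0;
    for (int i = 0; i < MAX_PENDING; i++) {
        if (table[i].in_use && now >= table[i].deadline) {
            request_finish(i);
            killed++;
        }
    }
    return killed;
}

int request_active(void) { return active_count; }
```

The point of the design is that the counter is decremented in exactly
one place, so an unanswered request is still accounted for once its
timer fires.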

> We noticed that this loss of requests took place only if a SIGPIPE
> signal had previously been received while writing to the network
> (HTWriter_write() in HTWriter.c).
> That led us to suspect a BUG: in the HTTP state machine implemented in
> the HTTPEvent() function (HTTP.c module), the reception of a SIGPIPE
> after a write does NOT cause a recovery. In fact, in case of a broken
> pipe the returned value HT_CLOSED is never checked. I suggest a
> behaviour similar to that after an HTHost_read (l.1249 in HTTP.c,
> HTTPEvent() function): in case of HT_ERROR kill the pipeline, in case
> of a broken pipe (HT_CLOSED return value) try to recover the pipeline.
> 
> This is the bug-fix I'm proposing (HTTP.c line 1236):
> 
> /* Now check the status code */
> if (status == HT_WOULD_BLOCK)
>     return HT_OK;
> else if (status == HT_PAUSE || status == HT_LOADED) {
>     type = HTEvent_READ;
> } else if (status == HT_ERROR) {
>     http->state = HTTP_KILL_PIPE;
> } else if (status == HT_CLOSED)
>     http->state = HTTP_RECOVER_PIPE;
> 
> instead of:
> 
> /* Now check the status code */
> if (status == HT_WOULD_BLOCK)
>     return HT_OK;
> else if (status == HT_PAUSE || status == HT_LOADED) {
>     type = HTEvent_READ;
> } else if ( status == HT_ERROR)
>     http->state = HTTP_RECOVER_PIPE;
> 
> 
> And now 2 questions to the libwww community:
> 1) Is what I have noticed really a bug, or is there any reason not to
> recover after a SIGPIPE in the write branch?

No bug. (Or at least not here ;-)
For me there is an obvious reason not to recover when the stream has
been closed: there is nothing to recover.
Recovery is for pipelining, and the pipe is closed.
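
To make "the pipe is closed" concrete, here is a small POSIX sketch (not
libwww code) of how a closed peer shows up on the writing side. With
SIGPIPE ignored, write() fails with EPIPE instead of the signal
terminating the process. The enum and function names are made up for
this example and are not the HT_* codes.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical outcome codes for this sketch only. */
enum write_result { WRITE_OK, WRITE_CLOSED, WRITE_ERROR };

/* Write once to fd and classify the outcome.
   Partial writes are not handled in this sketch. */
enum write_result write_checked(int fd, const void *buf, size_t len)
{
    assert(buf != NULL);
    ssize_t n = write(fd, buf, len);
    if (n >= 0)
        return WRITE_OK;
    return (errno == EPIPE) ? WRITE_CLOSED : WRITE_ERROR;
}

/* Demo: write into a pipe whose read end has already been closed. */
enum write_result demo_closed_pipe(void)
{
    int fds[2];

    signal(SIGPIPE, SIG_IGN);   /* process-wide; usually done at startup */
    if (pipe(fds) != 0)
        return WRITE_ERROR;
    close(fds[0]);              /* the reader goes away */
    enum write_result r = write_checked(fds[1], "GET", 3);
    close(fds[1]);
    return r;
}
```

Once the write reports WRITE_CLOSED, nothing more can be sent or read on
that descriptor, so any outstanding request would need a new connection.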

If you start multiple requests toward the same HTTP/1.1 host, they will
be pipelined:

out> GET req_1
out> GET req_2
out> ...

in<  page_1
in<  page_2
in<  ...

If you decide page_1 is taking too long to load, you can stop it
(register your own socket in the select to signal this, because the lib
is not thread-safe, or register a timer).

When you kill the request (in fact, the loading request of the Net
structure), recovery will stop loading page_1 but will not close the
pipe, and loading of page_2 will start without the need to repeat the
remaining requests, because the server has already received them:

out> GET req_2
out> ...
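
A toy model of that behaviour, as a sketch only: the Pipeline struct and
the pipeline_* functions are hypothetical and ignore how the bytes of
the abandoned response are actually discarded. It only shows that
killing the request at the head moves on to the next one without any
new write.

```c
#include <assert.h>

#define PIPE_MAX 16

/* Hypothetical model of one pipelined connection; not a libwww type. */
typedef struct {
    int pending[PIPE_MAX];  /* ids already written to the server, oldest first */
    int head;               /* index of the request whose response is loading */
    int tail;               /* one past the last request sent */
    int writes;             /* how many requests were written to the socket */
} Pipeline;

void pipeline_init(Pipeline *p)
{
    p->head = p->tail = p->writes = 0;
}

/* Write a request to the server and remember that a response is owed. */
int pipeline_send(Pipeline *p, int id)
{
    assert(p != 0);
    if (p->tail == PIPE_MAX)
        return -1;          /* pipeline full */
    p->pending[p->tail++] = id;
    p->writes++;
    return 0;
}

/* Id of the request currently loading, or -1 if the pipeline is empty. */
int pipeline_current(const Pipeline *p)
{
    return (p->head < p->tail) ? p->pending[p->head] : -1;
}

/* Kill the request at the head. The connection stays open and nothing
   is re-sent; the next pending request becomes the one loading. */
int pipeline_kill_current(Pipeline *p)
{
    if (p->head < p->tail)
        p->head++;
    return pipeline_current(p);
}
```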


> 2) The buffer toward the network is also flushed when an HTTimer bound
> to an HTNet object is dispatched. In that case, within FlushEvent() in
> HTBufWrt.c, the return value from HTBufferWriterFlush() is NOT CHECKED
> AT ALL!

Probably checked another way.

> Even in this case, after a SIGPIPE the pipelined HTNet objects that
> were flushed away will never be recovered.
> We tried forcing an HTHost_recoverPipe() when HT_CLOSED is returned,
> but this attempt at recovery unfortunately causes segmentation
> violations (!!!!!) and is not applicable.
> Is there some other way to recover the pipe in this case? Don't you
> think this FlushEvent() function needs to be revised anyway (some sort
> of return value check)?
> 
> HELP! Has anybody encountered similar problems?
> Thanks !
>                    Azzurra
> 

Do not modify the lib. Check the patches posted on this mailing list AND
check whether others agree with them.

Do not call HTAnchor_delete; it has never been completed.

The last bug I found that causes a crash:

Do not call HTNET_setRawByteCount(YES), because of a bug
(I believe, but lack the time to check... more later)
in the HTMIME.c file.

> Azzurra Pantella
> Mobile Applications Area
> NETikos S.p.A.
> Via Matteucci 34/B
> 56124 PISA
> Tel.: + 39 050 968526
> Fax: + 39 050 968525
> e-mail:  azzurra.pantella@netikos.com
> Internet: www.netikos.com

Received on Tuesday, 18 December 2001 00:25:59 UTC