W3C home > Mailing lists > Public > www-lib@w3.org > April to June 2003

Re: HTLoadToFile crashing after load on FTP URLs, HTRequest_bytesRead inoperant

From: Timothee Besset <ttimo@idsoftware.com>
Date: Tue, 6 May 2003 02:19:38 +0200
To: www-lib@w3.org
Cc: richard@list03.atterer.net
Message-Id: <20030506021938.3e77a2d5.ttimo@idsoftware.com>

Still trying to understand and fix that bug. It is very easy to reproduce,
you just need the command line client and an anonymous ftp on which
there's a file that will take more than the default timeout to download
(20s).

I use this for instance
w3c -v 'ftp://ftp.idsoftware.com/idstuff/wolf/linux/wolf-linux-1.32-1.33-up.x86.run' -o foo.bar

Download goes fine, you see timeout expire, and when it triggers the
download is aborted. File is incomplete, says timeout reading from host,
which is not true.

I am suspecting that the problem would be related to
HTRequest_bytesRead(request) being broken also (i.e. download keeps going,
but request says nothing was read). Because I found those few lines of
code in HTFTP.c's FTPEvent function:

	/*
	** Don't time out the control connection if we are actually recieving data
	** on the data connection
	*/
	if (!(soc == HTNet_socket(cnet) && !(data->complete & 1) && HTNet_bytesRead(ctrl->dnet)>0)) {
	    HTRequest_addError(request, ERR_FATAL, NO, HTERR_TIME_OUT, NULL, 0, "HTLoadHTTP");
	    FTPCleanup(request, HT_TIMEOUT);
	}

The problem with that, is that when the control connection timeout
triggers, the data connection still says bytesRead 0 .. so instead of
hacking the timeout and keep going, it fails..

TTimo

On Mon, 5 May 2003 21:44:12 +0200
Timothee Besset <ttimo@idsoftware.com> wrote:

> 
> Below is the trace when the initial timeout reaches 0 (testing with w3c
> command line). There is something utterly screwed going on there. It tries
> to unregister a socket it can't find, does a few hickups .. then keeps
> going as if nothing. Only problem, at the end of the complete download, it
> will crash.
> 
> TTimo
> 
> FTP Event... now in state FTP_NEED_DATA
> FTP Get Data now in state NEED_BODY
> Read Socket. 1400 bytes read from socket 6
> Memory Free. 0x80ea828
> Memory Free. 0x80e21c8
> Event Loop.. calling select: maxfds is 6
> ............ READ : 4 6
> ............ WRITE:
> ............ OOB  :
> ............ Timeout is 0 s, 1000 microsecs
> Event Loop.. select returns 0
> ............ READ :
> ............ WRITE:
> ............ OOB  :
> ............ Timeout is 0 s, 0 microsecs
> Memory Free. 0x80e21e0
> Timer....... Found timer 0x80e2298 with callback 0x804f1c5, context 0x80e2150, and relative timeout 20000
> Timer....... Dispatch timer 0x80e2298
> Event....... READ timed out on 4.
> Host kill... Pipeline due to HTEvent_TIMEOUT event
> Host kill... Terminating net object 0x80d8290 from pipe line
> Channel..... Semaphore set to 0 for channel 0x80da0a0
> Memory Free. 0x80e21e0
> Timer....... Deleted active timer 0x80e2298
> Memory Free. 0x80e2298
> Event....... No more events registered for socket 4
> Memory Free. 0x80e2150
> Memory Free. 0x80e21a8
> Event....... Socket 4 unregistered for HTEvent_READ
> Event....... Couldn't find socket 4. Can't unregister type HTEvent_WRITE
> Channel..... Delete 0x80da0a0 with semaphore 0, status -902
> Channel..... Delete input stream 0x80da0e0 from channel 0x80da0a0
> Memory Free. 0x80e2190
> Memory Free. 0x80e21f0
> FTPStatus... ABORTING...
> Channel..... Delete input stream 0x80da0e0 from channel 0x80da0a0
> Memory Free. 0x80da0d0
> Socket read. FREEING....
> Memory Free. 0x80da0e0
> Socket write FREEING....
> Memory Free. 0x80e2100
> Net Manager. Decreasing active sockets to 1, 1 persistent sockets
> Channel..... Deleted 0x80da0a0, socket 4
> Memory Free. 0x80da0a0
> Net Manager. 1 active sockets, decreasing persistent sockets to 0
> Host info... removed host 0x80d8f60 as persistent
> Event Loop.. calling select: maxfds is 6
> ............ READ : 6
> ............ WRITE:
> ............ OOB  :
> ............ Timeout is 19 s, 997000 microsecs
> Event Loop.. select returns 1
> ............ READ : 6
> ............ WRITE:
> ............ OOB  :
> ............ Timeout is 19 s, 990000 microsecs
> Memory Free. 0x80ea808
> Timer....... Found timer 0x80e22e0 with callback 0x804f1c5, context 0x80ea5b0, and relative timeout 20000
> EventOrder.. execute ordered events
> Memory Free. 0x80da0d0
> EventList... calling socket 6, request 0x80d1810 handler 0x80863fc type HTEvent_READ
> Host Event.. READ passed to `ftp://<.......>'
> FTP Event... now in state FTP_NEED_DATA
> FTP Get Data now in state NEED_BODY
> Read Socket. 1400 bytes read from socket 6
> Memory Free. 0x80e2190
> 
> 
> On Mon, 5 May 2003 20:25:24 +0200
> Timothee Besset <ttimo@idsoftware.com> wrote:
> 
> > 
> > I have reproduced the crash with the w3c command line client, and found
> > out it is related to timeouts. If the file is small enough to get
> > downloaded below the default 20s timeout of w3c command line client
> > (ComLine/src/w3c), then we are fine.
> > 
> > When it runs I see the timeout go down, and not being reset to 20s each
> > time there is data read from the socket. Once it reaches 0, it gets set
> > again to 20s and stops decreasing, it looks like it's properly maintain at
> > 20s at this point. But then at the end of the download the crash situation
> > I described happens.
> > 
> > I am still looking at what is going on with the timeout thing.
> > 
> > TTimo
> > 
> > On Mon, 5 May 2003 19:53:15 +0200
> > Richard Atterer <richard@list03.atterer.net> wrote:
> > 
> > > 
> > > On Mon, May 05, 2003 at 05:10:02PM +0200, Timothee Besset wrote:
> > > > I keep calling HTRequest_bytesRead on each HT_PROG_READ alert callback,
> > > > but it remains 0. I can see the file being downloaded and growing though.
> > > 
> > > Hmm, my code uses HTResponse_length(HTRequest_response(request)). I'm not 
> > > sure whether that works though, since I'm keeping track of the length 
> > > myself regardless of what libwww says about it.
> > > 
> > > > - After the download has completed, it crashes. Here's what the end of the
> > > > trace says:
> > > [...]
> > > 
> > > It works for me! See the end of this mail for the debug output I get (with
> > > the OpenBSD ftpd and my respective FTP patch applied).  Your FTP server
> > > seems to close the connection on you, mine doesn't. Maybe that makes a
> > > difference.
> > > 
> > > [This FTP patch of mine is unlikely to affect your problem, it only
> > > influences libwww's behaviour before the FTP GET. Check the list archives
> > > if you need it.]
> > > 
> > > > HTDoClose... Close 20
> > > > Net Manager. Decreasing active sockets to 0, 0 persistent sockets
> > > > FTP Get Data now in state NEED_BODY
> > > 
> > > Are you maybe making "recursive" calls to libwww? libwww doesn't like it at
> > > all e.g. if you start/stop requests from within callbacks which you
> > > registered with it.
> > > 
> > > > Actually, I know how the crash happens, but I haven't thought of a
> > > > solution. I thought for a while that it could be related to the ftp
> > > > daemon, but I tried a win32 daemon and proftp, and they both cause the
> > > > same crash.
> > > 
> > > If you want to have an envious look at Code That Works (woohoo!), look at
> > > the things in src/net inside <http://atterer.net/jigdo/jigdo-0.7.0.tar.bz2>
> > > ;-)
> > > 
> > > [...]
> > > > status is HT_CLOSED
> > > > ctrl->repcode is 150
> > > > data->complete is 1
> > > 
> > > FTP code 150 is "File status okay; about to open data connection", looks
> > > like this is still the last status code *before* the error. 
> > > 
> > > > But the data->complete |= 2 doesn't trigger. If it would, I *think* that
> > > > HTFTP.c would finish the job and exit cleanly.
> > > 
> > > I'm not sure what the problem is - please find out whether the server is
> > > using active FTP, whether using passive FTP makes a difference, and whether
> > > the socket that's unexpectedly being closed by the remote end is for the
> > > control or the data connection.
> > > 
> > > Cheers,
> > > 
> > >   Richard
> > > 
> > > -- 
> > >   __   _
> > >   |_) /|  Richard Atterer     |  CS student at the Technische  |  GnuPG key:
> > >   | \/¯|  http://atterer.net  |  Universität München, Germany  |  0x888354F7
> > >   ¯ '` ¯
> > > ...
> > > Read Socket. 2669 bytes read from socket 7
> > > Host Event.. READ passed to `ftp://localhost/image'
> > > FTP Event... now in state FTP_NEED_DATA
> > > FTP Get Data now in state NEED_BODY
> > > Read Socket. FIN received on socket 7
> > > HTDoClose... Close 7
> > > Net Manager. Decreasing active sockets to 1, 1 persistent sockets
> > > FTP Get Data now in state SUB_SUCCESS
> > > FTP Event... now in state FTP_SUCCESS
> > > Channel..... Delete input stream 0x8247078 from channel 0x82248f0
> > > Channel..... Delete input stream 0x8247078 from channel 0x82248f0
> > > Net Object.. Delete 0x8225a70 and call AFTER filters
> > > Host info... Remove 0x8225a70 from pipe
> > > Host Object. closing socket -1
> > > Channel..... Semaphore set to 0 for channel 0x82248f0
> > > Channel..... Delete 0x82248f0 with semaphore 0, status 900
> > > Channel..... Semaphore decreased to 0 for channel 0x82248f0
> > > Host info... removed host 0x8237f98 as persistent
> > > Net Object.. Check for pending Net objects
> > > Net Object.. Freeing object 0x8225a70
> > > Net Object.. Delete 0x823ccc8 and call AFTER filters
> > > Host info... Remove 0x823ccc8 from pipe
> > > Host Object. keeping persistent socket 5
> > > Channel..... Delete 0x822c598 with semaphore 1, status 200
> > > Channel..... Delete input stream 0x823f058 from channel 0x822c598
> > > Channel..... Delete input stream 0x823f058 from channel 0x822c598
> > > Channel..... Semaphore decreased to 0 for channel 0x822c598
> > > Timer....... Created one shot timer 0x822cb10 with callback 0x809e9e8, 
> > > context 0x8224940, and relative timeout 60000
> > > Host........ Object 0x8224940 going idle...
> > > Net Object.. Check for pending Net objects
> > > Net Object.. Freeing object 0x823ccc8
> > > Net After... calling 0x809363c (request 0x8224a60, response 0x823a4c8, 
> > > status 200, context (nil))
> > > 
> > > 
> > 
> > 
> 
> 
Received on Monday, 5 May 2003 20:19:52 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50 : Monday, 23 April 2007 18:18:43 GMT