- From: Richard Atterer <richard@list03.atterer.net>
- Date: Wed, 12 Feb 2003 18:34:29 +0100
- To: www-lib@w3.org
Hello,

sometimes this mailing list is quite depressing to read, because there are always so many people with problems, and so few people with solutions. So I thought that since ATM I'm making some progress with my own use of libwww, I should share it to save others some work.

So: If an HTTP or FTP download is running, how do you pause/stop/resume it?

The bad news first: AFAICT libwww doesn't support FTP resumes (REST command) at all. :-( Maybe I'll come up with a patch.


Pausing a download while leaving the connection open

libwww doesn't actually make provisions for this, but it's relatively easy to do (read: took me all of two days ;-/ ). You prevent the socket of the download from being polled by select(), which means that the connection stays open, but no more data is read from it (once the receive buffers are full, TCP flow control stalls the sender). This is nice because continuing is faster and because some servers (and some proxies like WWWOFFLE) don't support HTTP ranges, so resuming the download in a new connection isn't possible.

For me, the following works with the glibwww event loop:

    /* The HTNet object whose socket we'll unregister from the event
       loop. This will prevent more data from being delivered to it,
       effectively pausing the request. */
    HTNet* net = HTRequest_net(request);
    unsigned protocol = HTProtocol_id(HTNet_protocol(net));
    if (protocol == 21) {
      /* Protocol is FTP, which uses a control connection (which
         corresponds to the main HTNet object) and a data connection.
         We need the HTNet object for the latter. */
      ftp_ctrl* ctrl = static_cast<ftp_ctrl*>(HTNet_context(net));
      net = ctrl->dnet;
    }
    // Unregister socket
    HTEvent_setTimeout(HTNet_event(net), -1); // No timeout for the socket
    SOCKET socket = HTNet_socket(net);
    HTEvent_unregister(socket, HTEvent_READ);

Unfortunately, this is a bit ugly: To make the FTP stuff work, you need to copy the definitions of enum _HTFTPState and struct _ftp_ctrl from HTFTP.c to your application's code.
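
To see the underlying mechanism in isolation, here is a minimal sketch that does not use libwww at all. It assumes a POSIX system and uses a pipe in place of the download socket. ToyLoop, registerRead, unregisterRead, pollOnce and demoPauseResume are made-up names for illustration only, not libwww or glibwww API. The point is just that a descriptor left out of the select() read set is never reported as readable, even though data is waiting on it, and that the data is still there once the descriptor is registered again.

```cpp
#include <sys/select.h>
#include <unistd.h>
#include <cassert>
#include <set>

// Toy event loop (hypothetical): only descriptors in `watched` are ever
// handed to select(), mimicking register/unregister in a real event loop.
struct ToyLoop {
    std::set<int> watched;

    void registerRead(int fd) {
        assert(fd >= 0 && fd < FD_SETSIZE);  // select() cannot handle larger fds
        watched.insert(fd);
    }
    void unregisterRead(int fd) { watched.erase(fd); }

    // Poll once without blocking; true if `fd` was reported as readable.
    bool pollOnce(int fd) const {
        fd_set readSet;
        FD_ZERO(&readSet);
        int maxFd = -1;
        for (int w : watched) {
            FD_SET(w, &readSet);
            if (w > maxFd) maxFd = w;
        }
        timeval noWait = {0, 0};
        int ready = select(maxFd + 1, &readSet, nullptr, nullptr, &noWait);
        return ready > 0 && FD_ISSET(fd, &readSet);
    }
};

// Returns true if pausing behaves as described: pending data is not
// reported while the descriptor is unregistered, and is reported again
// after re-registering it.
bool demoPauseResume() {
    int fds[2];
    if (pipe(fds) != 0) return false;
    ToyLoop loop;
    loop.registerRead(fds[0]);
    bool wrote = (write(fds[1], "x", 1) == 1);   // one byte is now pending

    bool seenBeforePause = loop.pollOnce(fds[0]);
    loop.unregisterRead(fds[0]);                 // "pause"
    bool seenWhilePaused = loop.pollOnce(fds[0]);
    loop.registerRead(fds[0]);                   // "continue"
    bool seenAfterResume = loop.pollOnce(fds[0]);

    close(fds[0]);
    close(fds[1]);
    return wrote && seenBeforePause && !seenWhilePaused && seenAfterResume;
}
```

Calling demoPauseResume() from a small main() should return true on a POSIX system. With a real TCP socket the effect on the sender is indirect: nothing is read locally, so the receive buffer fills up and the peer is throttled.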


Continuing a download whose connection is still open

Analogous to the above, except we register the socket:

    // Register socket again
    /* For some weird reason the timeout gets reset to 0 somewhere,
       which causes *immediate* timeouts with glibwww - fix that. */
    HTEvent* event = HTNet_event(net);
    HTEvent_setTimeout(event, HTHost_eventTimeout());
    HTEvent_register(HTNet_socket(net), HTEvent_READ, event);

Obviously, all requests in an HTTP pipeline get paused with this, not just the current one. In particular, the paused request may only be pending; in this case, the transmission of an earlier request in the same pipeline may actually get paused - not what we want. I partially solve this by pausing only at the moment my request actually receives data via the write() method of the HTStream object I registered as the request's output stream.

Immediately after continuing, the connection may be dropped, e.g. if the user paused, disconnected his modem, dialed in again, and then told the app to continue. Consequently, when an app uses this "soft pause", it should also be able to do a full resume with a new connection.


Aborting a download

There are a few minor pitfalls here. HTHost_killPipe closes all sockets for that host instead of just the one of the connection, so use

    HTNet_killPipe(HTRequest_net(request))

If your app crashes when you call this, it is because you're calling it from within the write() method of your stream - libwww doesn't like it if you delete the request object etc. from "right underneath its feet" while it's still processing the data that has just arrived on the request's socket. Instead, wait until the main event loop is reached again, and *then* kill the pipe. With glibwww, this is easy to do because you can use g(tk)_idle_add() to register a function which will be called back once the main loop becomes idle.
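
The "don't kill from inside write(), defer it to the main loop" pattern can be sketched without glib or libwww as follows. Everything here is a stand-in invented for illustration: ToyMainLoop::idleAdd plays the role of g_idle_add(), ToyDownload::write the role of the stream's write() method, and the "kill" log entry marks the place where a real app would call HTNet_killPipe(HTRequest_net(request)).

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy main loop with idle callbacks (hypothetical, not glib).
class ToyMainLoop {
public:
    void idleAdd(std::function<void()> task) { idle_.push_back(std::move(task)); }

    // Runs all callbacks queued so far. A real loop would do this only
    // between event dispatches, i.e. never while write() is on the stack.
    void runIdle() {
        std::vector<std::function<void()>> tasks;
        tasks.swap(idle_);
        for (auto& task : tasks) task();
    }

private:
    std::vector<std::function<void()>> idle_;
};

// Hypothetical download object; `log` records the order of events.
struct ToyDownload {
    ToyMainLoop& loop;
    std::vector<std::string>& log;
    bool abortScheduled;

    ToyDownload(ToyMainLoop& l, std::vector<std::string>& lg)
        : loop(l), log(lg), abortScheduled(false) {}

    // Called by the "library" while it is still processing socket data.
    void write(const std::string& data) {
        log.push_back("write:" + data);
        if (data == "bad" && !abortScheduled) {
            abortScheduled = true;
            // Do NOT tear anything down here; only schedule the teardown.
            // (Assumption: a real app would kill the pipe in this callback.)
            loop.idleAdd([this] { log.push_back("kill"); });
        }
        log.push_back("write-returned");
    }
};

// Drives one write() followed by one idle pass and returns the event log.
std::vector<std::string> demoDeferredAbort() {
    ToyMainLoop loop;
    std::vector<std::string> log;
    ToyDownload download(loop, log);
    download.write("bad");  // decides to abort, but only schedules it
    loop.runIdle();         // back in the main loop: teardown happens now
    return log;
}
```

The log returned by demoDeferredAbort() has "kill" as its last entry, after "write-returned", which is the ordering the paragraph above asks for: the teardown runs only once the library is no longer in the middle of delivering data.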

Obviously, this kills all requests in the pipeline, so you should re-schedule all the ones which you do want, or resume them if they were already downloaded in part. (AFAICT, the HTTP 1.1 standard doesn't allow you to selectively cancel just some pending or active requests, so the only thing libwww can do is to close the connection.)


How do I tell whether the download has succeeded/failed?

    HTAlert_setInteractive(YES);
    HTAlert_add(myAlertCallback, HT_A_PROGRESS);

and pay attention to the HTAlertOpcode passed to myAlertCallback().


How do I distinguish between the connection being dropped due to an error and the end of the transmission?

With FTP, AFAIK you can't tell, unless you scan the directory listing first to find the file size. (Haven't explored how difficult this would be.)

With HTTP downloads, the server will /usually/ have sent a Content-Length header, so you can check whether the promised number of bytes has already been received. Do *not* use

    HTAnchor_length(HTRequest_anchor(request))

to read the number of bytes, because for some reason this is not set up correctly for "206 Partial Content" responses. Instead, use

    HTResponse_length(HTRequest_response(request))

which, for a 206, returns the number of bytes in the partial request, i.e. total length - requested start offset.


Resuming a download starting with a certain byte offset

Requires HTTP ranges (i.e. HTTP 1.1). Before starting the download with HTLoad(request, NO), use

    HTRequest_addRange(request, "bytes", "333-999")

to fetch bytes 333-999 (both inclusive, starting from 0). You can also use just "333-" to fetch from byte 333 onwards.

Beware that a non-HTTP 1.1 aware server may just ignore the range request and send the data starting with offset 0 - to detect this case, you need to check whether the server sent a Content-Range header like "Content-Range: bytes 303104-17242732/17242733". The header is present if

    HTResponse_range(HTRequest_response(request))

returns non-null. This only works for HTTP, i.e. if

    HTProtocol_id(HTNet_protocol(HTRequest_net(request))) == 80


Further problem with FTP: "REIN"

libwww doesn't behave correctly with my test FTP server (running OpenBSD's ftpd): When libwww wants to reuse an existing control connection, it first issues REIN, which the server doesn't understand ("502 REIN command not implemented."). Next, it thinks it has to send "USER anonymous" again, which doesn't work either ("530 Can't change user from guest login."). At this point libwww gives up, when it could actually just proceed with the RETR. Grr... patch forthcoming.

All of the above is based on work on my program "jigdo" - see the "download.cc" file in its sources. (But wait until the next release, the current 0.6.9 code doesn't yet have all the pause/resume code.)

Cheers,

  Richard

-- 
  __   _
  |_) /|  Richard Atterer     |  CS student at the Technische  |  GnuPG key:
  | \/¯|  http://atterer.net  |  Universität München, Germany  |  0x888354F7
  ¯ '` ¯
Received on Wednesday, 12 February 2003 18:25:28 UTC