- From: Henrik Frystyk Nielsen <frystyk@microsoft.com>
- Date: Tue, 21 Dec 1999 08:13:23 -0800
- To: "Amir Khassaia" <AKhassaia@vet.com.au>
- Cc: <www-lib@w3.org>
> Hi I am trying to customize the web-robot app that comes with LibWWW to
> add some features to it.
> One of them is downloading files only up to a certain size.
> I have tried putting some code together for detecting and handling it,
> but the functions that return the size of the file like HTAnchor_length(),
> HTResponse_length() or HTAnchor_header() only work correctly in
> some places (like terminate_handler).
> What is the best place to start putting that sort of code in the LibWWW?

The reason why it is not known until the after filters is that we first need to get the response headers from the server, and even then it is not guaranteed that we know the size. The response may be using chunked transfer encoding, or it may not have a Content-Length header and instead close the connection in order to delimit the message.

The client can of course just abort the connection, but that is often a very heavy-handed mechanism - especially if using HTTP pipelining (which libwww does under the covers), because aborting the connection also affects the other requests pipelined on it.

The only way to get around this is to first issue a HEAD request which, if the response contains a Content-Length header field, gives you the size of the response body. That is, you first do a HEAD and then a GET if you want to get the whole thing.

The way you do this is to register an after filter that looks at the result of the HEAD and, if the response is less than a certain size, changes the method to GET and reissues the request. Workflow-wise, this is similar to what you have in the HTRedirectFilter AFTER filter, which you can find in

http://www.w3.org/Library/src/HTFilter.c

and which is registered in HTAfterInit in

http://www.w3.org/Library/src/HTInit.c

Instead of registering it for a redirection status code, you register it for a HT_OK (200) code, which is the case you are interested in.

Henrik
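
The size check that such an after filter would make on the HEAD result can be sketched in plain C as follows. This is only an illustration under stated assumptions, not libwww code: the helper names head_content_length and should_issue_get are hypothetical, the headers are taken as a raw text block, and treating an unknown size as "do not fetch" is an assumed policy. Inside a real after filter the length would come from the library (for example HTAnchor_length() once the HEAD has completed) rather than from parsing headers by hand.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Case-insensitive test that `line` begins with `prefix`. */
static int starts_with_nocase(const char *line, const char *prefix)
{
    while (*prefix) {
        if (tolower((unsigned char)*line) != tolower((unsigned char)*prefix))
            return 0;
        line++;
        prefix++;
    }
    return 1;
}

/* Hypothetical helper: return the Content-Length value found in a raw
 * header block, or -1 if the header is absent or has no numeric value
 * (e.g. a chunked response, or one delimited by connection close). */
long head_content_length(const char *headers)
{
    const char *line = headers;

    assert(headers != NULL);
    while (line && *line) {
        if (starts_with_nocase(line, "Content-Length:")) {
            const char *value = line + strlen("Content-Length:");
            while (*value == ' ' || *value == '\t')
                value++;
            if (!isdigit((unsigned char)*value))
                return -1;
            return strtol(value, NULL, 10);
        }
        /* Advance to the start of the next header line, if any. */
        line = strchr(line, '\n');
        if (line)
            line++;
    }
    return -1;
}

/* Hypothetical helper: decide whether the HEAD should be followed by a
 * GET. Returns 1 only when the size is known and within the limit. */
int should_issue_get(const char *head_headers, long max_bytes)
{
    long length = head_content_length(head_headers);

    if (length < 0)
        return 0;   /* size unknown: assumed policy is to skip the GET */
    return length <= max_bytes;
}
```

When should_issue_get() returns 1, the filter would change the request method to GET and reissue the request; otherwise it would simply let the HEAD result stand.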
Received on Tuesday, 21 December 1999 11:13:59 UTC