Re: libwww.a 4.0 problems from Henrik Frystyk Nielsen on 1996-01-11 (www-lib@w3.org from January to March 1996)

From: Henrik Frystyk Nielsen <frystyk@w3.org>
Date: Thu, 11 Jan 1996 16:13:01 -0500
To: "Chris Bloom" <bloom@aladdin.genmagic.com>
Cc: www-lib@w3.org
Message-Id: <9601112113.AA12672@www20>
> I have run into some problems using the library on an sgi in non-blocking mode
> from within a firewall using socks.  Any pointers to other experiences, papers,
> docs, etc related to using the library in nonblocking with socks would be
> appreciated.

I am not much of a socks expert but I can help on the other questions.
 
> PROBLEM #1: If the nonblocking connect using HTDoConnect() has not really
> connected by the time select is called using HTEvent_Loop() then select never
> wakes up.   My workaround is a sleep() between  HTLoad() and my return back
> into HTEvent_Loop().  This works most of the time but is not guaranteed since
> it may take longer for a site to actually connect.   I believe this is a
> socks/sockets problem on Solaris which is what we are using to run the socks
> daemon.

This is very strange because HTDoConnect is written so that in practice we 
never connect on the first call. Connecting always takes some time, so a 
blocking socket would always block on that first call; with non-blocking 
sockets this is therefore in practice the first point at which we return to 
the event loop after a load has been initiated. That is, at the time of the 
first call to HTDoConnect the call stack will contain something like

	HTDoConnect		(Library)
	HTLoadHTTP		(Library)
	HTLoadAnchor		(Library)
	user event handler	(App)
	HTEventLoop		(Library)
	main			(App)

When control returns to the event loop, select is told to wait for "ready 
for write" on the socket. Once the socket is writable, we can continue the 
request and finish the connection. This time the call stack is much shorter, 
as we now have a direct connection between the event loop and the protocol 
load module:

	HTDoConnect
	HTLoadHTTP
	HTEventLoop
	main

In fact, it makes no difference whether you use the Library event loop or 
supply your own version. I have not seen this problem before, but it does 
sound like socks may have something to do with it.
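
As a rough illustration of the sequence described above (start the connect, 
return to the event loop, wait for "ready for write", then finish), here is a 
minimal sketch in plain BSD sockets C. It is not the Library's HTDoConnect 
code and it does not involve socks; the names set_nonblocking, begin_connect 
and finish_connect are hypothetical and exist only for this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/* Hypothetical result codes for this sketch only. */
enum { CONNECT_FAILED = -1, CONNECT_DONE = 0, CONNECT_PENDING = 1 };

/* Put a descriptor into non-blocking mode; returns 0 on success. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* First call: start the connect and return at once.  On a non-blocking
 * socket this normally reports CONNECT_PENDING (errno == EINPROGRESS). */
int begin_connect(int fd, const struct sockaddr *addr, socklen_t len)
{
    if (connect(fd, addr, len) == 0)
        return CONNECT_DONE;
    if (errno == EINPROGRESS)
        return CONNECT_PENDING;
    return CONNECT_FAILED;
}

/* Later call, driven by the event loop: wait until the socket is ready
 * for write, then read SO_ERROR to learn whether the connect succeeded.
 * Returns 0 on success, -1 on timeout or failure. */
int finish_connect(int fd, long timeout_sec)
{
    fd_set wfds;
    struct timeval tv;
    int err = 0;
    socklen_t errlen = sizeof err;

    FD_ZERO(&wfds);
    FD_SET(fd, &wfds);
    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;

    if (select(fd + 1, NULL, &wfds, NULL, &tv) <= 0)
        return -1;                      /* timed out or select failed */
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &errlen) < 0 || err != 0)
        return -1;                      /* connect did not succeed */
    return 0;
}
```

If select never reports the socket as writable in a setup like this, the 
layer that replaces connect (here possibly socks) would be the first place to 
look.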

> PROBLEM #2: For a specific request the library only calls the nonblocking
> read once.  Whatever has been written to the socket by the remote server is
> read by the library in HTSocketRead(), which pushes it down the stream.  The
> stream returns a status, which in my case is HT_OK.  This seems like the right
> thing for my callback to do, right?

Yep, HT_OK indicates that the stream has handled the data and can accept more 
data.

> However the status on returning from the stream push
> code is 2999.

You probably mean 29999, which is HT_LOADED. This means that all the expected 
data has been read. HT_LOADED can either be returned directly by the socket 
read loop (HTSocketRead() in HTSocket.c) when the socket read function 
returns 0 _or_ it can be returned from a stream because no more data is 
expected. An example is the MIME parser stream: in order to support 
persistent connections it must use the content length information in the 
header of the message. When the right number of bytes has been read, it 
returns HT_LOADED.
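
To make the content length case concrete, here is a minimal sketch of a 
stream-like object that counts body bytes and reports "loaded" once the 
expected number has arrived. It is not the HTMIME implementation: BodyCounter, 
body_counter_init, body_counter_put_block and the SK_ constants are 
hypothetical names for this example, and only the value 29999 is taken from 
this thread.

```c
#include <stddef.h>

#define SK_OK      0        /* data handled, more data is welcome */
#define SK_LOADED  29999    /* value quoted in this thread for HT_LOADED */

/* Hypothetical body counter, standing in for a MIME parser's state. */
typedef struct {
    long expected;   /* content length from the header, -1 if unknown */
    long received;   /* body bytes seen so far */
} BodyCounter;

void body_counter_init(BodyCounter *bc, long content_length)
{
    bc->expected = content_length;
    bc->received = 0;
}

/* Accept one block of body data.  Returns SK_LOADED as soon as the
 * announced number of bytes has been seen, otherwise SK_OK. */
int body_counter_put_block(BodyCounter *bc, const char *buf, int len)
{
    (void)buf;                       /* the sketch only counts bytes */
    if (len > 0)
        bc->received += len;
    if (bc->expected >= 0 && bc->received >= bc->expected)
        return SK_LOADED;            /* nothing more expected on this message */
    return SK_OK;
}
```

With an unknown length the counter never returns SK_LOADED, so in that case 
the end of the message has to be detected some other way, for example by the 
socket read returning 0.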

> This results in the "stream specific return code" (shown below)
> being executed, which returns the status.  I tried returning HT_WOULD_BLOCK
> from my stream callback but that did not work since that block of code (also
> shown below) would unregister the socket.   So currently my fix is to continue
> returning HT_OK from my stream callback, which results in the execution of the
> "Stream specific return code" which I have hacked to return HT_WOULD_BLOCK.  Am
> I doing something wrong?  Is this really a bug in the library?  And would this
> be the cause of PROBLEM #3 described below?
> 
> 	/* Now push the data down the stream */
> 	if ((status = (*target->isa->put_block)(target, isoc->buffer,
> 						b_read)) != HT_OK) {
> 	    if (status==HT_WOULD_BLOCK) {
> 		if (PROT_TRACE)
> 		    TTYPrint(TDEST, "Read Socket. Target WOULD BLOCK\n");
> 		HTEvent_UnRegister(isoc->sockfd, FD_READ);
> 		return HT_WOULD_BLOCK;
> 	    } else if (status>0) {	      /* Stream specific return code */
> 		if (PROT_TRACE)
> 		    TTYPrint(TDEST, "Read Socket. Target returns %d\n",status);
> 		isoc->write = isoc->buffer + b_read;
> 		/* return status; */
> My fix ---->	return HT_WOULD_BLOCK;

This is not a good idea. The reason is that with this change stream modules 
can no longer send return codes upstream and back to the protocol load module. 
In this case the HT_LOADED return code from HTMIME will get lost and 
persistent connections will stop working, because the protocol module doesn't 
realize that an HTTP message has been received.
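
The point about passing return codes upstream can be sketched as follows. 
This is a simplified stand-in for a read loop, not the code in HTSocket.c: 
push_chunks, ScriptedTarget, scripted_put_block and the SK_ constants are 
hypothetical, the value of SK_WOULD_BLOCK is made up for the example, and the 
socket is replaced by an array of strings.

```c
#include <stddef.h>
#include <string.h>

#define SK_OK           0
#define SK_WOULD_BLOCK  (-2)     /* made-up value for this sketch */
#define SK_LOADED       29999    /* value quoted in this thread for HT_LOADED */

typedef int (*put_block_fn)(void *ctx, const char *buf, int len);

/* Push blocks into the target for as long as it answers SK_OK.  Any other
 * status, including a positive stream-specific code such as SK_LOADED, is
 * handed back to the caller unchanged rather than rewritten.  (A real read
 * loop would also stop watching the socket on SK_WOULD_BLOCK.) */
int push_chunks(put_block_fn put, void *ctx,
                const char *const *chunks, int nchunks)
{
    int i;
    for (i = 0; i < nchunks; i++) {
        int status = put(ctx, chunks[i], (int)strlen(chunks[i]));
        if (status != SK_OK)
            return status;
    }
    return SK_OK;
}

/* Hypothetical target that accepts a few blocks, then returns a fixed code. */
typedef struct {
    int ok_calls;       /* number of blocks answered with SK_OK */
    int final_status;   /* status returned for every later block */
    int seen;           /* blocks received so far */
} ScriptedTarget;

int scripted_put_block(void *ctx, const char *buf, int len)
{
    ScriptedTarget *t = (ScriptedTarget *)ctx;
    (void)buf;
    (void)len;
    t->seen++;
    return (t->seen > t->ok_calls) ? t->final_status : SK_OK;
}
```

Because the caller sees SK_LOADED itself, it can tell "message complete" 
apart from "try again later", which is the distinction that is lost when 
every positive code is turned into a would-block result.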

> PROBLEM #3:  The URL http://www.pathfinder.com is a redirect.  This is not
> handled properly in non-blocking mode.  I am currently working on this.  What
> happens is that I am returned the empty page from the result of the redirect,
> then the library goes on to call the site specified by the redirect and never
> calls me back.  Any info on this would be appreciated.

It should be handled properly: both the linemode browser and the command line 
tool do the right thing. Are you using the latest version of the Library, 
which is called 4.0B? Please check that you have the right version from

	http://www.w3.org/pub/WWW/Distribution.html

thanks!

-- 

Henrik Frystyk Nielsen, <frystyk@w3.org>
World-Wide Web Consortium, MIT/LCS NE43-356
545 Technology Square, Cambridge MA 02139, USA
Received on Thursday, 11 January 1996 16:13:14 UTC