- From: Fred Covely <fcovely@covely.com>
- Date: Thu, 3 May 2001 21:02:40 -0700
- To: "Jens Meggers" <jens.meggers@firepad.com>, <www-lib@w3.org>
Jens:

Based on the MS site doc, I'd say that MAY get the job done, but maybe not. Unless I am reading the doc wrong (which is quite possible), I think the fundamental problem is that MS doesn't want that connect call going off more than once. I'm worried that this may fix what we are seeing now, but that another variation will turn up.

Here is the MS comment:

"As a result, it is not recommended that applications use multiple calls to connect to detect connection completion. If they do, they must be prepared to handle WSAEINVAL and WSAEWOULDBLOCK error values the same way that they handle WSAEALREADY, to assure robust execution."

What I like about your approach is using the select, rather than the connect and the error code, to deduce what happened. Using just your code, I'd move the select logic above the if (NETCALL_WOULDBLOCK(socerrno)) line. That way, any unsuccessful status out of connect would yield an immediate check as to completion via the select. (FYI, for fast connections, that will actually improve performance.) For slow connections that will add some overhead because the select is called, rather than just the getLastError as is the case now. But I think the reliability is more important.

Finally, in looking at this before, I feel that what would really be useful is a new state "TCP_CONNECTING", so you always issue the connect and then go to connecting. The select would then be put in the TCP_CONNECTING logic. That way the select is the do or die point. From there you either move the state up to TCP_CONNECTED or back down to TCP_NEED_CONNECT on a failure.

All that being said, it's quite possible your proposed fix will do the job forever and ever.

Did that help?

Regards and thanks so much for the outstanding effort on this.
Fred Covely
fcovely@covely.com
(B)760-631-8157
(C)760-717-9689

-----Original Message-----
From: Jens Meggers [mailto:jens.meggers@firepad.com]
Sent: Thursday, May 03, 2001 6:09 PM
To: 'Fred Covely'; www-lib@w3.org
Subject: RE: NT SP6 and WouldBlock in HTTCP.C - Seemingly broken

Fred,

during my latest experiments, I finally found an NT 4.0 service pack 4 installation that caused the problem you described. After calling connect() two times, it returns with an error, but WSAGetLastError() returns 0 although the connection is already set up.

There is a quite clean workaround for that. We can use the select() call to check whether there is a connection or not before throwing back an error code. This can be done right before the "if (socerrno == EISCONN) {" statement in the TCP_NEED_CONNECT state of HTDoConnect().

I've inserted the following code:

    #ifdef WWW_WIN_ASYNC
        // check if socket is connected. If yes, enter next stage
        {
            fd_set writefds;
            struct timeval timeout;
            int select_ret;
            FD_ZERO(&writefds);
            FD_SET(HTChannel_socket(host->channel), &writefds);
            timeout.tv_sec = 0;
            timeout.tv_usec = 0;
            select_ret = select ( 0, NULL, &writefds, NULL, &timeout );
            if (select_ret == 1) {
                host->tcpstate = TCP_CONNECTED;
                HTTRACE(PROT_TRACE, "HTHost %p going to state TCP_CONNECTED.\n" _ host);
                break;
            }
        }
    #endif

I also attached my HTTCP.c to this mail. Please note that it also contains the code for checking the event error code.

What do you think?

Regards,
Jens

-----Original Message-----
From: Fred Covely [mailto:fcovely@covely.com]
Sent: Tuesday, 24 April 2001 17:29
To: Jens Meggers; www-lib@w3.org
Subject: RE: NT SP6 and WouldBlock in HTTCP.C - Seemingly broken

Congratulations, that's the fastest bug fix I ever saw (8>). I'm real curious as to your solution. I'm not seeing any issues with this on Win2K, only NT. But the user I have on NT SP6 fails hard every time.

I think I'll go ahead and set up an NT/SP6 box and see if I can reproduce with and without your patch when it comes out.
regards,
Fred Covely
fcovely@covely.com
(B)760-631-8157
(C)760-717-9689

-----Original Message-----
From: Jens Meggers [mailto:jens.meggers@firepad.com]
Sent: Tuesday, April 24, 2001 5:08 PM
To: 'Fred Covely'; www-lib@w3.org
Subject: RE: NT SP6 and WouldBlock in HTTCP.C - Seemingly broken

Fred,

I solved the problem within the last weeks. I had to pass the error message that the async event messages of the socket call carry, together with the event object, to the HTDoConnect method. Actually, it works. Unfortunately, a lot of files are involved. I will send a patch description asap.

Jens

-----Original Message-----
From: Fred Covely [mailto:fcovely@covely.com]
Sent: Tuesday, 24 April 2001 17:09
To: www-lib@w3.org
Subject: NT SP6 and WouldBlock in HTTCP.C - Seemingly broken

I have run into an interesting problem on NT SP 6 on at least one machine. I've done a detailed trace on the box that is failing and on several Win2K boxes that work. Clearly, based on the source comments, there has been a lot of work done in this area, so I would request your input on this problem.

Here is the scenario: We are doing a simple httpget on a public site (yahoo.com, or whatever). On a Win2K machine the sequence in HTTCP.C looks like this:

    1224 15:50:44 Event....... Registering socket for HTEvent_CONNECT
    1224 15:50:44 HTDoConnect. rcode `10035'
    1224 15:50:44 HTDoConnect. status `-1'
    1224 15:50:44 HTDoConnect. WOULD BLOCK `www.yahoo.com'
    1224 15:50:44 Host Event.. WRITE passed to `http://www.yahoo.com'
    1224 15:50:44 HTDoConnect. rcode `10056'
    1224 15:50:44 HTDoConnect. status `-1'
    1224 15:50:44 HTHost 01099988 going to state TCP_CONNECTED.
    1224 15:50:44 Event....... Socket 456 unregistered for HTEvent_CONNECT
    ...
    1224 15:50:44 DNS weight.. Home 5 has weight 0.00
    1224 15:50:44 HTHost 01099988 connected.
    1224 15:50:44 Host connect Unlocking Host 01099988
    1224 15:50:44 StreamStack. Constructing stream stack for text/x-http to */*

The Win2K request then proceeds normally.
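The rcode values in the trace above are Winsock error numbers: 10035 is WSAEWOULDBLOCK and 10056 is WSAEISCONN. As a rough illustration only, the C sketch below groups such values the way the Microsoft note quoted earlier in this thread asks for, treating WSAEINVAL and WSAEWOULDBLOCK like WSAEALREADY. The names RcodeClass and classify_rcode are hypothetical and are not part of libwww; the constants are copied in by value so the sketch compiles without the Winsock headers.

```c
/* Winsock error numbers, copied as plain constants so this sketch
 * builds without <winsock2.h>. */
enum {
    RC_WSAEINVAL      = 10022,
    RC_WSAEWOULDBLOCK = 10035,
    RC_WSAEALREADY    = 10037,
    RC_WSAEISCONN     = 10056
};

/* Hypothetical classification of the value seen after a repeated connect(). */
typedef enum {
    RC_PENDING,    /* connect still in progress, keep waiting */
    RC_CONNECTED,  /* socket reports it is already connected  */
    RC_OTHER       /* anything else, including 0: needs a separate check */
} RcodeClass;

RcodeClass classify_rcode(int rcode)
{
    switch (rcode) {
    case RC_WSAEISCONN:
        return RC_CONNECTED;
    case RC_WSAEWOULDBLOCK:
    case RC_WSAEALREADY:
    case RC_WSAEINVAL:
        /* Per the MS note: handle these three the same way. */
        return RC_PENDING;
    default:
        return RC_OTHER;
    }
}
```

Under this grouping, the Win2K trace goes from pending (10035) to connected (10056). An error value of 0 after a failed connect falls into RC_OTHER, which is the case where asking select() is the more reliable answer.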
On the NT SP 6 machine the same request looks like this:

    286 14:40:54 Event....... Register socket 248, request 0082B640
    ...
    286 14:40:54 HTDoConnect. rcode `10035'
    286 14:40:54 HTDoConnect. status `-1'
    286 14:40:54 HTDoConnect. WOULD BLOCK `www.yahoo.com'
    286 14:41:39 Host Event.. WRITE passed to `http://www.yahoo.com'
    286 14:41:39 HTDoConnect. rcode `10035'
    286 14:41:39 HTDoConnect. status `-1'
    286 14:41:39 HTDoConnect. WOULD BLOCK `www.yahoo.com'
    286 14:42:24 Host Event.. WRITE passed to `http://www.yahoo.com'
    286 14:42:24 HTDoConnect. rcode `10035'
    286 14:42:24 HTDoConnect. status `-1'
    286 14:42:24 HTDoConnect. WOULD BLOCK `www.yahoo.com'
    286 14:43:09 Host Event.. WRITE passed to `http://www.yahoo.com'
    286 14:43:09 HTDoConnect. rcode `10035'
    286 14:43:09 HTDoConnect. status `-1'

The Microsoft web site documentation for connect clearly states that the preferred implementation is not to use multiple connect calls. I don't have enough familiarity with the libwww code to venture a guess as to what is wrong. Could it be related to the multiple connect strategy? If so, has anyone looked at the best way to do this? I see a comment from about a year ago from jens@meggers.com indicating there were known MS problems in this area.

I have a hard time believing someone has not figured out how to do an absolutely bulletproof connect in Windows.

Any input greatly appreciated.

Fred Covely
fcovely@covely.com
(B)760-631-8157
(C)760-717-9689
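For reference, the TCP_CONNECTING idea raised at the top of this thread can be sketched as a small state transition function. This is only an illustration under stated assumptions, not the libwww implementation: TcpState, SelectResult and tcp_step are hypothetical names, and the select() outcome is passed in as a plain value so the sketch runs without a socket. It assumes connect() is issued exactly once when leaving TCP_NEED_CONNECT and that its error code is then ignored. On Winsock, a failed non-blocking connect is reported through the exceptfds set of select(), which is why the sketch has a separate SEL_EXCEPT outcome next to the writable check.

```c
/* TCP_NEED_CONNECT and TCP_CONNECTED appear in the thread;
 * TCP_CONNECTING is the proposed intermediate state. */
typedef enum {
    TCP_NEED_CONNECT,
    TCP_CONNECTING,
    TCP_CONNECTED
} TcpState;

/* Hypothetical summary of a zero-timeout select() on the socket. */
typedef enum {
    SEL_PENDING,   /* socket in neither set: still connecting       */
    SEL_WRITABLE,  /* socket in writefds: connect completed         */
    SEL_EXCEPT     /* socket in exceptfds: connect attempt failed   */
} SelectResult;

/* Return the next state. The caller is assumed to call connect() once
 * when it sees the TCP_NEED_CONNECT -> TCP_CONNECTING transition. */
TcpState tcp_step(TcpState state, SelectResult sel)
{
    switch (state) {
    case TCP_NEED_CONNECT:
        /* connect() has just been issued; do not trust its error code. */
        return TCP_CONNECTING;
    case TCP_CONNECTING:
        /* select() is the single decision point. */
        if (sel == SEL_WRITABLE)
            return TCP_CONNECTED;
        if (sel == SEL_EXCEPT)
            return TCP_NEED_CONNECT;
        return TCP_CONNECTING;
    case TCP_CONNECTED:
        return TCP_CONNECTED;
    }
    return state;
}
```

With this shape, the repeated WOULD BLOCK loop in the NT SP 6 trace would stay in TCP_CONNECTING without calling connect() again, and would leave that state only when select() reports the socket as writable or failed.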
Received on Friday, 4 May 2001 00:08:22 UTC