- From: <neumann@nestroy.wi-inf.uni-essen.de>
- Date: Sun May 4 08:47:47 1997
- To: www-lib@w3.org
Dear libwww experts,

I suspect that libwww 5.1b does not handle the close directive from a server during asynchronous requests (right? wrong?), or that some functions are missing (see below). Here is what happens:

  environment:     Linux, Xt driven event loop, proxy configured
  initial request: http://www.linuxhq.com/ sent to proxy
  reply:

    HTTP/1.1 200 OK
    Date: Fri, 02 May 1997 23:22:37 GMT
    Server: Apache/1.2b10
    Last-Modified: Fri, 02 May 1997 03:15:36 GMT
    ETag: "33853-18d0-33695c58"
    Content-Length: 6352
    Accept-Ranges: bytes
    Connection: close
    Content-Type: text/html

In short, the server on www.linuxhq.com is an HTTP/1.1 server, the proxy cannot handle persistent connections and inserts "Connection: close" into the header. The HTTP 1.1 spec states that this is the correct way to fall back to HTTP 1.0 style single requests.

The libwww code in 5.1b handles the "Connection: close" by setting the HTHost_closeNotification flag in HTHost to true and leaving everything else (persistent, protocol version etc.) the same for the time being.

Our code (the cineast browser, see http://sldnt1.slac.stanford.edu/poster/762/index.html from our poster presentation at WWW6) issues libwww requests asynchronously. This means that an image request is issued before the loading of the HTML document is completed. In the figure below the x axis is time, the horizontal position of the s in "starting" indicates the time when the request is started, and the dots indicate the loading time.

  starting http://www.linuxhq.com/ ................................
       starting img1 .....................................
            starting img2....................................

With the original 5.1b code, linuxhq is loaded, the img1 and img2 requests are sent on the same socket, and the proxy closes the connection after linuxhq is finished.
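To make the scenario concrete, here is a minimal sketch of a check that would keep img1 and img2 off the socket once the close directive has been seen. This is not libwww code: `host_state`, `is_connection_close`, `note_response_header` and `can_reuse_connection` are hypothetical names invented for this illustration, and the header parsing is deliberately simplistic. Like 5.1b, the sketch only records a flag; the difference is that the reuse decision consults it.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical per-host state; NOT the libwww HTHost structure. */
typedef struct {
    int persistent;      /* connection was negotiated as persistent */
    int close_notified;  /* peer has sent "Connection: close" */
} host_state;

/* Case-insensitive test whether s starts with prefix. */
static int has_prefix_nocase(const char *s, const char *prefix)
{
    while (*prefix) {
        if (tolower((unsigned char)*s) != tolower((unsigned char)*prefix))
            return 0;
        s++;
        prefix++;
    }
    return 1;
}

/* Returns 1 if line is a Connection header carrying the token "close". */
int is_connection_close(const char *line)
{
    static const char name[] = "connection:";
    const char *p;

    assert(line != NULL);
    if (!has_prefix_nocase(line, name))
        return 0;
    p = line + sizeof name - 1;
    while (*p) {
        const char *tok;
        /* skip separators between tokens */
        while (*p == ' ' || *p == '\t' || *p == ',')
            p++;
        tok = p;
        /* advance to the end of the current token */
        while (*p && *p != ',' && *p != ' ' && *p != '\t' &&
               *p != '\r' && *p != '\n')
            p++;
        if ((size_t)(p - tok) == 5 && has_prefix_nocase(tok, "close"))
            return 1;
        if (*p == '\r' || *p == '\n')
            break;
    }
    return 0;
}

/* Record one response header line on the hypothetical host state. */
void note_response_header(host_state *h, const char *line)
{
    if (is_connection_close(line))
        h->close_notified = 1;
}

/* May a further request be written to the existing connection? */
int can_reuse_connection(const host_state *h)
{
    return h->persistent && !h->close_notified;
}
```

In this sketch, a request issued after the header has been parsed would see `can_reuse_connection()` return 0 and would have to open a new connection. Requests already written to the socket before the header arrived are not covered and would still need to be resubmitted.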
Depending on the timing, the program then either
 - catches a SIGPIPE signal when a further image request is submitted to the closed pipe, or
 - sits in a (non-blocking) read loop hoping in vain that the proxy will send the images.

Here are my questions:

1) Why does libwww only set a flag when the "Connection: close" directive in the header of the first request (linuxhq) is handled? Why does it not do something more drastic (e.g. turning off persistent, setting HT_TP_SINGLE)? If I implemented this, what problems would I face?

2) Another place where the close notification could be handled is in HTHost_new(), when the first image request is handled.

    PUBLIC HTHost * HTHost_new (char * host, u_short u_port)
    ...
    ... lookup host structure ...
    if (pres) {                      /* which means there is a host */
        if (pres->channel) {         /* the host has a channel */
            if (pres->expires && pres->expires < time(NULL)) {
                /* Cached channel is cold */
                if (CORE_TRACE)
                    HTTrace("Host info... Persistent channel %p gotten cold\n", pres->channel);
                HTChannel_delete(pres->channel, HT_OK);
                pres->channel = NULL;
            } else {                 /* the channel can be used */
                if (CORE_TRACE)
                    HTTrace("Host info... REUSING CHANNEL %p\n", pres->channel);
            }
        }
        ...
    }

When this code is executed, the close_notification flag from the top request has already been set. I tried to handle the close_notification situation like the "cold channel" case, but it ends in a SEGV in HTTee_write:

    HTTee_write <- HTTPStatus_put_block <- HTReader_read <- HTHost_read <- HTTPEvent(HTEvent_READ)

It looks like an HTStream is invalid. Maybe there is another pointer to the channel, which is deleted above and cleared in the host structure. Maybe something is missing here for the close_notification situation, maybe there is a separate problem.

3) I found a third approach to handle this problem: the HTTP 1.1 spec says that the client should handle a close from the server at any time. In HTTP.c, the HTTP_CONNECTED case looks promising:

    ...
    case HTTP_CONNECTED:
        if (type == HTEvent_WRITE) {
            ...
            /*
            ** Should we use the input stream directly or call the post
            ** callback function to send data down to the network?
            */
            {
                HTStream * input = HTRequest_inputStream(request);
                HTPostCallback * pcbf = HTRequest_postCallback(request);
                if (pcbf) {

but this needs a request post callback, which has to return an HT_CLOSED state in order to trigger HTTP_RECOVER_PIPE

                    ...
                    status = (*pcbf)(request, input);
                    if (status == HT_PAUSE || status == HT_LOADED) {
                        ...
                    } else if (status == HT_CLOSED) {
                        http->state = HTTP_RECOVER_PIPE;
                    } else if (status == HT_ERROR) {
                        ...
                    }
                }

which does the hard work (flushing the request, recovering the pipe, setting the begin state and launching pending requests). Recover pipe will "Move all entries in the pipeline and move the rest to the pending queue. They will get launched at a later point in time.".

I find it strange that this functionality needs the post callback, which is nowhere registered (or did I miss it?). It is simple to register a post callback returning HT_CLOSED in HTTP_BEGIN after HTHost_connect() in a close_notification situation. I did so and got the same SEGV in HTTee_write as in (2).

So, which is the way to go: 1, 2, or 3? Is the SEGV in HTTee_write a separate problem?

Are there updated state machines or code specifications available for mere humans outside the W3C? For example, the state diagram for HTTP in Library/User/Architecture/HTTP.gif is from Jun 30 1995. Henrik, how did you keep track of the states of the various objects? In my opinion there are many improvements between 5.0a and 5.1b besides the speed, but the new objects/features/changes are not sufficiently covered in the documentation.

Any hints are welcome.

Best regards
-gustaf neumann

PS: Yesterday I sent a mail with a fix to libwww@w3.org (as indicated in README.html) under the impression that this address belongs to the public mailing list.
My current understanding of the mailing lists is the following:

  libwww@w3.org:       mail to the libwww developers at W3C
  www-lib@w3.org:      public mailing list for discussion
  www-lib-bugs@w3.org: public mailing list for bug reports

Is this correct? Given the low traffic, and judging from people sending to both lists, would it not be a good idea to merge the last two? I assume that libwww@w3.org is subscribed to the public lists, so we do not have to send submissions that should reach the W3C people to libwww@w3.org as well. Correct?

--
Wirtschaftsinformatik und Softwaretechnik
Universitaet GH Essen, FB5
Altendorfer Strasse 97-101, Eingang B, D-45143 Essen
Tel.: +49 (0201) 81003-74, Fax: +49 (0201) 81003-73
Gustaf.Neumann@uni-essen.de
http://mohegan.wi-inf.uni-essen.de/Neumann.html
Received on Sunday, 4 May 1997 08:47:47 UTC