- From: <neumann@nestroy.wi-inf.uni-essen.de>
- Date: Sun May 4 08:47:47 1997
- To: www-lib@w3.org
Dear libwww experts,
I suspect that libwww 5.1b does not handle the close directive from a
server during asynchronous requests (right? wrong?), or that some
functions are missing (see below).
Here is what happens:
environment:
Linux, Xt driven event loop, Proxy configured
initial request:
http://www.linuxhq.com/ sent to proxy
reply:
HTTP/1.1 200 OK
Date: Fri, 02 May 1997 23:22:37 GMT
Server: Apache/1.2b10
Last-Modified: Fri, 02 May 1997 03:15:36 GMT
ETag: "33853-18d0-33695c58"
Content-Length: 6352
Accept-Ranges: bytes
Connection: close
Content-Type: text/html
In short, the server on www.linuxhq.com is an HTTP/1.1 server; the
proxy cannot handle persistent connections and inserts "Connection:
close" into the header. The HTTP 1.1 spec states that this is the
correct way to signal HTTP 1.0 style single requests.
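To make the check on the reply header concrete, here is a minimal sketch of how a client could test a Connection header value for the "close" token. The function name connection_has_close is hypothetical and is not part of libwww; the sketch assumes the caller has already extracted the header value as a C string.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, NOT a libwww function: returns 1 if the value of a
 * Connection header contains the token "close" (compared without regard
 * to case), 0 otherwise. Tokens are separated by commas and whitespace. */
static int connection_has_close(const char *value)
{
    static const char want[] = "close";
    const char *p = value;

    assert(value != NULL);
    while (*p) {
        const char *start;
        size_t len, i;

        /* Skip separators in front of the next token. */
        while (*p == ',' || *p == ' ' || *p == '\t') p++;
        start = p;
        /* Advance to the end of the token. */
        while (*p && *p != ',' && *p != ' ' && *p != '\t') p++;
        len = (size_t)(p - start);

        if (len == sizeof want - 1) {
            for (i = 0; i < len; i++) {
                if (tolower((unsigned char)start[i]) != want[i]) break;
            }
            if (i == len) return 1;
        }
    }
    return 0;
}
```

Called with the value from the reply above ("close"), the helper reports a close directive; a value such as "Keep-Alive" does not.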
The libwww code in 5.1b handles the "Connection: close" by setting
the HTHost_closeNotification flag in HTHost to true and
leaving everything else (persistent, protocol version etc) the same
for the time being.
Our code (the cineast browser, see
http://sldnt1.slac.stanford.edu/poster/762/index.html
from our poster presentation at WWW6) issues libwww requests
asynchronously. This means an image request is issued before the
loading of the HTML document is completed. In the figure below, the x
axis is time, the horizontal position of the s in "starting"
indicates the time when the request is started, and the dots indicate
the loading time.
starting http://www.linuxhq.com/ ................................
      starting img1 .....................................
            starting img2....................................
With the original 5.1b code linuxhq is loaded, the img1 and img2
requests are sent on the same socket, the proxy closes the connection
after linuxhq is finished. Depending on the timing the code might
either
- catch a SIGPIPE signal when a further image request
is submitted to the closed pipe, or
- sit in a (non-blocking) read loop hoping in vain that the proxy
will send the images
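The first failure mode can be reproduced without libwww. The following is a minimal sketch, assuming a POSIX system, that uses a local pipe as a stand-in for the proxy connection: with SIGPIPE ignored, a write to a connection the peer has closed fails with EPIPE instead of killing the process. The helper names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write with SIGPIPE ignored, so that a connection
 * closed by the peer is reported as an EPIPE error from write() rather
 * than as a signal. Returns the byte count, or -1 with errno set. */
static ssize_t write_no_sigpipe(int fd, const void *buf, size_t len)
{
    assert(buf != NULL);
    signal(SIGPIPE, SIG_IGN);   /* process-wide; a real client does this once */
    return write(fd, buf, len);
}

/* A pipe stands in for the socket to the proxy. The read end is closed
 * first (the "proxy" closing the connection), then a request is written.
 * Returns the errno seen by the writer, 0 if the write succeeded,
 * or -1 if the pipe could not be created. */
static int demo_write_after_peer_close(void)
{
    static const char req[] = "GET /img1 HTTP/1.0\r\n\r\n";
    int fds[2];
    ssize_t n;
    int err;

    if (pipe(fds) != 0) return -1;
    close(fds[0]);                       /* peer closes its side */
    errno = 0;
    n = write_no_sigpipe(fds[1], req, sizeof req - 1);
    err = (n < 0) ? errno : 0;
    close(fds[1]);
    return err;
}
```

This only turns the signal into an error code the caller can act on; it does not address the second failure mode, where the client waits for a reply that never arrives.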
Here are my questions:
1) Why does libwww only set a flag when the "close connection" tag
in the header of the first request (linuxhq) is handled? Why does
it not do something more drastic (e.g. turning off
persistent, setting HT_TP_SINGLE)? If I implemented this,
what problems would I face?
2) Another place, where the close notification could be handled
is in HTHost_new(), when the first image request is handled.
    PUBLIC HTHost * HTHost_new (char * host, u_short u_port)
    ...
    ... lookup host structure ...
    if (pres) {                     /* which means there is a host */
        if (pres->channel) {        /* the host has a channel */
            if (pres->expires && pres->expires < time(NULL)) {
                /* Cached channel is cold */
                if (CORE_TRACE)
                    HTTrace("Host info... Persistent channel %p gotten cold\n",
                            pres->channel);
                HTChannel_delete(pres->channel, HT_OK);
                pres->channel = NULL;
            } else {                /* the channel can be used */
                if (CORE_TRACE)
                    HTTrace("Host info... REUSING CHANNEL %p\n",pres->channel);
            }
        }
        ...
    }
When this code is executed the close_notification flag from
the top request has already been noted. I tried to handle
the close_notification situation like the "cold channel",
but it ends up in a SEGV in HTTee_write
HTTee_write <- HTTPStatus_put_block <- HTReader_read <-
HTHost_read <- HTTPEvent(HTEvent_READ)
It looks like an HTStream is invalid. Maybe there is another
pointer pointing to the channel which is deleted above and
cleared in the host structure.... Maybe something is
missing here in the situation of a close_notification,
maybe there is a separate problem.
3) I found a third approach to handle this problem: The HTTP 1.1
spec says that the client should handle a close from the server
at any time. In HTTP.c, the code in HTTP_CONNECTED looks promising:
    ...
    case HTTP_CONNECTED:
        if (type == HTEvent_WRITE) {
        ...
            /*
            ** Should we use the input stream directly or call the post
            ** callback function to send data down to the network?
            */
            {
                HTStream * input = HTRequest_inputStream(request);
                HTPostCallback * pcbf = HTRequest_postCallback(request);
                if (pcbf) {
but this needs a request post callback which has to return a
HT_CLOSED state in order to trigger HTTP_RECOVER_PIPE
                    ...
                    status = (*pcbf)(request, input);
                    if (status == HT_PAUSE || status == HT_LOADED) {
                        ...
                    } else if (status==HT_CLOSED) {
                        http->state = HTTP_RECOVER_PIPE;
                    }
                    else if (status == HT_ERROR) {
                        ...
                    }
                }
which does the hard work (flushing the request, recovering the
pipe, setting the begin state and launching pending requests). Recover
pipe will "Move all entries in the pipeline and move the rest to the
pending queue. They will get launched at a later point in time."
I find it strange that this functionality needs the post
callback, which is registered nowhere (or did I miss it?). It is
simple to register a post callback returning HT_CLOSED in
HTTP_BEGIN after HTHost_connect() in a close_notification
situation. I did so and got the same SEGV in HTTee_write
as in (2).
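To make the intended behaviour behind options 1 and 3 concrete, here is a minimal sketch of the bookkeeping in a simplified, hypothetical model. None of these types or functions are libwww APIs, and the rule that the request currently being answered stays in the pipeline while the others move to the pending queue is my assumption, not something taken from the libwww source.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_REQ 8

/* Simplified, hypothetical host model; NOT the libwww HTHost structure. */
typedef struct {
    int    persistent;          /* connection may carry several requests */
    int    close_notified;      /* "Connection: close" has been seen */
    int    pipeline[MAX_REQ];   /* request ids already sent on the socket */
    size_t n_pipeline;
    int    pending[MAX_REQ];    /* request ids waiting for a new connection */
    size_t n_pending;
} host_model;

/* Option 1: react to the close directive by turning persistence off. */
static void on_connection_close(host_model *h)
{
    assert(h != NULL);
    h->close_notified = 1;
    h->persistent = 0;
}

static int can_reuse_channel(const host_model *h)
{
    return h->persistent && !h->close_notified;
}

/* Returns 1 if the request was pipelined, 0 if queued as pending,
 * -1 if the chosen queue is full. */
static int submit(host_model *h, int id)
{
    assert(h != NULL);
    if (can_reuse_channel(h)) {
        if (h->n_pipeline == MAX_REQ) return -1;
        h->pipeline[h->n_pipeline++] = id;
        return 1;
    }
    if (h->n_pending == MAX_REQ) return -1;
    h->pending[h->n_pending++] = id;
    return 0;
}

/* Option 3 (assumed semantics): keep the request being answered, move the
 * other pipelined requests to the front of the pending queue so that
 * submission order is kept. Returns the number of requests moved. */
static size_t recover_pipe(host_model *h)
{
    size_t moved, i;

    assert(h != NULL);
    if (h->n_pipeline <= 1) return 0;
    moved = h->n_pipeline - 1;
    if (h->n_pending + moved > MAX_REQ) return 0;   /* no room: leave as is */
    for (i = h->n_pending; i > 0; i--)
        h->pending[i - 1 + moved] = h->pending[i - 1];
    for (i = 0; i < moved; i++)
        h->pending[i] = h->pipeline[i + 1];
    h->n_pending += moved;
    h->n_pipeline = 1;
    return moved;
}
```

In the scenario above, linuxhq and img1 are already in the pipeline when the close directive arrives; img2 then goes straight to the pending queue, and recovering the pipe moves img1 in front of it.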
So, what is the way to go: 1, 2, or 3?
Is the SEGV in HTTee_write a separate problem?
Are there updated state machines or code specifications available for
mere humans outside the W3C? For example the state diagram for HTTP
in Library/User/Architecture/HTTP.gif is from Jun 30 1995. Henrik,
how did you keep track of the states of the various objects?
In my opinion there are many improvements between 5.0a and 5.1b aside
from the speed, but the new objects/features/changes are not sufficiently
covered in the documentation.
Any hints are welcome.
Best regards
-gustaf neumann
PS: Yesterday, I sent a mail with a fix to libwww@w3.org (as indicated in
README.html) under the impression that this address belongs to the
public mailing list. My current understanding about the mailing lists
is the following:
libwww@w3.org: mail to the libwww developers at w3c
www-lib@w3.org: public mailing list for discussion
www-lib-bugs@w3.org: public mailing list for bug reports
Is this correct?
Given the low traffic and judging from people sending to both lists,
would it not be a good idea to merge the last two?
I assume that libwww@w3.org is subscribed to the public lists, and we
do not have to send submissions that should reach the w3c people
to libwww@w3.org as well. Correct?
--
Wirtschaftsinformatik und Softwaretechnik
Universitaet GH Essen, FB5
Altendorfer Strasse 97-101, Eingang B, D-45143 Essen
Tel.: +49 (0201) 81003-74, Fax: +49 (0201) 81003-73
Gustaf.Neumann@uni-essen.de
http://mohegan.wi-inf.uni-essen.de/Neumann.html
Received on Sunday, 4 May 1997 08:47:47 UTC