- From: Jeffrey Mogul <mogul@pa.dec.com>
- Date: Thu, 06 Jul 95 12:03:17 MDT
- To: Chuck Shotton <cshotton@biap.com>
- Cc: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
>Not necessarily. My simulations, using traces from several busy servers,
>show that with certain choices of server parameters, the peak number of
>TIME_WAIT entries is well below 1000. That is, the use of sessions
>can actually reduce the number of TIME_WAIT entries by an order of
>magnitude (compared to non-session HTTP). More details in my SIGCOMM
>paper, or look at
> http://www.research.digital.com/wrl/publications/abstracts/95.4.html

    Now, how does that relate to the multitude of non-Unix servers on
    the Internet? Please remember that WWW <> Unix and HTTP <> Unix and
    TCP/IP <> Unix. There are LOTS of implementations of WWW software
    that have absolutely nothing to do with Unix, Unix kernel settings,
    Unix kernel performance, Unix IP stacks, or anything else to do
    with Unix. The reason this is an issue is that you cannot predicate
    protocol decisions solely on the implementation of IP stacks on
    Unix hosts.

Please tell me where I used the word "UNIX" in my message. In fact,
absolutely nothing in my simulations is UNIX-specific. The requirement
that a TCP implementation maintain TIME_WAIT records is part of the
TCP specification. It is a mandatory requirement on all TCP
implementations.

If you want to argue that non-UNIX servers should not have to meet the
TCP specification, that's another story. But then the whole process of
creating standards is rather pointless, if that's the case.

    You can bet that non-Unix TCP/IP stacks have radically different
    resource constraints and performance issues.

That is a valid point. The HTTP specification should certainly not
require servers to keep connections open any longer than they want to.
A server implementor should keep the relevant resource constraints in
mind when setting server policies. My simulations examined a range of
server policies, including one in which the server keeps a very small
number of connections open. Read the paper to see the results.
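As a rough illustration of why fewer connections mean fewer TIME_WAIT entries, here is a minimal sketch in Python. It is not the simulation from the paper. It assumes the server is the side that closes each connection (so the server holds the TIME_WAIT record), and it applies the steady-state rule that the number of entries equals the close rate multiplied by how long each entry lives. The function name, the request rate, and the 60-second TIME_WAIT duration are illustrative assumptions only.

```python
def time_wait_entries(requests_per_sec, requests_per_conn, time_wait_secs):
    """Estimate steady-state TIME_WAIT entries on the server.

    Assumption: every connection is closed by the server, so each close
    leaves one TIME_WAIT record that lives for time_wait_secs.
    Steady-state count = (closes per second) * (lifetime of each record).
    """
    closes_per_sec = requests_per_sec / requests_per_conn
    return closes_per_sec * time_wait_secs


# Illustrative inputs only; these numbers are not taken from any trace.
rate = 20.0   # requests per second arriving at the server
tw = 60.0     # seconds a record stays in TIME_WAIT (implementation-dependent)

print(time_wait_entries(rate, 1, tw))    # one request per connection
print(time_wait_entries(rate, 10, tw))   # ten requests share each connection
```

With these made-up inputs, sharing each connection among ten requests cuts the estimate by a factor of ten, which is the proportionality the quoted paragraph describes. Real traces are burstier than a constant rate, so peak counts will differ from this average.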
True, I did not simulate a server that keeps only one connection open
at a time. Such a server would presumably not want to use sessions, so
I didn't bother to simulate how sessions would affect it.

>The reason is that the number of TIME_WAIT entries is directly related
>to the number of TCP connections used. If you use sessions (what I
>called in my paper "persistent connections"), you need to create fewer
>TCP connections for the same number of retrievals. So you end up
>with fewer TIME_WAIT entries.

    This is irrelevant on platforms with a limited number of TCP/IP
    streams that can be formed. People discussing this issue are right
    to refer to "irresponsible use" of TCP/IP connections.

Nonsense. The paragraph of mine you quote there has nothing to do with
what platform the server is running. It is a direct consequence of the
protocol specifications.

If your goal is to minimize the number of TCP connection records, then
use persistent connections. If your goal is to minimize the number of
open connections, then you may choose not to use persistent connections
(although this is not mandatory; a properly implemented
persistent-connection server should be able to limit the number of open
connections to any chosen limit, without affecting correctness.)

    A related, somewhat important piece to this puzzle is the need for
    HTTP clients to implement a retry scheme when servers report that
    they are resource constrained with a 50x error code. As far as I
    know, no clients in widespread use implement retries when a server
    reports that it is too busy or is unable to service a request due
    to resource constraints. Implementing this portion of the standard
    in WWW clients will go a long way towards eliminating the potential
    race conditions that can arise when a server terminates a session
    after a client has issued a new request but before the server
    received it. The client will simply retry.
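The client-side retry described above could look roughly like the following Python sketch. It is an assumption-laden illustration, not a description of any existing client: `send` is a hypothetical zero-argument callable that performs one request and returns a `(status, body)` pair, only status 503 is treated as "server busy", and the backoff schedule is arbitrary. A `ConnectionError` stands in for the race in which the server closes the connection before the request arrives.

```python
import time

# Assumption: only 503 ("Service Unavailable") is treated as "server busy".
RETRYABLE_STATUS = {503}


def fetch_with_retry(send, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Retry send() while the server is busy or has dropped the connection.

    send: hypothetical callable returning (status, body); it may raise
          ConnectionError if the server closed the connection first.
    Returns the last (status, body), or re-raises the last ConnectionError.
    """
    for attempt in range(max_attempts):
        last_try = attempt == max_attempts - 1
        try:
            status, body = send()
        except ConnectionError:
            if last_try:
                raise               # out of attempts: report the failure
        else:
            if status not in RETRYABLE_STATUS or last_try:
                return status, body
        # Exponential backoff before the next attempt.
        sleep(base_delay * (2 ** attempt))


# Usage with a stand-in for a real transport: busy once, then success.
responses = iter([(503, b""), (200, b"<html>ok</html>")])
print(fetch_with_retry(lambda: next(responses), sleep=lambda s: None))
```

Retrying blindly is only safe for requests that can be repeated without side effects, such as GET. A client would need more care before resending anything else.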
On the other hand, for servers that ARE able to maintain multiple TCP
connections, the best way to signal "I'm resource constrained" is to
use the TCP flow control mechanisms. That is, if the client has an open
connection and the server doesn't have the cycles to keep data flowing
on it, the client is inherently blocked from sending more requests
(unlike the case with current HTTP, where there is no flow control).
So for server systems whose limiting resource is CPU cycles, rather
than TCP connections, sessions could be a big win.

>I also suspect that much of the benefit comes NOT from imbedded
>images, but from subsequent requests for HTML pages (i.e., the
>user clicks, reads, and clicks again).

    The last thing I want to do with a resource-constrained server is
    re-implement the nightmare of hundreds of blinking cursors in
    otherwise idle telnet sessions.

Huh? Nobody is asking for that. The server does not have to commit any
CPU cycles to the idle HTTP connections (beyond timer maintenance,
which is actually cheaper on idle open connections than on TIME_WAIT
connections).

    The HTTP protocol is primarily connectionless (stateless) for
    reasons of efficiency from the server's perspective.

Statelessness does not imply efficiency. Period. Statelessness only
affects the need to maintain state. I have never seen a quantitative
argument that statelessness improves efficiency of an HTTP server.
In fact, there is ample evidence to the contrary. Statelessness costs
us in CPU cycles, server memory, packets, and delay.

HTTP is stateless, as far as I can tell, because it was the simplest
way to get something going, and the original designers didn't know any
better.

    I think a persuasive argument can be made for keeping a stream open
    while all of the required parts of a single "page" are transmitted.
    Allowing an individual to monopolize a scarce resource for longer
    periods of time, on the off chance that a human might select
    another link to your site from the page he just received, IS
    irresponsible.

Nobody has ever argued that HTTP should be modified in a way that
allows an individual to monopolize a resource. The server has complete
control.

-Jeff
Received on Thursday, 6 July 1995 12:09:39 UTC