- From: Jeffrey Mogul <mogul@pa.dec.com>
- Date: Thu, 06 Jul 95 14:22:03 MDT
- To: Chuck Shotton <cshotton@biap.com>
- Cc: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
It turns out that Chuck Shotton and I are in perfect agreement. He
just doesn't know it, because he's assumed that I'm a UNIX bigot.

>> Please tell me where I used the word "UNIX" in my message. In fact,
>> absolutely nothing in my simulations is UNIX-specific. The
>> requirement that a TCP implementation maintain TIME_WAIT records is
>> part of the TCP specification.
>
> The assumption that thousands of TCP/IP connections are just laying
> around for use is definitely a Unix assumption. I didn't say anything
> about TIME_WAIT records, etc.

And I never assumed that "thousands of TCP/IP connections are just
laying around for use". In fact, if you had read the paper I wrote,
you would have seen that I simulated a range of active-connection
limits, between 10 and 1024 connections.

Offline, Chuck informs me that his server supports 50 connections, and
that some PC-based servers support "as few as 10". Fine. My
simulations show that with at most 32 active connections, using traces
from several servers, persistent-connection TCP can easily get a mean
of 2 or 3 URLs per TCP connection. Median values were similar. Even
with just 10 active connections, the simulations show about 2 URLs per
TCP connection.

So this whole debate is a non-issue. Even Chuck's systems should
benefit, on average, from using sessions. Of course, the simulations
also show that if your system can support hundreds or thousands of
active connections, the results will be even better. But it's
certainly not required.

> Unfortunately, I have to hop up on a stump occasionally and remind
> people not to take a Unix-centric view of the Web. Your message
> (unfortunately?) happened to be a good excuse for stump-hopping.

And I, unfortunately, have to hop up on a stump occasionally and remind
people that when experiments and simulations have already been done,
uninformed speculation is a waste of everyone's time.

> But the current HTTP model DOES support blocking requests that cannot
> be handled. That's what the "server busy" error code is all about.
> Clients are refused, and explicitly told how long to wait before
> retrying. However, client authors (Netscape) are slow to adopt this
> portion of the standard.

I'd forgotten about the "server busy" (I assume you mean 503, "Service
Unavailable") error code, in part because I haven't seen it used
anywhere. It's a shame that Netscape and other vendors haven't
provided support, but I think it's an uphill battle. On the other
hand, TCP flow control is inherent in the protocol; if we use that for
overload protection, we are guaranteed that all clients will play by
the rules.

> I'm not concerned about CPU cycles. I am concerned about idle TCP/IP
> connections. On a system with a finite number of connections, or
> with a finite amount of memory allocated for TCP/IP buffer space,
> allowing idle connections to hang around is a Bad Thing(tm).

If your TCP implementation uses the same amount of space to represent a
TIME_WAIT connection as it does to represent an open connection, the
"finite amount of memory" argument works in favor of persistent
connections. In fact, this is an even stronger argument than one based
on the mean number of URLs retrieved, for subtle reasons that I can
explain if anyone really wants me to (or just look at the simulation
results). If your TCP implementation is clever enough to represent
TIME_WAIT connections more efficiently, then the tradeoff may (or may
not) be the other way. But I'd be surprised if anyone has done this.
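To make the memory argument concrete, here is a minimal back-of-envelope
sketch in Python. The control-block size, request rate, connection cap, and
2.5 URLs-per-connection figure are illustrative assumptions (the last is
merely in the range the simulations reported); only the 240-second (2*MSL)
TIME_WAIT hold time comes from the TCP specification.

```python
# Back-of-envelope sketch of the TIME_WAIT memory argument.
# All constants below are illustrative assumptions, not measurements.

PCB_BYTES = 512        # assumed size of one TCP control block (open or TIME_WAIT)
REQS_PER_SEC = 10      # assumed steady request rate at the server
TIME_WAIT_SECS = 240   # 2 * MSL, per the TCP specification
URLS_PER_CONN = 2.5    # assumed mean URLs per persistent connection
ACTIVE_LIMIT = 32      # assumed cap on simultaneously open connections

# One connection per URL: every request leaves a TIME_WAIT record that
# lingers for 2*MSL, so records accumulate at the full request rate.
tw_per_request = REQS_PER_SEC * TIME_WAIT_SECS
bytes_per_request = tw_per_request * PCB_BYTES

# Persistent connections: connections are opened (and eventually closed)
# less often, so TIME_WAIT records accumulate more slowly, plus a bounded
# number of simultaneously open connections.
tw_persistent = (REQS_PER_SEC / URLS_PER_CONN) * TIME_WAIT_SECS
bytes_persistent = (tw_persistent + ACTIVE_LIMIT) * PCB_BYTES

print(f"one connection per URL: ~{bytes_per_request / 1024:.0f} KB of TCP state")
print(f"persistent connections: ~{bytes_persistent / 1024:.0f} KB of TCP state")
```

Under these assumed numbers the per-request scheme holds roughly twice as
much kernel TCP state, precisely because TIME_WAIT records cost as much as
open connections in this scenario.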
> Efficiency of implementation, efficiency of resource utilization.
> Sorry, I wasn't specific. CPU and network efficiency are not the
> issues at hand. They can always be solved with bigger CPUs. There are
> fundamental limits in many IP stacks that limit other aspects of an
> HTTP implementation, and the current connectionless model does a good
> job of supporting those limits.

The only fundamental limit is 300,000 km/sec. You cannot solve that
with a bigger CPU (unless the CPU is so much bigger that its network
interface is substantially closer to the client host!). This is the
limit that we should be most cautious about when defining a protocol
that may well last for a decade (unlike almost all of the hardware and
operating system versions, which will be obsolete quite soon). So our
main aim should be to avoid network round-trips. That's what this
proposal is meant to do.

I would be surprised if vendors of TCP stacks were unable to provide
support for more TCP connections. It's really just a small matter of
programming. Maybe they just aren't getting the right feedback from
their customers.

-Jeff
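As a concrete illustration of the round-trip argument above, here is a small
Python sketch of the propagation-delay floor. The client-server distance and
URL count are assumed for illustration; real transfers add transmission time
and TCP slow-start on top of this lower bound.

```python
# Rough illustration of the propagation-delay floor: opening a new TCP
# connection per URL pays an extra handshake round trip before each
# request can even be sent. Distance and URL count are assumptions.

SPEED_OF_LIGHT_KM_S = 300_000   # the "only fundamental limit"
DISTANCE_KM = 4_000             # assumed client-to-server distance
URLS = 10                       # assumed URLs fetched from one server

rtt = 2 * DISTANCE_KM / SPEED_OF_LIGHT_KM_S   # one round trip, in seconds

# One connection per URL: SYN handshake round trip plus a
# request/response round trip for each URL.
per_url_connections = URLS * (rtt + rtt)

# Persistent connection: one handshake, then one request/response
# round trip per URL.
persistent_connection = rtt + URLS * rtt

print(f"round-trip time:        {rtt * 1000:.1f} ms (propagation only)")
print(f"new connection per URL: {per_url_connections * 1000:.1f} ms minimum")
print(f"persistent connection:  {persistent_connection * 1000:.1f} ms minimum")
```

No larger CPU changes these numbers; only reducing the count of round trips
per URL does, which is what persistent connections accomplish.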
Received on Thursday, 6 July 1995 14:28:30 UTC