- From: Jeffrey Mogul <mogul@pa.dec.com>
- Date: Wed, 25 Oct 95 15:00:13 MDT
- To: Beth Frank <efrank@ncsa.uiuc.edu>
- Cc: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com

Your message raises a number of fairly subtle points. I'll try to address them without being too verbose or repetitive.

For the benefit of the innocent bystander, the question at issue is "should the server set a predetermined limit on the number of requests that can be made over one TCP connection?" Beth says "yes", I say "no".

> Our interest is not to prevent the client from getting the data, it
> is to force the client to get back in line again, so that service
> is divided among those requesting it in a reasonable fashion.

Here we have two competing philosophies about how to schedule service for multiple clients. Beth's model is "the client makes up to N requests, then drops the TCP connection and attempts to create a new connection". My model is "the connection is kept open as long as possible; the server decides which of the open connections to handle next."

Beth's assumption is that the number of available processes/threads/connections at the server is small enough that the server cannot simultaneously have connections to all of the clients who want to talk to it. My traces of several busy servers (see my SIGCOMM '95 paper) suggest otherwise: a relatively small number of connections suffices. (We can argue about what is "relatively small"; see below.) If NCSA is interested in giving me access to some traces from their own servers, I can analyze them to see if their clients behave differently from ours.

If the server IS able to maintain connections to all of the active clients at once, then the process/thread scheduling mechanism of the server's operating system should divide service "among those requesting it in a reasonable fashion." But forcing the clients to reconnect after a few requests makes it much harder to allocate server CPU cycles fairly among the clients, since there is no longer any long-term knowledge about how many cycles a client has consumed recently.

Of course, there is a practical upper limit to the number of connections that a server can support. My simulations showed that an LRU approach to terminating connections worked pretty well, but I admit that I did not simulate either of these approaches:

    o Terminate after N requests (Beth's model)
    o Terminate the connection that has made the most requests

although there's a lot of history on the side of LRU algorithms.

I concede that there is a case to be made that LRU will fail if the "working set" (number of clients with immediate service demands) is larger than the resource set (maximum number of connections). In this case, I would guess that terminating the client which has made the most requests would be an appropriate way to provide some fairness, although neither it nor any other algorithm can prevent overload.

> (3) Return "try again later" without keeping a refusal list
>
> Your argument against #3 is basically that the overhead of saying
> "no" is more than the overhead of servicing the request. If we
> believe that the client is willing to pay any attention to a
> "try again later" error (and we could make this a mandatory
> part of the HTTP spec), then I don't buy your argument.

The problem is not getting the client to pay attention to "try again later" but getting the USER to pay attention to it. The problem is the users who sit there hitting the "reload" button. If the client browser knows that it is not allowed to contact the server again until a certain period in the future, then the client browser could simply ignore the "reload" button until then.
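
To make this concrete, here is a rough sketch (in C) of the sort of browser-side check I have in mind. It assumes the refusal carries some indication of how many seconds the client must wait (a Retry-After-style value); the variable and function names are invented for illustration, not taken from any existing browser.

    #include <stdio.h>
    #include <time.h>

    /* Earliest time at which we may contact the server again;
       zero means "no restriction in effect".  (A real browser
       would keep one of these per server, not a single global.) */
    static time_t retry_deadline = 0;

    /* Called when the server says "try again later", with the
       number of seconds it asked us to wait.                    */
    void note_refusal(long wait_seconds)
    {
        retry_deadline = time(NULL) + wait_seconds;
    }

    /* The "reload" button ends up here; the request is simply
       not issued until the promised waiting period has expired. */
    int reload_allowed(void)
    {
        if (retry_deadline != 0 && time(NULL) < retry_deadline)
            return 0;
        retry_deadline = 0;
        return 1;
    }

    int main(void)
    {
        note_refusal(30);     /* server asked us to wait 30 seconds */
        if (!reload_allowed())
            printf("reload suppressed until the server's deadline\n");
        return 0;
    }
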
(An anti-social browser programmer might not do this, which is why the server should keep a refusal list in order to provide an incentive.)

> I would argue that if the clients properly implement the "try
> again later" mechanism, then (1) the server should almost never
> have to search the refuse list, and (2) relatively few clients

> What? Once a refuse list is implemented every incoming request
> would have to be checked against the refuse list. That means the
> server would have to search the refuse list EVERY time.

Not exactly. The refuse list only has to be searched

    o for new connections
    o during times when the system is overloaded (and hence has been
      refusing connections)

but I will admit that this is not the best time to add more work to the server's job description. However, I challenge anyone to suggest a better way of providing fair service in the face of overload. Simply letting the clients grab new connections first-come first-served is likely to favor aggressive clients with low-latency network connections.

> The fallacy in this argument is assuming that CPU and memory are
> the limiting factors. The limiting factors for very busy sites
> with T1s or better and our server are the # of child processes
> available for servicing a request and the ability of the TCP
> layer to handle the incoming request queues.

Let's be careful here. The number of child processes is typically limited by the amount of RAM available to map their working sets without thrashing, and the amount of CPU time it takes to context switch between them without thrashing.

The issue of the TCP request queue limit (more precisely, the limit in BSD-based systems on the number of connections in the SYN_RCVD state, or in the ESTABLISHED state before the accept() completes, sometimes known as the SOMAXCONN limit) is a trickier one. There are well-known techniques for getting around this limit, but it is important to point out that if you ARE having problems with this limit, the LAST thing you want to do is to cause the server to have to deal with more TCP connection requests. Any mechanism (such as closing TCP connections after N requests) that increases the number of new TCP connections made can only make matters worse.

> The server hits against operating system limits in the TCP/IP
> layer and the # of file descriptors supported by the system.

I'm not sure which "limits in the TCP/IP layer" you are referring to. In addition to the SOMAXCONN limit, there is a well-known problem with the PCB lookup algorithm in BSD UNIX, but a solution was described in SIGCOMM '91 (more than four years ago) and has been implemented by several system vendors.

The limit on the number of file descriptors may be a problem, especially for a multi-threaded server (as opposed to a forking server). I may be biased because Digital supports up to 4096 file descriptors per process; is this really a problem on other modern UNIX systems? (As far as I remember, even the Mac TCP supports about 50 connections.) My simulations showed that 50 connections ought to be enough for almost all cases, but I am happy to be contradicted by actual evidence (traces) from other servers.

Which returns me to the fundamental issue: can the server actually support a sufficiently large number of TCP connections at once? Nobody has provided quantitative evidence that this is not true.

-Jeff
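
P.S. For concreteness, here is a rough sketch (in C) of the refuse-list check described above. Nothing here has been agreed on, so the constants, names, and the extend-the-penalty behavior are only placeholders of mine; the point is simply that the list is consulted only for new connections, and searched only while the server is overloaded.

    #include <stdio.h>
    #include <time.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>

    #define REFUSE_SLOTS 256     /* small, fixed-size refusal list     */
    #define REFUSE_WAIT  60L     /* seconds a refused client must wait */

    struct refusal {
        struct in_addr addr;     /* client host we told to wait        */
        time_t         until;    /* ...and when it may try again       */
    };
    static struct refusal refuse_list[REFUSE_SLOTS];

    static int overloaded = 1;   /* assume overload for this sketch; a
                                    real server would set this from its
                                    process/connection accounting      */

    /* Called once per NEW connection only.  Returns nonzero if the
       client should be sent "try again later" and disconnected.  A
       linear search is shown for clarity; a hash on the address
       would scale better.                                            */
    int should_refuse(struct in_addr client, time_t now)
    {
        int i, slot = -1;

        if (!overloaded)
            return 0;            /* normal times: list never searched  */

        for (i = 0; i < REFUSE_SLOTS; i++) {
            if (refuse_list[i].until <= now)
                slot = i;        /* empty or expired entry, reusable   */
            else if (refuse_list[i].addr.s_addr == client.s_addr) {
                /* Came back before its time: refuse again, and extend
                   the wait, as an incentive to behave.                */
                refuse_list[i].until = now + 2 * REFUSE_WAIT;
                return 1;
            }
        }
        if (slot >= 0) {         /* first refusal: remember it         */
            refuse_list[slot].addr  = client;
            refuse_list[slot].until = now + REFUSE_WAIT;
        }
        return 1;
    }

    int main(void)
    {
        struct in_addr c;
        c.s_addr = inet_addr("10.0.0.1");    /* an example client      */

        printf("refused on first attempt: %d\n",
               should_refuse(c, time(NULL)));
        printf("refused on immediate retry: %d\n",
               should_refuse(c, time(NULL)));
        return 0;
    }
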
Received on Wednesday, 25 October 1995 15:36:30 UTC