- From: Jeffrey Mogul <mogul@pa.dec.com>
- Date: Fri, 20 Oct 95 15:55:53 MDT
- To: Beth Frank <efrank@ncsa.uiuc.edu>
- Cc: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
    Jeffrey Mogul suggested:
    > If we want a mechanism that allows a server to throttle the request
    > rate of an aggressive client, we should address that issue directly.
    > For example (this is just an off-the-cuff idea), the server could
    > return a "try again in N seconds" error code, close the TCP connection,
    > and then refuse to accept a new connection from that client until
    > the N seconds has elapsed.

    Our administrator of the www.ncsa site adamantly maintains that
    anything the server itself has to do to throttle back requests is
    too much overhead.  If the request gets through the TCP/IP accept
    layer, he believes you're better off just serving the request.

This is an interesting theory, but I don't buy it, at least not in the
context of setting a fixed limit on the number of requests per
connection.  The reason is that a per-connection request limit cannot
protect the server from a client that simply opens another connection,
or from a client (such as Netscape) that aggressively opens many TCP
connections at once.  Without some explicit signal from the server
that it should slow down, I don't expect the client to back off.

So if server overload is indeed a problem, what are the possible
solutions?  Here are the ones I can think of:

    (1) Do nothing.
        This is not a particularly useful option!

    (2) Set a per-connection limit on the number of requests.
        As argued above, this doesn't seem to work.

    (3) Return "try again later" without keeping a refusal list.

    (4) Return "try again later", keeping a refusal list.

    (5) Use TCP to flow-control the client.

If there are other possibilities, please suggest them!  But I don't
think #1 is appropriate, and I don't think #2 actually solves
anything, so that's why I suggested thinking about #3, #4, or #5.
Note that #5 is orthogonal to the other ones!

Your argument against #3 is basically that the overhead of saying "no"
is more than the overhead of servicing the request.  If we believe
that the client will pay any attention to a "try again later" error
(and we could make this a mandatory part of the HTTP spec), then I
don't buy your argument.

It might be instructive to look at some of the work done by Mary
Baker, while she was at Berkeley, on server crash recovery for the
Sprite distributed file system.  The original design was "let the
clients do what they want", and it turned out to collapse miserably
when the server became overloaded.  Mary's redesign was based on the
principle that ONLY the server can know when it is overloaded, and
thus the server MUST control the rate at which the clients generate
requests.  In other words, this feedback mechanism MUST be built into
the protocol.

    I think that searching to see if a site is on a refuse list will
    just further overload the server.

This depends on how carefully one implements it.  Common sense
suggests that the server should start refusing requests before it has
entirely run out of resources, in part so that it can spend some of
its resources policing the congestion-control mechanism.  I would
argue that if the clients properly implement the "try again later"
mechanism, then (1) the server should almost never have to search the
refuse list, and (2) relatively few clients will be on this list at
any one time.  If the list is kept in a simple tree structure,
searches and updates take log2(N) time, so one could handle hundreds
or thousands of clients with a very rapid search.
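To make that concrete, here is a rough sketch of such a refusal list
(in Python, purely as illustration; the RefuseList name and its
methods are invented here, and a sorted list stands in for the tree):

    import bisect
    import time

    class RefuseList:
        def __init__(self):
            self._addrs = []   # sorted list of refused client addresses
            self._expiry = {}  # address -> time when the refusal lapses

        def refuse(self, addr, delay_seconds):
            # Record that addr should be refused for delay_seconds.
            if addr not in self._expiry:
                bisect.insort(self._addrs, addr)
            self._expiry[addr] = time.time() + delay_seconds

        def is_refused(self, addr):
            # log2(N) membership test; lapsed entries are dropped lazily.
            i = bisect.bisect_left(self._addrs, addr)
            if i == len(self._addrs) or self._addrs[i] != addr:
                return False
            if time.time() >= self._expiry[addr]:
                del self._addrs[i]     # refusal has lapsed; forget it
                del self._expiry[addr]
                return False
            return True

The idea would be: right after accept(), the server calls
is_refused(addr) and, if it returns true, closes the connection
without doing any further work; when the server decides it is
overloaded, it sends the "try again in N seconds" error, calls
refuse(addr, N), and closes the connection.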
A server implementor is of course free to use the "try again later"
mechanism without keeping a refusal list, if the list is too hard to
implement.  Or we might find that all client implementations obey the
"try again later" mechanism, and so it's not necessary to police
things.  I'm simply suggesting the list as a way for a server to
further protect itself.  Note that this mechanism, with or without a
refusal list, is independent of whether persistent connections are
used.

Suppose that we don't trust client implementors to pay attention to a
"try again later" message.  Then I would argue that we could still use
TCP's flow-control mechanism to keep aggressive clients at bay, on the
assumption that a client won't start opening extra TCP connections
just because the response on the current one seems slow.  (This theory
may be faulty, in that users may hit "stop" and "reload" when they get
bored.  I do this myself.)

    > Or the server could keep the connection
    > open, but simply delay reading from it (which means that the TCP
    > flow-control mechanism would force the client to stop sending).

    This would also aggravate the overload problem, because the server
    would take that much longer to get to the next request.

I don't understand this.  The whole point of a congestion-control
mechanism is to defer dealing with some requests.  As long as keeping
an open-but-not-transferring TCP connection does not cost much in CPU
time or memory (and it should not), this is a very effective way to
limit the rate at which clients make requests.

    Think of it as if there are only n slots for TCP connections, and
    a new request from someone else can't be served until one of the n
    slots frees up.  (This is a very great simplification.)

This misses the point.  If a client has opened one of the N available
TCP connections, and the server has "stalled" that connection in order
to have some time to service one of the other ones, the client isn't
going to try using one of the other N-1 available connections.  This
means that other clients are free to use them.  As long as the server
unstalls a connection whenever it has the resources to service another
request, it will always make progress at a near-optimal rate.

Any reasonable multi-tasking operating system will do this more or
less automatically.  The somewhat tricky part is to get the server to
service these connections (clients) in a reasonably fair order, since
the operating system's scheduler may not have enough information to do
this.  I'm not sure that every HTTP server implementation can solve
this (for example, a server that simply forks for each new connection
doesn't have enough central control), but I'm not arguing that the
spec should require anything; I'm simply saying that this would be a
more effective approach than setting a per-connection limit on the
number of requests.

-Jeff
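P.S.  To make #5 concrete as well, here is a rough sketch (again in
Python, with invented names like MAX_ACTIVE; no real server is being
quoted) of an event loop that "stalls" a connection simply by not
reading from it, and unstalls the longest-stalled connection first for
rough fairness:

    import selectors
    import socket
    from collections import deque

    MAX_ACTIVE = 8     # connections we're willing to read from at once
    sel = selectors.DefaultSelector()
    stalled = deque()  # connections we've stopped reading, oldest first
    active = 0         # connections currently registered for reads

    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("", 8080))
    listener.listen(64)
    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ, data="accept")

    def stall(conn):
        # Stop reading: the TCP window fills and the client's sends
        # block, at almost no cost to the server.
        global active
        sel.unregister(conn)
        stalled.append(conn)
        active -= 1

    def unstall_one():
        # Resume the longest-stalled connections first, for fairness.
        global active
        while stalled and active < MAX_ACTIVE:
            conn = stalled.popleft()
            sel.register(conn, selectors.EVENT_READ, data="request")
            active += 1

    while True:
        for key, _ in sel.select():
            if key.data == "accept":
                conn, addr = key.fileobj.accept()
                conn.setblocking(False)
                stalled.append(conn)  # queued; unstall_one() admits it
            else:
                conn = key.fileobj
                data = conn.recv(4096)  # a real server would parse here
                if not data:            # client closed the connection
                    sel.unregister(conn)
                    conn.close()
                    active -= 1
                else:
                    # Toy response; a real server handles partial writes.
                    conn.sendall(b"HTTP/1.0 200 OK\r\n\r\nhello\r\n")
                    if stalled:  # others are waiting; stall this client
                        stall(conn)
            unstall_one()

The point is that a stalled connection just sits in a queue while its
TCP window fills up; the server spends its cycles on the connections
it has chosen to service, and it always makes progress as long as it
unstalls someone whenever a slot frees up.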
Received on Friday, 20 October 1995 16:13:57 UTC