- From: Anselm Baird_Smith <abaird@www43.inria.fr>
- Date: Thu, 27 Feb 1997 09:27:43 +0100 (MET)
- To: francois.deza@sema.fr
- Cc: www-jigsaw@w3.org
Francois Deza writes:
> Thanks for your reply. We have done some experiments and want to ask
> new questions.

Thanks a lot for your time...and for the good questions,

> The logger is indeed a major bottleneck, do you think increasing
> the w3c.jigsaw.logger.bufferSize property (default 8192 bytes) would
> alleviate this issue? The logger of Jeeves seems far less of an issue.

Yes, the logger is the bottleneck. I am currently looking at what could
be done to improve it; I have a couple of solutions that I am going to
test RSN (I know at least that using a StringBuffer and String addition
for creating the record is a really, really bad thing to do). I think
for your testing, you should probably disable the logger (for the same
reasons you should not use a Java client), otherwise you might be
measuring the logger speed...

> When put under heavy stress (with the right stresser), Jigsaw refuses
> many connections (the ServerSocket.accept call generates a "Too many
> file handles open" runtime Exception) under Solaris. There are limits
> on the number of file handles in the kernel and per process. Is there
> any way to parametrize it in Jigsaw so that we can tune it for heavy
> load?

Not in Jigsaw; you can however use the ulimit program under UNIX. If
you're root you should raise it "as much as you can", say at least to
twice the number of simultaneous connections you want to handle (see
also the SocketClientFactory tuning below).

> I have seen that the ServerSocket instantiated in SocketClientFactory
> has a hardcoded backlog equal to 128. Could you comment on that? Why
> is it so high?

It should really be customizable; until it becomes so, I have used an
admittedly high value. Don't forget though, that if you want to handle
say 100 simultaneous connections, this number is probably still low.
Ultimately the default value should be something (I guess) like
(x*numberOfSimConnectionsToHandle), where x depends on the CPU of your
machine and on the irregularity of the connection rate (say, if you
have high peaks of connections at given times).

> Could you supply me with guidelines on how to set
> the w3c.jigsaw.http.socket.SocketClientFactory properties
> minfree, maxfree, maxidle, maxclients, idleTimeout and maxThreads?
> I mean there are complex interdependencies between those with respect
> to the tuning of the performance of Jigsaw.

[skip to the 'example' if this is really unreadable]

Note that this is a really interesting point of Jigsaw. The following
algorithm is my own design, written with the hope that it will work
well in practice, but if you have better ideas, let me know. Anyway,
here is how it goes (you probably want to have the code handy; l-xx
means line number xx of SocketClientFactory):

The SocketClientFactory has four running states, corresponding to
different "load" modes (l-121):

- AVG_LIGHT: the server can acquire more sockets and can use more CPU.
  In this mode, the server will keep connections open forever, and will
  accept all new connections.
- AVG_NORMAL: the server should not consume more sockets (i.e. it has
  already opened close to the maximum number of permitted sockets). In
  this mode, the server will try to close least-recently used
  connections before accepting a new connection (which it will still
  always accept).
- AVG_HIGH: this is really the same as the above mode, except that
  "trying" to kill least-recently used connections doesn't seem to
  suffice to cope with the load. In this mode the accepting thread
  priority is made lower than the client thread priority (with the hope
  that request handling gets more CPU than accepting new connections).
- AVG_DEAD: we have reached all our resource limits (might be either
  sockets or CPU). In this mode, the server will start rejecting
  connections, and emit appropriate error messages in the errlog.

The SocketClientFactory maintains the following variables (here client
really means an instance of SocketClient, not necessarily a thread
yet):

- idleCount: the number of clients whose connection is kept persistent,
  and which are currently waiting for a request.
- freeCount: the number of clients currently unused (i.e. ready to run
  new connections).
- clientCount: the total number of clients.
- maxClients: the maximum number of simultaneous clients.

Switching between the four load modes (l-257) is based on the value of
the above variables and the following "water marks" (they are prefixed
with w3c.jigsaw.http.socket.SocketClientFactory as properties):

When freeCount is higher than maxFree, the server assumes LIGHT load
(init state). As connections are accepted, freeCount decreases. At some
point it becomes lower than maxFree. maxFree is the "high water mark"
for the freeCount counter.

Now, if freeCount is still greater than minFree (minFree is the "low
water mark" for freeCount) *and* if the number of idle connections has
not yet reached its maximum, the load mode is turned to NORMAL (maxIdle
is the water mark for idleCount).

If one of the above conditions is not true, but we still have clients
ready to accept new connections, load mode is set to HIGH; otherwise
(no more free clients), load mode is set to DEAD.

example:

Before I drop all readers, let's take the default config as an example.
The default config assumes 64 max simultaneous connections (which is
the default on Solaris). This gives us the value for maxClients:

    maxClients=64/2

(one connection requires at least two file handles)

The second step is to decide on a value for maxFree, which controls the
point at which persistent connections are going to be killed when
accepting new connections.
I think this depends on the power of the host; I would recommend that
the server stay in LIGHT mode until 70% of clients are used:

    maxFree=0.3*32=10

The third step is to estimate the point at which you will want to start
'killing' persistent connections.

The first parameter is maxIdle. I would recommend using a 30% margin
(to account for the time between kill and terminate, as explained
above):

    maxIdle=0.7*maxClients=22

[the default is pessimistically set to 20]

The second parameter is minFree. The server will remain in LIGHT mode
until that number of clients is free again (this is to avoid switching
back and forth between LIGHT and NORMAL mode: think of maxFree as the
LIGHT mode exit value and of minFree as the LIGHT mode enter value):

    minFree=0.7*maxFree=7

[again, the default setting is pessimistic]

Now, there remains one setting concerning the way clients are mapped to
threads. Since 1.0alpha5 this uses a thread cache which has two config
parameters:

- maxThreads: max number of threads
- idleTimeout: time to keep excess threads alive

The pool of threads initially creates maxThreads/2 threads. If more
threads have to be created, they will stay alive after usage only for
idleTimeout ms. maxThreads, as you can see, only controls the initial
size of the thread pool (it is set to 40 in the default config, so that
we start with 20 threads). Threads will always be created when needed
(l-620). I think idleTimeout should be fairly large (at least a few
seconds), because peaks tend to last for "long"...

> Currently, we are running benchmarks with different values for all
> those parameters so that we could possibly derive tuning rules.
> An explanation of the rationale behind the algorithms coded in Jigsaw
> would help a lot. I am referring to the changing priority of the
> thread accepting sockets, the killing of clients and the server state
> changes.

I hope the above (long) explanation helps.
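
The load-mode switch and the example arithmetic described in this mail
can be sketched roughly as follows. This is only an illustration of the
prose, not the SocketClientFactory source: the class name, the method
names and the use of rounding to reach 10, 22 and 7 are assumptions
made for the example, and the exact comparisons (strict or not) in the
real code may differ.

```java
// Sketch only: mirrors the description in this mail, not the actual
// SocketClientFactory code. All names here are hypothetical.
final class LoadModeSketch {

    enum Load { LIGHT, NORMAL, HIGH, DEAD }

    /** Picks a load mode from the counters and water marks. */
    static Load loadMode(int freeCount, int idleCount,
                         int minFree, int maxFree, int maxIdle) {
        if (freeCount > maxFree) {
            return Load.LIGHT;   // above the high water mark
        }
        if (freeCount > minFree && idleCount < maxIdle) {
            return Load.NORMAL;  // close least-recently used connections
        }
        if (freeCount > 0) {
            return Load.HIGH;    // some clients are still free
        }
        return Load.DEAD;        // no free client left: reject
    }

    /** The mail halves the limit: a connection needs two file handles. */
    static int maxClients(int limit) {
        return limit / 2;
    }

    /** Stay in LIGHT mode until 70% of the clients are used. */
    static int maxFree(int maxClients) {
        return (int) Math.round(0.3 * maxClients);
    }

    /** 30% margin for the time between kill and terminate. */
    static int maxIdle(int maxClients) {
        return (int) Math.round(0.7 * maxClients);
    }

    /** Low water mark, 70% of the high water mark. */
    static int minFree(int maxFree) {
        return (int) Math.round(0.7 * maxFree);
    }

    public static void main(String[] args) {
        int clients = maxClients(64);
        int high = maxFree(clients);
        int idle = maxIdle(clients);
        int low = minFree(high);
        // prints "maxClients=32 maxFree=10 maxIdle=22 minFree=7"
        System.out.println("maxClients=" + clients + " maxFree=" + high
                + " maxIdle=" + idle + " minFree=" + low);
        // 20 free clients is above maxFree=10, so this prints "LIGHT"
        System.out.println(loadMode(20, 0, low, high, idle));
    }
}
```

Keeping the mode decision as a pure function of the counters makes it
easy to try other water marks against recorded freeCount/idleCount
values before changing the server properties.
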
If you find a better way to set the parameters, or any enhancements to
the way this piece is done, let me know.

> To finish, could you clarify the following phenomenon.
> Under certain circumstances the stresser in httpd does not return
> when Jigsaw gets overloaded. I mean certain threads never join. We
> observed the same for Stresser (in java). This is strange considering
> that Jigsaw seems to recover in the meantime from the overload.

I have observed that too, I'll check the problem again.

Anselm.
Received on Thursday, 27 February 1997 03:28:45 UTC