- From: Robert S. Thau <rst@ai.mit.edu>
- Date: Mon, 27 Feb 95 10:34:35 EST
- To: kipp@cc.gatech.edu
- Cc: www-talk@www19.w3.org
   From: Russel Kipp Jones <kipp@cc.gatech.edu>

   We had been experiencing considerable delay on our server
   connections. Discovered that the server was doing an initgroups
   call for each fork'd process. As we use yp, and our group file
   keeps growing, AND yp is single threaded, all of the accesses were
   getting queued up waiting for ypserv.

   We tweaked the code to allow us to only do the initgroups call
   once, and use that information the remaining times. As the uid/gid
   is always the same, this is sufficient.

FWIW, there's more in the way of speed increases where that came
from, and some of them are fairly easy to arrange for. On each
connection, the NCSA server:

*) Talks to the nameserver --- yet another opportunity for YP to
   serialize you. I'm not sure how to fix this *portably*, but
   caching the hostnames of recently seen clients in shared memory
   eases things somewhat. Compiling with -DMINIMAL_DNS keeps the
   server from talking to the nameserver at all, and is a simpler and
   better option for those who can live with it.

*) Tries to open a whole lot of .htaccess files which aren't there.
   People running close to the edge can get around this by turning
   off the .htaccess checks entirely with an AllowOverride None at
   the right spot in access.conf. This may be a substantial win for
   those afflicted with AFS.

   (The checks for symlinks at every directory are also a potential
   source of overhead, though in that case, things may be better
   simply because the directories in question actually exist, and so
   kernel machinery like the namei cache works).

There are a few more obvious speed improvements which are harder to
arrange for (you have to change some of the server code), but the
payoffs, at least for the first listed hack below, are substantial:

*) Reads the request and MIME header from the client character by
   character, taking a context-switch into and out of the kernel on
   each. This is a MAJOR performance hit, and easy to kludge around,
   but you do have to change the server code. It's only mildly hard
   to fix right[1].

*) Opens the locale database to find out the names of the months, and
   opens some other file to find the time zone. (Actually, the C
   library does this behind httpd's back, but the effect is the
   same). I got rid of this overhead by doing a few dummy time
   conversions before starting to listen on the socket --- this
   initializes the C library time-conversion code in the parent
   process, and so the children don't have to do it themselves after
   the fork().

I've fixed most of the above in the server I'm running (all except
.htaccess files, which require some code cleanup to get right), and
that gets you close to the end of the line --- much improvement
beyond that will probably come only by eliminating the fork on every
transaction. (The overhead of fork() is difficult to measure
directly, but it shows up indirectly in some of my other
measurements, and it seems to be large).

That's only after some cleanups, though --- the standard NCSA server
spends most of its time figuring out what groups "nobody" is in, over
and over and over...

rst

PS --- your patch has a *very* minor bug --- if the server rereads
the config files, it doesn't change the group info, even though User
might have changed in httpd.conf, and the appropriate groups might
have changed in any event. This is unlikely ever to come up in
practice, but I'm a little compulsive about these things.

[1] If you really want to do this right, at least on SunOS, you can
    recv the header with MSG_PEEK instead of reading it, and then
    only read those bytes out of the kernel buffers which actually
    contain the header, leaving the rest for a CGI script. This
    handles POST right, as well as GET. David Robinson came up with
    this idea and has actually coded it up.

    Or, you could do as the CERN server does --- buffer the client
    socket as usual, and then pipe the contents of the buffer to any
    script that wants to see them, but that's more work starting from
    the existing NCSA code.
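
To make the MSG_PEEK idea in footnote [1] concrete, here is a minimal sketch in C. It is not David Robinson's code and not anything from the NCSA sources; the names header_length and read_header_only are hypothetical, and it assumes the whole header shows up in a single peek.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: length of the header block in buf[0..len),
 * including the blank line that ends it, or 0 if no blank line yet. */
static size_t header_length(const char *buf, size_t len)
{
    size_t i;

    for (i = 0; i + 1 < len; i++) {
        if (buf[i] != '\n')
            continue;
        if (buf[i + 1] == '\n')                     /* LF LF */
            return i + 2;
        if (i + 2 < len && buf[i + 1] == '\r' && buf[i + 2] == '\n')
            return i + 3;                           /* LF CR LF */
    }
    return 0;
}

/* Peek at the pending data without consuming it, then consume only
 * the header bytes, so any request body stays in the kernel buffer
 * for a CGI script to read from the same descriptor.
 * Returns the header length, 0 if the peeked data holds no complete
 * header, or -1 on error or EOF. A real server would have to wait for
 * more data (or give up) in the 0 case; this sketch does not. */
ssize_t read_header_only(int fd, char *buf, size_t bufsize)
{
    ssize_t peeked;
    size_t hlen, got = 0;

    assert(buf != NULL && bufsize > 0);

    peeked = recv(fd, buf, bufsize, MSG_PEEK);
    if (peeked <= 0)
        return -1;

    hlen = header_length(buf, (size_t)peeked);
    if (hlen == 0)
        return 0;

    while (got < hlen) {
        ssize_t n = recv(fd, buf + got, hlen - got, 0);
        if (n <= 0)
            return -1;
        got += (size_t)n;
    }
    return (ssize_t)hlen;
}
```

The cost is two system calls per request instead of one per character, and the POST body is left untouched for whatever the server execs next.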
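
The simpler way around the character-by-character reads is to buffer the client socket in user space. A rough sketch in C follows, assuming a hypothetical struct conn_buf and helpers conn_buf_init and conn_getline (none of these names come from the NCSA or CERN code). Note the catch mentioned in the footnote: anything read past the header sits in this buffer, so it has to be handed on to a CGI script explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical per-connection read buffer. */
struct conn_buf {
    int fd;
    char data[4096];
    size_t pos;   /* next unread byte in data */
    size_t len;   /* number of valid bytes in data */
};

void conn_buf_init(struct conn_buf *b, int fd)
{
    b->fd = fd;
    b->pos = 0;
    b->len = 0;
}

/* Return the next byte, refilling with one read() per buffer-full
 * rather than one read() per character. Returns -1 on EOF or error. */
static int conn_getc(struct conn_buf *b)
{
    if (b->pos == b->len) {
        ssize_t n = read(b->fd, b->data, sizeof b->data);
        if (n <= 0)
            return -1;
        b->len = (size_t)n;
        b->pos = 0;
    }
    return (unsigned char)b->data[b->pos++];
}

/* Read one header line into line[], dropping the trailing CR LF.
 * Returns the line length (0 for the blank line that ends the
 * header), or -1 on EOF or error before any byte was read.
 * Overlong lines are truncated to size - 1 characters. */
ssize_t conn_getline(struct conn_buf *b, char *line, size_t size)
{
    size_t n = 0;
    int c;

    assert(size > 0);
    while ((c = conn_getc(b)) != -1) {
        if (c == '\n')
            break;
        if (n + 1 < size)
            line[n++] = (char)c;
    }
    if (c == -1 && n == 0)
        return -1;
    if (n > 0 && line[n - 1] == '\r')
        n--;
    line[n] = '\0';
    return (ssize_t)n;
}
```

Whatever is left between pos and len after the blank line is the part a script would need to see.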
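
The dummy time conversions described above might look something like the following sketch. The function name prime_time_conversions and the format string are assumptions for illustration only; the message does not say which conversions were actually used. The idea is just to call it once in the parent, before listen() and the first fork(), so that whatever the C library loads on first use is already in memory when the children are created.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical warm-up routine: run a GMT and a local-time conversion
 * plus one strftime() that needs day and month names and the zone
 * name. Returns the number of characters written to out (0 on
 * failure); the output itself is thrown away by the caller. */
size_t prime_time_conversions(char *out, size_t outsize)
{
    time_t now = time(NULL);
    struct tm *tm;

    assert(out != NULL);
    if (outsize == 0)
        return 0;

    /* GMT conversion, as used for HTTP date headers. */
    tm = gmtime(&now);
    if (tm == NULL)
        return 0;

    /* Local-time conversion, which is the one that needs the
     * time zone data. */
    tm = localtime(&now);
    if (tm == NULL)
        return 0;

    return strftime(out, outsize, "%A, %d-%b-%y %H:%M:%S %Z", tm);
}
```

How much this saves depends on the C library; on a system where the zone and locale data are cheap to load, it may make little difference.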
Received on Monday, 27 February 1995 10:34:38 UTC