Re: NCSA server performance patch

   From: Russel Kipp Jones <kipp@cc.gatech.edu>

   We had been experiencing considerable delay on our server connections.
   We discovered that the server was doing an initgroups call for
   each fork'd process.  As we use yp, and our group file keeps growing,
   AND yp is single threaded, all of the accesses were getting queued
   up waiting for ypserv.

   We tweaked the code to allow us to only do the initgroups
   call once, and use that information the remaining times.  As the
   uid/gid is always the same, this is sufficient.
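
A minimal sketch of the idea described in the quoted text, not the patch itself: do the expensive initgroups() lookup once in the parent, keep the resulting list, and have each forked child install it with setgroups(), which is a plain system call and does not consult the group database.  All function and variable names below are made up for illustration, and the geteuid() checks assume a server that only switches identity when it was started as root.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>
#include <grp.h>

/* All names here are hypothetical; this is not the code from the patch. */
static gid_t *cached_groups = NULL;
static int cached_ngroups = -1;          /* -1: nothing cached yet */

/* Copy the process's current supplementary groups into the cache. */
static int snapshot_groups(void)
{
    int n = getgroups(0, NULL);          /* size 0: just report the count */
    gid_t *list;

    if (n < 0)
        return -1;
    list = malloc((n > 0 ? n : 1) * sizeof *list);
    if (list == NULL)
        return -1;
    n = getgroups(n, list);
    if (n < 0) {
        free(list);
        return -1;
    }
    free(cached_groups);
    cached_groups = list;
    cached_ngroups = n;
    return n;
}

/* Parent, once at startup: the only place the group database is consulted. */
int cache_groups(const char *user, gid_t gid)
{
    if (geteuid() == 0 && initgroups(user, gid) == -1)
        return -1;
    return snapshot_groups();
}

/* Child, after every fork(): no YP traffic, just one system call. */
int apply_cached_groups(void)
{
    if (cached_ngroups < 0)
        return -1;                       /* cache_groups() was never called */
    if (geteuid() != 0)
        return 0;                        /* not root: cannot change groups */
    return setgroups(cached_ngroups, cached_groups);
}

/* On a config reread: drop the cache so a changed User or group file
   is picked up by the next cache_groups() call. */
void forget_cached_groups(void)
{
    free(cached_groups);
    cached_groups = NULL;
    cached_ngroups = -1;
}
```

The parent would call cache_groups() before it starts accepting connections, and each child would call apply_cached_groups() where it used to call initgroups().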

FWIW, there's more in the way of speed increases where that came from,
and some of them are fairly easy to arrange for.  On each connection,
the NCSA server:

*) Talks to the nameserver --- yet another opportunity for YP to
   serialize you.  I'm not sure how to fix this *portably*, but
   caching the hostnames of recently seen clients in shared memory
   eases things somewhat.  Compiling with -DMINIMAL_DNS keeps the
   server from talking to the nameserver at all, and is a simpler and
   better option for those who can live with it.

*) Tries to open a whole lot of .htaccess files which aren't there.
   People running close to the edge can get around this by turning off
   the .htaccess checks entirely with an AllowOverride None at the
   right spot in access.conf.  This may be a substantial win for those
   afflicted with AFS.  (The checks for symlinks at every directory
   are also a potential source of overhead, though in that case,
   things may be better simply because the directories in question
   actually exist, and so kernel machinery like the namei cache works).

There are a few more obvious speed improvements which are harder to
arrange for (you have to change some of the server code), but the
payoffs, at least for the first one listed below, are substantial.
The server also:

*) Reads the request and MIME header from the client character by
   character, making a system call (a trip into and out of the
   kernel) for each one.  This is a MAJOR performance hit, and easy
   to kludge around, but you do have to change the server code.  It's
   only mildly hard to fix right [1].

*) Opens the locale database to find out the names of the months, and
   opens some other file to find the time zone.  (Actually, the C
   library does this behind httpd's back, but the effect is the same).

   I got rid of this overhead by doing a few dummy time conversions
   before starting to listen on the socket --- this initializes the C
   library time-conversion code in the parent process, and so the
   children don't have to do it themselves after the fork().
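
The dummy time conversions described above could look roughly like this.  It is a sketch rather than the code from my server: the function name and the date format string are just illustrative, and exactly which files the C library opens on first use varies from system to system.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: call once in the parent, before it starts
   listening and forking, so that whatever time zone and month-name
   data the C library loads lazily is already in memory when the
   children inherit the address space. */
size_t warm_up_time_conversions(void)
{
    time_t now = time(NULL);
    struct tm *tm;
    char buf[64];
    size_t len = 0;

    tm = localtime(&now);                /* first use loads time zone data */
    if (tm != NULL)                      /* %A and %b need day/month names */
        len = strftime(buf, sizeof buf, "%A, %d-%b-%y %H:%M:%S", tm);
    (void) gmtime(&now);                 /* exercise the GMT path as well */

    return len;                          /* the formatted text is discarded */
}
```

The return value is only there so the caller can check that the conversion actually ran; nothing needs the formatted string.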

I've fixed most of the above in the server I'm running (all except
.htaccess files, which require some code cleanup to get right), and
that gets you close to the end of the line --- much improvement beyond
that will probably come only by eliminating the fork on every
transaction.  (The overhead of fork() is difficult to measure directly,
but it shows up indirectly in some of my other measurements, and it
seems to be large).

That's only after some cleanups, though --- the standard NCSA server
spends most of its time figuring out what groups "nobody" is in, over
and over and over...

rst

PS --- your patch has a *very* minor bug --- if the server rereads the
       config files, it doesn't change the group info, even though
       User might have changed in httpd.conf, and the appropriate
       groups might have changed in any event.  This is unlikely ever
       to come up in practice, but I'm a little compulsive about these
       things.

[1] If you really want to do this right, at least on SunOS, you can
    recv the header with MSG_PEEK instead of reading it, and then only
    read those bytes out of the kernel buffers which actually contain
    the header, leaving the rest for a CGI script.  This handles POST
    right, as well as GET.  David Robinson came up with this idea and
    has actually coded it up.  Or, you could do as the CERN server does
    --- buffer the client socket as usual, and then pipe the contents
    of the buffer to any script that wants to see them, but that's more
    work starting from the existing NCSA code.
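
A rough sketch of the MSG_PEEK approach from the footnote above.  This is not David Robinson's code, which I am not reproducing here; the function name and the return conventions are made up, and it simply gives up (returns 0) if the whole header is not yet sitting in the kernel buffer, where a real version would wait for more data and try again.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper.  Copies the request line and MIME header,
   including the blank line that ends them, into buf and removes
   exactly those bytes from the socket.  Anything after the header
   (a POST body, say) stays in the kernel for a CGI script to read.
   Returns the header length, 0 if no complete header is visible yet,
   or -1 on error or end of file. */
ssize_t read_header_only(int sock, char *buf, size_t bufsize)
{
    ssize_t peeked, got;
    size_t hdrlen = 0, done = 0, i;

    if (bufsize < 2)
        return -1;

    /* Look at what has arrived without consuming it: one system call. */
    peeked = recv(sock, buf, bufsize - 1, MSG_PEEK);
    if (peeked <= 0)
        return -1;

    /* The header ends at the first blank line: LF LF or LF CR LF. */
    for (i = 0; i + 1 < (size_t) peeked; i++) {
        if (buf[i] != '\n')
            continue;
        if (buf[i + 1] == '\n') {
            hdrlen = i + 2;
            break;
        }
        if (i + 2 < (size_t) peeked
            && buf[i + 1] == '\r' && buf[i + 2] == '\n') {
            hdrlen = i + 3;
            break;
        }
    }
    if (hdrlen == 0)
        return 0;

    /* Now really read, but only the header bytes. */
    while (done < hdrlen) {
        got = recv(sock, buf + done, hdrlen - done, 0);
        if (got <= 0)
            return -1;
        done += (size_t) got;
    }
    buf[hdrlen] = '\0';
    return (ssize_t) hdrlen;
}
```

The header is then parsed out of buf in user space with no further system calls, and the socket can be handed to the script as its standard input untouched.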

Received on Monday, 27 February 1995 10:34:38 UTC