- From: Gerald Oskoboiny <gerald@w3.org>
- Date: Wed, 1 Aug 2007 16:46:04 -0700
- To: olivier Thereaux <ot@w3.org>
- Cc: Ville Skyttä <ville.skytta@iki.fi>, QA-dev <public-qa-dev@w3.org>
* olivier Thereaux <ot@w3.org> [2007-08-01 18:35+0900]
> Hi all,
>
> Just FYI, we have switched mod_perl2 off on all validator servers
> again. Running under mod_perl seemed like a good idea, but there were
> some issues we're having trouble explaining, like how the load
> remained really, really high on the machines (apache2 processes using
> up a lot of CPU) even when very few requests were being received.
>
> Switching off mod_perl means more resource forks, and a slightly
> slower validation process, but ultimately, less load and less wait,
> it seems. Go figure.

mod_perl may have been a red herring; I was blaming it for the massive
apache2 process sizes (200 MB+ on jessica) because I thought that was
something you guys had changed recently (though I don't even know if
that's true) and I had never seen apache processes that big.

Also, 'check' is so expensive that I didn't really expect mod_perl to
be a huge win; the bit of extra work to fire up a perl interpreter must
be relatively cheap. But even after we pruned that and other stuff last
night, the apache2 process sizes are 120 MB and the 'check' processes
are 80-90 MB, so maybe we would be OK with mod_perl after all.

> I still want to try some of the performance tweaks you suggested,
> Ville [1] (avoiding copying content, undef-ing after use, etc).
> Gerald also was suggesting looking at e.g. BSD::Resource or any
> ulimit-like system, to avoid having some "check" processes spin away
> and hog CPU. Worth a shot.

I think resource limits would help a lot (a sketch of one way to set
them is appended below). After our changes last night the validator
servers all seem pretty happy:

  25 requests currently being processed, 103 idle workers -- jessica
  18 requests currently being processed,  12 idle workers -- fugu
  16 requests currently being processed,  10 idle workers -- lovejoy

The biggest problem now seems to be a few URIs that consistently eat up
many minutes of CPU time; currently on jessica there are several
processes that have consumed 20+ CPU minutes each:

    PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
  31664 www-data  25   0 88632  36m  10m R   29  0.9 22:14.48 check
  32073 www-data  25   0 97820  36m  10m R   29  0.9 21:09.62 check
  31862 www-data  25   0  131m  72m  10m R   27  1.8 21:26.18 check
  31732 www-data  25   0 97824  36m  10m R   24  0.9 22:10.81 check

It might be useful if 'check' told us what it was doing by adding lines
like this throughout the code:

  $0 = "check: fetching $uri";
  ...
  $0 = "check: sgml::parsing $uri";
  ...
  $0 = "check: returning results for $uri";

so we can see what each process is up to in the output of 'ps' (a
self-contained sketch of this is also appended below).

In the meantime you can see which URIs are responsible for these
long-running check processes with:

  cat /proc/31664/environ | tr '\000' '\012' | grep QUERY_STRING

(I would paste a few samples here but I think that would violate our
privacy policy.) An all-processes variant of that one-liner is sketched
below as well.

> [1] http://lists.w3.org/Archives/Public/public-qa-dev/2007Jul/0022.html

--
Gerald Oskoboiny <http://www.w3.org/People/Gerald/>
World Wide Web Consortium (W3C) <http://www.w3.org/>
tel:+1-604-906-1232  mailto:gerald@w3.org
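A minimal sketch of the resource-limit idea discussed above, assuming
the BSD::Resource module from CPAN is available to the 'check' CGI; the
300/330-second values are illustrative, not tested settings:

  #!/usr/bin/perl
  use strict;
  use warnings;
  use BSD::Resource;

  # Cap this process at 5 CPU-minutes (soft) / 5.5 (hard). The kernel
  # sends SIGXCPU at the soft limit and kills the process outright at
  # the hard limit, so a runaway parse cannot accumulate 20+ minutes.
  setrlimit(RLIMIT_CPU, 300, 330)
      or die "setrlimit(RLIMIT_CPU) failed: $!";

A similar effect could be had outside the script, e.g. with Apache's
RLimitCPU directive, at the cost of applying the cap to every CGI on
the server rather than just 'check'.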
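For the $0 suggestion, a self-contained sketch; set_phase is a made-up
helper name for illustration, not anything in the validator source:

  #!/usr/bin/perl
  use strict;
  use warnings;

  # Assigning to $0 on Linux rewrites the process title, so the current
  # phase shows up in the COMMAND column of ps/top output.
  sub set_phase {
      my ($phase, $uri) = @_;
      $0 = "check: $phase $uri";
  }

  my $uri = 'http://example.org/';
  set_phase('fetching', $uri);
  # ... fetch the document ...
  set_phase('sgml::parsing', $uri);
  # ... run the parser ...
  set_phase('returning results for', $uri);

With that in place, something like 'ps -C check -o pid,time,args'
should show which phase each long-running process is stuck in.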
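And a sketch of an all-processes variant of the /proc one-liner (my
own, not an existing script); it has to run as root or as www-data,
since /proc/PID/environ is only readable by the process owner:

  #!/usr/bin/perl
  use strict;
  use warnings;

  # Walk /proc, keep processes whose command name is "check", and print
  # each one's QUERY_STRING -- the same thing the cat|tr|grep pipeline
  # does for a single PID.
  for my $dir (glob "/proc/[0-9]*") {
      my ($pid) = $dir =~ m{(\d+)$};

      open my $stat, '<', "$dir/stat" or next;
      my ($name) = <$stat> =~ /\((.*?)\)/;    # field 2: command name
      next unless defined $name and $name eq 'check';

      open my $env, '<', "$dir/environ" or next;
      my $data = do { local $/; <$env> };     # slurp NUL-separated vars
      next unless defined $data;
      for my $var (split /\0/, $data) {
          print "$pid $var\n" if $var =~ /^QUERY_STRING=/;
      }
  }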
Received on Wednesday, 1 August 2007 23:46:08 UTC