- From: Centaur zeus <perseus_medusa@hotmail.com>
- Date: Fri, 25 Jul 2003 08:16:42 +0000
- To: www-validator@w3.org
- Cc: perseus_medusa@hotmail.com
Hi all,
I am using linkChecker v3.6.2.3 and find that it uses a noticeable share of
CPU on the server (about 1.5 to 2.0%). That is not a large percentage by
itself, but I plan to use it for 20 concurrent users, and the load would
multiply accordingly (roughly 30 to 40% if it scales linearly). The document
I am testing makes the checker fetch 61 links.
I profiled the code, and here is the result:
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 12.9   0.130  0.124   1079   0.0001 0.0001  HTTP::Headers::_header
 8.94   0.090  0.128    773   0.0001 0.0002  W3C::CheckLink::start
 8.64   0.087  0.265      8   0.0109 0.0332  HTML::Parser::parse
 6.95   0.070  0.129     10   0.0070 0.0129  LWP::UserAgent::BEGIN
 5.86   0.059  0.388     55   0.0011 0.0071  LWP::Protocol::http::request
 3.97   0.040  0.037    292   0.0001 0.0001  URI::_init
 3.97   0.040  0.091    641   0.0001 0.0001  HTTP::Headers::header
...
I found that it actually parses two documents: one is the document I
requested, and the other is the document behind the HTML link. So I edited
the code and changed
if (being_processed)
to
if (0)
to skip that code.
After that change, I get the following results:
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 8.35   0.088  0.405     55   0.0016 0.0074  LWP::Protocol::http::request
 5.70   0.060  0.055   1076   0.0001 0.0001  HTTP::Headers::_header
 4.75   0.050  0.049    241   0.0002 0.0002  URI::implementor
 4.75   0.050  0.119     10   0.0050 0.0119  LWP::UserAgent::BEGIN
 4.37   0.046  0.112      7   0.0066 0.0161  HTML::Parser::parse
 3.80   0.040  0.092    798   0.0001 0.0001  HTTP::Message::AUTOLOAD
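As a rough way to compare the two runs, something like the following Perl
sketch could be used. It assumes the rows use the usual dprofpp column order
(%Time, ExclSec, CumulS, #Calls, sec/call, Csec/c, Name) and includes only the
subroutines that appear in both listings above. parse_profile and
compare_profiles are hypothetical helper names, not part of the link checker.

```perl
use strict;
use warnings;

# Parse profiler rows, assumed to be in dprofpp column order:
# %Time ExclSec CumulS #Calls sec/call Csec/c Name
sub parse_profile {
    my ($text) = @_;
    my %rows;
    for my $line (split /\n/, $text) {
        my @f = split ' ', $line;
        next unless @f == 7;    # skip anything that is not a full row
        my ($pct, $excl, $cumul, $calls, $spc, $cspc, $name) = @f;
        $rows{$name} = { excl => $excl, cumul => $cumul, calls => $calls };
    }
    return \%rows;
}

# For every subroutine present in both runs, compute after minus before.
sub compare_profiles {
    my ($before, $after) = @_;
    my %delta;
    for my $name (grep { exists $after->{$_} } keys %$before) {
        $delta{$name} = {
            excl  => $after->{$name}{excl}  - $before->{$name}{excl},
            calls => $after->{$name}{calls} - $before->{$name}{calls},
        };
    }
    return \%delta;
}

# Rows copied from the first run (unmodified code).
my $before = parse_profile(<<'END');
12.9 0.130 0.124 1079 0.0001 0.0001 HTTP::Headers::_header
8.64 0.087 0.265 8 0.0109 0.0332 HTML::Parser::parse
6.95 0.070 0.129 10 0.0070 0.0129 LWP::UserAgent::BEGIN
5.86 0.059 0.388 55 0.0011 0.0071 LWP::Protocol::http::request
END

# Rows copied from the second run (with the block skipped).
my $after = parse_profile(<<'END');
8.35 0.088 0.405 55 0.0016 0.0074 LWP::Protocol::http::request
5.70 0.060 0.055 1076 0.0001 0.0001 HTTP::Headers::_header
4.75 0.050 0.119 10 0.0050 0.0119 LWP::UserAgent::BEGIN
4.37 0.046 0.112 7 0.0066 0.0161 HTML::Parser::parse
END

my $delta = compare_profiles($before, $after);
for my $name (sort keys %$delta) {
    printf "%-30s calls %+5d  excl %+.3f s\n",
        $name, $delta->{$name}{calls}, $delta->{$name}{excl};
}
```

For these numbers, HTML::Parser::parse is called one time fewer (8 to 7) and
HTTP::Headers::_header three times fewer (1079 to 1076), while
LWP::Protocol::http::request still runs 55 times in both cases.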
So I want to ask:
1) Why is the HTML link parsed again?
2) Is it appropriate to change if (being_processed) to if (0), and what is
the impact?
3) How can I minimize the resources used by the LWP and HTTP packages?
Thanks.
Perseus
Received on Friday, 25 July 2003 04:38:13 UTC