- From: olivier Thereaux <ot@w3.org>
- Date: Mon, 27 Jun 2005 13:30:12 +0900
- To: Dominique Hazaël-Massieux <dom@w3.org>
- Cc: public-qa-dev@w3.org
Hi Dom, qa-dev,
On 22 Jun 2005, at 22:56, Dominique Hazaël-Massieux wrote:
> I've had a quick look at the linkchecker to see what would be needed
> to make it multithreaded; I see the linkchecker is using LWP::RobotUA.
> Has any thought been put into using LWP::Parallel::RobotUA [1] instead?
This is a really good idea, thanks! As Ville said, similar ideas have
been thrown around with the same goal, but our best bet so far had been
to run several RobotUA instances in parallel, which would have been
problematic in many ways.
This looks promising, as it would certainly remove some of the
implementation concerns. Instead of having to track everything
ourselves, it seems that LWP::Parallel::RobotUA can be given new
documents to process at any time (by 'registering' new requests); you
then wait for a while and fetch the results.
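For instance, something along these lines (an untested sketch, going
only by the module's documentation; the agent name, URIs and timeout
are just placeholders):

  use LWP::Parallel::RobotUA;
  use HTTP::Request;

  # placeholder identity and URI list, just for illustration
  my $ua = LWP::Parallel::RobotUA->new('W3C-checklink-test', 'ot@w3.org');
  my @uris = ('http://www.w3.org/', 'http://www.example.org/');

  # register as many requests as we want, whenever we want
  $ua->register(HTTP::Request->new(GET => $_)) for @uris;

  # ... then wait (with a timeout) and collect the results
  my $entries = $ua->wait(30);
  for my $entry (values %$entries) {
      my $res = $entry->response;
      printf "%s: %s\n", $res->request->uri, $res->status_line;
  }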
In particular I like these two options:
  $ua->max_hosts ( $max )
      Changes the maximum number of locations accessed in parallel.
      The default value is 7.

  $ua->max_req ( $max )
      Changes the maximum number of requests issued per host in
      parallel. The default value is 5.
I think this means we could greatly improve the speed of the link
checker by setting the latter to 1, and the former to... something
reasonably high.
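In other words, something like this (again, just a guess at reasonable
values):

  $ua->max_req(1);     # at most one request per host at a time
  $ua->max_hosts(20);  # ... but talk to up to 20 hosts in parallel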
Definitely worth playing with.
--
olivier