- From: Ville Skyttä <ville.skytta@iki.fi>
- Date: Mon, 03 Apr 2006 21:42:05 +0300
- To: QA-dev <public-qa-dev@w3.org>
On Mon, 2006-04-03 at 14:55 +0900, olivier Thereaux wrote:
> On 3 Apr 2006, at 04:13, Ville Skyttä wrote:
> > but also because it does
> > not actually sleep between requests to a host but does something weird
> > instead
>
> ouch, you mean ParallelUserAgent does that? or is it something that
> the current linkchecker code does wrong in this regard?
The former, around line 286 in LWP::Parallel::RobotUA:

    if ($self->{'use_sleep'}) {
      # well, we don't really use sleep, but lets emulate
      # the standard LWP behavior as closely as possible...
It does manage to wait between requests some other way, though. But a
quick look at CPU usage suggests it busy-loops somewhere while waiting.
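For comparison, here's roughly what the plain non-parallel LWP::RobotUA
does when use_sleep is enabled: it simply blocks in sleep() between
requests to the same host, which is exactly what an event-driven
parallel agent can't afford to do. An untested sketch (the agent name
is made up):

    #!/usr/bin/perl
    use strict;
    use warnings;
    use LWP::RobotUA;

    # The serial RobotUA blocks the whole process to honor the per-host
    # delay; fine for a one-request-at-a-time checker, useless when many
    # requests are in flight.
    my $ua = LWP::RobotUA->new(
        agent => 'checklink-sketch/0.1',   # made-up agent name
        from  => 'ville.skytta@iki.fi',
    );
    $ua->delay(1/60);    # delay() takes minutes; this is one second
    $ua->use_sleep(1);   # sleep() between requests instead of 503 + Retry-After

    for my $url ('http://example.org/a', 'http://example.org/b') {
        my $res = $ua->get($url);   # the second call sleeps ~1s first
        print "$url: ", $res->status_line, "\n";
    }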
> If a browser-based widget (either ajax or proprietary browser plugin)
> were to do link checking today, I don't really expect that there
> would be protests to get them to follow robots.txt. Avoid slamming
> remote servers, probably, but respect Disallow: etc., probably not.
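For reference, honoring Disallow: is cheap to do on the client side. A
rough sketch using WWW::RobotRules (which ships with libwww-perl); the
URLs and agent name below are made up:

    use strict;
    use warnings;
    use LWP::Simple qw(get);
    use WWW::RobotRules;

    my $rules = WWW::RobotRules->new('LinkChecker-sketch/0.1');
    my $robots_url = 'http://example.org/robots.txt';
    my $robots_txt = get($robots_url);
    $rules->parse($robots_url, $robots_txt) if defined $robots_txt;

    # Consult the parsed rules before queuing each request.
    my $url = 'http://example.org/private/page.html';
    print $rules->allowed($url)
        ? "check $url\n"
        : "skip $url (Disallowed)\n";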
> The more I think of this, the more I look at your "hack" [1] with
> interest, and think it could/should be the replacement for the web-
> based link checker (while still distributing the old one as a perl
> module / command-line tool).
I've made some tiny local improvements to that hack in the meantime;
I'll put a new version online on qa-dev soon.