- From: Ville Skyttä <ville.skytta@iki.fi>
- Date: Mon, 03 Apr 2006 21:42:05 +0300
- To: QA-dev <public-qa-dev@w3.org>
On Mon, 2006-04-03 at 14:55 +0900, olivier Thereaux wrote:
> On 3 Apr 2006, at 04:13, Ville Skyttä wrote:
> > but also because it does
> > not actually sleep between requests to a host but does something weird
> > instead
>
> ouch, you mean ParallelUserAgent does that? or is it something that
> the current linkchecker code does wrong in this regard?

The former, around line 286 in LWP::Parallel::RobotUA:

    if ($self->{'use_sleep'}) {
      # well, we don't really use sleep, but lets emulate
      # the standard LWP behavior as closely as possible...

It does manage to wait between requests some other way, though.  But
quickly observing the CPU usage, it looks like a busy loop somewhere
(for contrast, a sketch of the plain sleep-based behaviour follows at
the end of this message).

> If a browser-based widget (either ajax or proprietary browser plugin)
> were to do link checking today, I don't really expect that there
> would be protests to get them to follow robots.txt. Avoid slamming
> remote servers, probably, but respect Disallow: etc., probably not.
>
> The more I think of this, the more I look at your "ack" [1] with
> interest, and think it could/should be the replacement for the web-
> based link checker (while still distributing the older as perl module/
> command-line tool).

I've made some tiny local improvements to that hack in the meantime;
will put a new version online to qa-dev soon.
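For comparison, here's a minimal sketch of what the plain (non-parallel)
LWP::RobotUA does: with use_sleep enabled it blocks in a real sleep()
between requests to the same host instead of spinning.  The agent
string, contact address, and URL below are illustrative values only,
not the link checker's actual configuration.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use LWP::RobotUA;

    # Illustrative values only; not checklink's real setup.
    my $ua = LWP::RobotUA->new(
        agent => 'checklink-sketch/0.1',
        from  => 'webmaster@example.org',
    );
    $ua->delay(1);      # minimum delay between requests to one host, in minutes
    $ua->use_sleep(1);  # really sleep() instead of returning a 503 with Retry-After

    # With use_sleep set, RobotUA sits idle in sleep() until the per-host
    # delay has elapsed -- no CPU is spent spinning while it waits.
    my $res = $ua->get('http://example.org/');
    print $res->status_line, "\n";

Presumably the parallel UA can't block the whole process like this while
other requests are in flight, which would explain the "emulate" comment
quoted above; the problem is just that the emulation appears to spin.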
Received on Monday, 3 April 2006 18:42:09 UTC