Re: [ANN] W3C Link Checker - version 3.9.2 - first standalone release

On Sun, 2004-04-11 at 21:12, Matthew Wilson wrote:
> > Does the Link Checker obey robots.txt?
> > 
> > 
> >    "...the W3C has a link checker. It announces itself as
> >    W3C-checklink/3.9.2 [3.17] libwww-perl/5.64. It does not respect
> >    robots.txt"
> 
> It seems to me that changing LWP::UserAgent to LWP::RobotUA in 
> 'checklink' should be enough to fix this.

Sort of, yes.  But that change also has a bunch of unwelcome side
effects on, for example, the link checker's redirect tracking logic and
the results UI.  There are also quite a few bugs in the LWP::RobotUA and
WWW::RobotRules code; most of them have been fixed in libwww-perl 5.77
(by Liam Quinn), and some further fixes (by yours truly and Gisle Aas)
are still in upstream CVS, pending the next release.
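
To illustrate what "obeying robots.txt" amounts to for a checker like
this, here is a minimal sketch.  It is written in Python with the
standard library's urllib.robotparser, not in Perl, and it is not
checklink's code or the LWP::RobotUA implementation.  The robots.txt
content, the example.org URLs, and the helper names build_rules and
may_fetch are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body; a real checker would fetch it from the
# site being checked before requesting any other URL there.
ROBOTS_TXT = """\
User-agent: W3C-checklink
Disallow: /private/

User-agent: *
Disallow: /
"""


def build_rules(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a rule set (no network access)."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules


def may_fetch(rules: RobotFileParser, agent: str, url: str) -> bool:
    """Return True if the rules allow this user agent to request url."""
    return rules.can_fetch(agent, url)


rules = build_rules(ROBOTS_TXT)
agent = "W3C-checklink/3.9.2"  # product token taken from the quoted UA string

for url in ("http://example.org/public/page.html",
            "http://example.org/private/page.html"):
    verdict = "allowed" if may_fetch(rules, agent, url) else "disallowed"
    print(url, verdict)
```

A robot-aware user agent does roughly this check before every request,
which is part of why swapping it in affects code that expects each
request to actually reach the server.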

Anyway, the "robotization" of the link checker is in progress and looks
pretty good at the moment; the next version will include this stuff. 
Stay tuned...

Received on Sunday, 11 April 2004 15:10:04 UTC