Re: [checklink] How to override robots.txt?

On Thursday 19 July 2007, Olivier Thereaux wrote:
> Hi Ville, Hi Stephan.
>
> On Wed, Jul 18, 2007, Ville Skyttä wrote:
> > On Tuesday 17 July 2007, Stephan Windmüller wrote:
> > > Is it possible to override or disable the robots.txt-feature?
> >
> > Not from the link checker side without modifying the code.  Perhaps the
> > target site administrators would be willing to allow the link checker to
> > crawl it?
>
> What do you think about the idea, though?
>
> Since -q is about reporting errors only:
> -q, --quiet  No output if no errors are found (implies -s).
> and a link not checked because of a robots.txt directive is not per se
> an error (just informing the user that the link was not checked),
> shouldn't we modify checklink there?

Good point; some changes are needed in either the documentation or the code.
I have no particularly strong opinion, but changing the code to match the
docs would probably make more sense in this case (even though it is surely
much more painful than changing the docs ;)).

By the way, besides the robots.txt case, -q currently also produces output
when the only findings are 302 -> 200 redirects.  There may be more cases
that need similar treatment if we want to go that way.
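
To make the idea concrete, here is a rough sketch of the decision rule being
discussed: in quiet mode, print only results that count as real errors, and
treat robots.txt exclusions and redirects that end in 200 as informational.
It is written in Python for brevity and is not taken from checklink; the
names LinkResult, is_error and quiet_report are hypothetical, and the rule
that a final status of 400 or above is an error is an assumption made for
illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LinkResult:
    """Hypothetical record of one checked link (not checklink's own type)."""
    url: str
    # HTTP status codes seen while following the link, e.g. [302, 200].
    status_chain: List[int] = field(default_factory=list)
    # True if robots.txt forbade the check, so no request was made.
    robots_excluded: bool = False


def is_error(result: LinkResult) -> bool:
    """Return True only for results treated as real errors in this sketch."""
    if result.robots_excluded:
        return False  # informational: the link was simply not checked
    if not result.status_chain:
        return True   # no response at all
    # Assumption: only the final status matters, so 302 -> 200 is fine.
    return result.status_chain[-1] >= 400


def quiet_report(results: List[LinkResult]) -> List[str]:
    """Lines to print in quiet mode: errors only, nothing otherwise."""
    lines = []
    for r in results:
        if is_error(r):
            final = r.status_chain[-1] if r.status_chain else "no response"
            lines.append(f"{r.url}: {final}")
    return lines
```

Under this rule, a run whose only findings are a robots.txt exclusion and a
302 -> 200 redirect yields an empty report, which is what the -q
documentation promises.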

Received on Thursday, 19 July 2007 16:58:02 UTC