
Re: checklink: 3.9.3 beta, feedback requests

From: Ville Skyttä <ville.skytta@iki.fi>
Date: Wed, 09 Jun 2004 01:25:33 +0300
To: QA Dev <public-qa-dev@w3.org>
Message-Id: <1086733532.29458.156.camel@bobcat.mine.nu>

On Wed, 2004-06-09 at 01:06, Bjoern Hoehrmann wrote:
> * Ville Skyttä wrote:
> >Regarding the beta announcement, I would like to request feedback in
> >particular of the following (feel free to rephrase, comment, and add
> >items):
> >
> >- Robots exclusion standard support in the link checker.  Good or bad?
> 
> It would be good to announce in what way it supports it; as discussed,
> there are apparently several approaches: for example, does the checker
> read the submitted document even if that server's robots.txt disallows
> it, or does it refuse to check any link in that case (downloading
> nothing but robots.txt)? Without such information, one would have to
> figure this out by testing or by reading the code, which is unlikely
> to yield much feedback.

The current behaviour is the "blunt" one, i.e. if /robots.txt disallows
access, nothing except /robots.txt itself is fetched.  Oh, and the
version of the exclusion standard supported is the "original 1994 one"
(for now, as that's what the current LWP::RobotUA supports), and
<meta name="robots"> is not supported at all in this version.
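
For reference, here's a minimal sketch of that "blunt" behaviour as
LWP::RobotUA provides it (the robot name, contact address, and URL
below are made up for illustration):

  use LWP::RobotUA;

  # LWP::RobotUA consults the server's /robots.txt (original 1994
  # rules, via WWW::RobotRules) before each request it makes.
  my $ua = LWP::RobotUA->new('W3C-checklink/3.9.3', 'webmaster@example.org');
  $ua->delay(1/60);  # delay() takes minutes; wait 1 second between requests

  # If robots.txt disallows the URL, nothing besides /robots.txt is
  # fetched; get() returns a synthetic 403 "Forbidden by robots.txt".
  my $res = $ua->get('http://www.example.org/some/page.html');
  print $res->status_line, "\n";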

This info should be included in the docs, BTW, not (only) in the beta
announcement.  I'll take a look at improving the documentation tomorrow
unless someone beats me to it... ;)
