
Re: checklink:

From: Jukka K. Korpela <jkorpela@cs.tut.fi>
Date: Sun, 25 Aug 2013 21:27:30 +0300
Message-ID: <521A4C92.7090907@cs.tut.fi>
To: "Andresen, Nancy - DOT" <nancy.andresen@dot.wi.gov>
CC: "'www-validator@w3.org'" <www-validator@w3.org>

2013-08-22 21:50, Andresen, Nancy - DOT wrote:

> I used to use your service all the time, but today I get a “forbidden by
> Robots” error for every page I try.  For example:
> http://www.dot.wisconsin.gov/news/law/index.htm
> Can you tell me what has changed?

The exact message is "Error: 403 Forbidden by robots.txt". The reason is 
that the robots.txt resource on your server,
http://www.dot.wisconsin.gov/robots.txt
disallows all robots, and the Link Checker regards itself as a robot and 
honors the Robots Exclusion Standard.

See http://validator.w3.org/docs/checklink.html#bot
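
To illustrate how a robots.txt that excludes everything leads to this
refusal, here is a minimal sketch using Python's standard
urllib.robotparser module. The file contents are assumptions: BLOCK_ALL
is the usual "disallow all robots" form, not a copy of the actual file on
the server, and ALLOW_CHECKLINK is a hypothetical variant that admits a
robot identifying itself as "W3C-checklink" (the user-agent name is also
an assumption here; check the documentation linked above) while still
excluding others. The helper is_allowed is made up for this example and
is not how the Link Checker itself is implemented.

```python
from urllib.robotparser import RobotFileParser

# Assumed contents: the common "exclude every robot" form.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

# Hypothetical alternative: an empty Disallow for one named robot
# permits it everywhere, while all other robots stay excluded.
ALLOW_CHECKLINK = """\
User-agent: W3C-checklink
Disallow:

User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


PAGE = "http://www.dot.wisconsin.gov/news/law/index.htm"

if __name__ == "__main__":
    print(is_allowed(BLOCK_ALL, "W3C-checklink", PAGE))
    print(is_allowed(ALLOW_CHECKLINK, "W3C-checklink", PAGE))
    print(is_allowed(ALLOW_CHECKLINK, "SomeOtherBot", PAGE))
```

With the first file no robot may fetch the page, so a robot that honors
the standard has to refuse. With the second, only the named robot gets
through.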

Note that the robots.txt file also affects Google. For example, if you
search Google for the page URL
http://www.dot.wisconsin.gov/news/law/index.htm
the top search result will have no description to show; instead it has 
the explanation "A description for this result is not available because 
of this site's robots.txt".

So consider contacting your site admin and asking them to reconsider
whether excluding all robots is useful.

Yucca
Received on Sunday, 25 August 2013 18:27:59 UTC
