
Suggestion/Enhancement request: Please add to FAQ: list of User-agents for robots.txt files

From: AARF Communications <groundlizard-aarfcommunications@yahoo.com>
Date: Sat, 29 May 2010 15:57:06 -0400
To: <www-validator@w3.org>
Message-ID: <C826E9D2.2D6C2%groundlizard-aarfcommunications@yahoo.com>

I was using the link checker on my site, and it failed because I had fiddled
with my robots.txt file so that it whitelists approved user agents and denies
all others. The error was:

Forbidden by robots.txt

This makes sense (and proves my robots.txt file works, thanks), so I started
looking around, and it appears there are multiple W3C user agents out there.

For now, I will change my file to allow all bots, and I can use some trial
and error to see which User-agent name the link checker uses.
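
(For what it's worth, the kind of entry I have in mind looks like the sketch
below. I am assuming the link checker sends "W3C-checklink" as its User-agent
token -- that is my best guess, and exactly the sort of detail the FAQ should
confirm.)

    # Whitelist sketch: allow the (assumed) link checker token...
    User-agent: W3C-checklink
    Disallow:

    # ...and deny everyone else
    User-agent: *
    Disallow: /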

My suggestion, then, is for you to add a FAQ entry about this issue -- and in
that entry -- _solve_ the user's problem by explicitly listing the User-agent:
names that people should put in their robots.txt files.
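
(Candidate names can also be tested locally against a robots.txt file, which
cuts down the trial and error of re-running the checker each time. Here is a
minimal sketch using Python's standard urllib.robotparser; the candidate
tokens below are my guesses, not a confirmed list:

    from urllib.robotparser import RobotFileParser

    # Candidate W3C User-agent tokens -- guesses, not a confirmed list
    candidates = ["W3C-checklink", "W3C_Validator", "W3C_CSS_Validator"]

    rp = RobotFileParser()
    rp.set_url("http://example.com/robots.txt")  # replace with your own site
    rp.read()

    # can_fetch() reports whether each token may crawl the given URL
    for agent in candidates:
        print(agent, "allowed:", rp.can_fetch(agent, "http://example.com/"))

Replace example.com with your own site, of course.)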

Thanks,

Mark

(That is, please don't create a FAQ that leads to an answer like the one
below. It is, as people might say, somewhat "content limited". The person
answering (I removed the names so as not to call anyone out) seems to have
known the answer, since he quoted the W3C Validator; he could well have given
poor FIRSTNAME the answer.)


Hi FIRSTNAME,

> Message du xx/yy/10 15:15
> De : "LASTNAME, FIRSTNAME"
> 
> This was working fine before, but all of a sudden I keep getting this error
> 403 robots.txt. Please advise how I can get this to work again?


Your robots.txt file is probably forbidding access to robots (including the
W3C Validator). You may check this site http://www.robotstxt.org/ to learn
how to modify robots.txt to avoid this problem.

ANSWERINGPERSON
Received on Monday, 31 May 2010 13:02:21 GMT
