Re: robots.txt? off-topic

From: Shantz <michaelshantz@attbi.com>
Date: Thu, 24 Oct 2002 13:51:31 -0400 (EDT)
Message-ID: <006501c27b86$92449f00$0501a8c0@Lindow>
To: "Jigsaw List (E-mail)" <www-jigsaw@w3.org>




Thanks to all for explaining this to me.
mike

----- Original Message ----- 
From: "Mudry Julien" <julien.mudry@elca.ch>
To: "'Shantz'" <michaelshantz@attbi.com>
Cc: "Jigsaw List (E-mail)" <www-jigsaw@w3.org>
Sent: Thursday, October 24, 2002 2:10 AM
Subject: RE: robots.txt? off-topic


> Hello
> 
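> The robots.txt file lets a webmaster ask web crawlers not to
> visit certain pages or directories. It follows a convention
> known as the "Standard for Robot Exclusion". Well-behaved
> crawlers request /robots.txt before fetching anything else,
> which is why the GET shows up in your log even though you have
> no such file. You can get more information here:
> http://www.robotstxt.org/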
> 
> Specifically, to answer your question:
> http://www.robotstxt.org/wc/faq.html#log
> 
> Regards,
> 
> Julien
> 
> > -----Original Message-----
> > From: Shantz [mailto:michaelshantz@attbi.com]
> > Sent: Thursday, October 24, 2002 10:57 AM
> > To: www-jigsaw@w3.org
> > Subject: robots.txt? off-topic
> > 
> > I've been using jigsaw to serve a webpage for a while.
> > When looking at the log, I often see what appear to be web crawlers
> > doing a GET on robots.txt.  I have never had such a file.  Does anyone
> > know what this is about?
> > 
> > Mike
> > 
> >
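
As a rough illustration of the exclusion mechanism described in the reply above, the sketch below feeds a made-up robots.txt to Python's standard urllib.robotparser module and asks whether a crawler may fetch two URLs. The file contents, the "ExampleBot" user agent, the example.com URLs, and the build_parser helper are all assumptions chosen for the example; none of them come from the thread.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; the paths are illustrative only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /tmp/
"""


def build_parser(text: str) -> RobotFileParser:
    """Return a parser loaded with the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A page outside the disallowed directories may be fetched.
print(parser.can_fetch("ExampleBot", "http://www.example.com/index.html"))

# A page under /private/ is excluded for every user agent.
print(parser.can_fetch("ExampleBot", "http://www.example.com/private/notes.html"))
```

A robots.txt with no rules places no restrictions on crawlers, so a site that simply wants the file to exist can serve an empty one.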
Received on Thursday, 24 October 2002 13:53:21 GMT
