
Re: New response code

From: Mirsad Todorovac <tm@rasips2.rasip.etf.hr>
Date: Mon, 19 Feb 1996 09:43:14 +0100 (MET)
Message-Id: <199602190843.JAA28662@rasips2.rasip.etf.hr>
To: "Mordechai T. Abzug" <mabzug1@gl.umbc.edu>
Cc: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com

Some recent experiences made me remember this proposal.  At the time it
came out, the subject seemed far too distant to talk about, and there was no
response on the mailing list.


Mordechai T. Abzug wrote:

> It would be really nice if there were a response code (say, 405) for
> "robot forbidden that URL."  Technically, "forbidden" is already covered
> through 403, but it would still be nice to have something more
> descriptive.


The actual number would have to be at least 413, having in mind the HTTP/1.1
draft and its "412 Gone" response code (but this is not the issue).


There *are* lots of documents, generated by CGI scripts or of other origin,
that are not appropriate for indexing by WWW robots, or that we may simply not
want indexed.  For example, you write a CGI script to list the files in a large
database (in my experience, an RFC repository), and then a robot picks up every
document from the output of that script and indexes it, which was not what was
meant to happen.


This leads to the following question:

*Shouldn't there be a way to specify which URLs we want to be indexed,
and which we do not want to be indexed?*


Question(s):

How can the server know whether it is being accessed by a browser or
by a robot?  It could analyze the *User-agent:* field in the request header,
but won't there be new robots which did not exist when the server was
configured?
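
As a rough sketch of what such User-agent analysis might look like (the robot
names, the script itself, and the fallback to 403 are purely my own
illustration, not part of any specification), a CGI script could do something
like the following; the hard-coded list is exactly what goes stale when new
robots appear:

    #!/usr/bin/env python
    # Illustrative sketch only: guess "robot or browser" from the User-agent
    # header and refuse indexable output to known robots.  The substrings
    # below are made up; keeping such a list current is the problem noted above.
    import os
    import sys

    KNOWN_ROBOTS = ("lycos", "webcrawler", "scooter", "spider")  # illustrative

    agent = os.environ.get("HTTP_USER_AGENT", "").lower()

    if any(name in agent for name in KNOWN_ROBOTS):
        # "Status:" is how a CGI script asks the server to send a given code;
        # 403 is used here only because no dedicated robot code exists yet.
        sys.stdout.write("Status: 403 Forbidden\r\n")
        sys.stdout.write("Content-Type: text/plain\r\n\r\n")
        sys.stdout.write("This document is not intended for robot indexing.\n")
    else:
        sys.stdout.write("Content-Type: text/html\r\n\r\n")
        sys.stdout.write("<html><body>(normal listing here)</body></html>\n")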


-- 
Mirsad Todorovac
Faculty of Electrical Engineering and Computing, University of Zagreb, Croatia
mirsad.todorovac@fer.hr
Received on Monday, 19 February 1996 00:54:57 EST
