<italic>Some recent experiences reminded me of this proposal. When it came out, the subject seemed far too distant to discuss, and there was no response on the mailing list.</italic>

Mordecai T. Abzug wrote:
> It would be really nice if there were a response code (say, 405) for
> "robot forbidden that URL." Technically, "forbidden" is already covered
> through 403, but it would still be nice to have something more
> descriptive.

<smaller>The actual number would have to be at least 413, bearing in mind the HTTP/1.1 draft and its "412 Gone" response code (but this is not the issue).</smaller>

There <bold>are</bold> lots of documents, generated by CGI scripts or of other origin, that are not appropriate for indexing by WWW robots. Or we may simply not want them indexed. For example, you write a CGI script to list files in a large database (in my experience, an RFC repository), and then a robot picks up every document from the output of that script and indexes it, which was not what was meant to happen.

This leads to the following: <bold>Shouldn't there be a way to specify which URLs we want to be indexed, and which we do not?</bold>

Question(s): how can the server know whether it is being accessed by a browser or by a robot? It could analyze the <bold>User-agent:</bold> field in the header, but won't there be new robots that did not exist when the server was configured?

--
Mirsad Todorovac
Faculty of Electrical Engineering and Computing,
University of Zagreb, Croatia
mirsad.todorovac@fer.hr

Received on Monday, 19 February 1996 00:54:57 EST
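
The limitation raised in the closing question can be made concrete with a small sketch. The Python below is illustrative only: the token list, the function names, and the `/cgi-bin/` prefix are assumptions made for the example, not part of the proposal. It returns 403 because that is the existing code mentioned in the quoted message.

```python
# Hypothetical robot name fragments known at the time the server was configured.
KNOWN_ROBOT_TOKENS = ("examplebot", "samplecrawler")


def looks_like_robot(user_agent, known_tokens=KNOWN_ROBOT_TOKENS):
    """Return True if the User-agent value contains a known robot token."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in known_tokens)


def status_for(path, user_agent, no_index_prefixes=("/cgi-bin/",)):
    """Return 403 for known robots requesting a no-index path, else 200."""
    if looks_like_robot(user_agent) and path.startswith(no_index_prefixes):
        return 403
    return 200
```

A robot that appears after the list was written, say a hypothetical `NewRobot/0.1`, is treated like a browser and gets 200 for `/cgi-bin/rfc-list`. That is exactly the gap the question points at: a check based on a fixed list of names cannot recognize robots it has never heard of.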