- From: Hans Ulrich Niedermann <ulrich@niedermann.bb.bawue.de>
- Date: Wed, 8 Mar 2000 13:29:15 -0500 (EST)
- To: www-validator@w3.org
Hi validators,

I just noticed a strange coincidence in the logs of my web server: during periods with lots of search robot hits, validator.w3.org also wanted to validate my pages.

THE PROBLEM
===========

All the pages the web robots were visiting contain a link like
<http://validator.w3.org/check?uri=http://www.bawue.de/~uli/;weblint;pw>
(you could call this a "validating link"). So I suspect the search robots visited that URL (repeatedly!). This behaviour does not conflict with the robots.txt file on validator.w3.org:

-----8<------------------------------------------------------------
#
# robots.txt for validator.w3.org
#
# $Id: robots.txt,v 1.2 1998/07/24 22:11:35 gerald Exp $
#
# User-Agent: *
# Disallow:
-----8<------------------------------------------------------------

SOLUTION PROPOSAL
=================

I think using

    User-Agent: *
    Disallow: /check

would have the following advantages:

1. for validator.w3.org: less system load
2. for sites with "validating links": less system load and more accurate access counters
3. for the robots: they won't index pages nobody wants to find through a search engine

I can't think of any disadvantages for any party. Critical annotations and replies are welcome.

Regards,

Uli
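
A minimal sketch, assuming Python's standard urllib.robotparser (an illustration added here, not part of the original message), of how a robot that honours the proposed rule would behave: the /check URLs become off-limits while the rest of the site stays crawlable. Because robots.txt rules are prefix matches, "Disallow: /check" covers /check itself as well as every /check?uri=... request.

-----8<------------------------------------------------------------
# Illustrative sketch only: how a standards-compliant robot would
# interpret the robots.txt rule proposed above for validator.w3.org.
from urllib import robotparser

proposed = [
    "User-Agent: *",
    "Disallow: /check",
]

rp = robotparser.RobotFileParser()
rp.parse(proposed)

# A "validating link" of the kind embedded in the affected pages.
check_url = ("http://validator.w3.org/check"
             "?uri=http://www.bawue.de/~uli/;weblint;pw")

print(rp.can_fetch("*", check_url))                   # False: robots skip the validator
print(rp.can_fetch("*", "http://validator.w3.org/"))  # True: the rest stays indexable
-----8<------------------------------------------------------------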
Received on Wednesday, 8 March 2000 16:37:23 UTC