- From: Antoine Zimmermann <antoine.zimmermann@gmail.com>
- Date: Thu, 23 Jun 2011 08:27:42 +0200
- To: Richard Cyganiak <richard@cyganiak.de>
- CC: Martin Hepp <martin.hepp@ebusiness-unibw.org>, public-lod@w3.org
On 22/06/2011 23:49, Richard Cyganiak wrote:
> On 21 Jun 2011, at 10:44, Martin Hepp wrote:
>> PS: I will not release the IP ranges from which the trouble
>> originated, but rest assured, there were top research institutions
>> among them.
>
> The right answer is: name and shame. That is the way to teach them.
>
> Like Karl said, we should collect information about abusive crawlers
> so that site operators can defend themselves. It won't be *that* hard
> to research and collect the IP ranges of offending universities.
>
> I started a list here: http://www.w3.org/wiki/Bad_Crawlers

What's the use of this list? Assume it stays empty, as you hope. What's the use? Assume it gets filled with names: so what? It does not prove these crawlers are bad. The authors of the crawlers can just remove themselves from the list. If a crawler is on the list, chances are that nobody would notice anyway, especially not the kind of people that Martin is defending in his email. If a crawler is put on the list because it is bad and measures are taken, what happens when the crawler gets fixed and becomes polite? And what if measures are taken while the crawler was not bad at all to start with?

Surely, this list is utterly useless. Maybe you can keep the page to describe the problems that bad crawlers create and the measures that publishers can take to overcome problematic situations.

AZ

>
> The list is currently empty. I hope it stays that way.
>
> Thank you all, Richard
Received on Thursday, 23 June 2011 06:28:14 UTC