Re: Think before you write Semantic Web crawlers

From: Alexandre Passant <alexandre.passant@deri.org>
Date: Wed, 22 Jun 2011 23:11:57 +0100
Cc: Martin Hepp <martin.hepp@ebusiness-unibw.org>, public-lod@w3.org
Message-Id: <B98450B5-053C-476A-AA67-7ACC3F5E3885@deri.org>
To: Richard Cyganiak <richard@cyganiak.de>

On 22 Jun 2011, at 22:49, Richard Cyganiak wrote:

> On 21 Jun 2011, at 10:44, Martin Hepp wrote:
>> PS: I will not release the IP ranges from which the trouble originated, but rest assured, there were top research institutions among them.
> 
> The right answer is: name and shame. That is the way to teach them.

You may have found the right word: teach.
We (as academics) have given tutorials on how to publish and consume LOD, with a lot about best practices for publishing but not much about consuming.
Why not simply come up with reasonable guidelines for this, which should also be taught in the institutes and universities where people use LOD, and in tutorials given at various conferences?

m2c

Alex.

> 
> Like Karl said, we should collect information about abusive crawlers so that site operators can defend themselves. It won't be *that* hard to research and collect the IP ranges of offending universities.
> 
> I started a list here:
> http://www.w3.org/wiki/Bad_Crawlers
> 
> The list is currently empty. I hope it stays that way.
> 
> Thank you all,
> Richard

--
Dr. Alexandre Passant, 
Social Software Unit Leader
Digital Enterprise Research Institute, 
National University of Ireland, Galway
Received on Wednesday, 22 June 2011 22:12:31 UTC
