- From: Henry Story <henry.story@bblfish.net>
- Date: Thu, 23 Jun 2011 13:21:06 +0200
- To: Michael Brunnbauer <brunni@netestate.de>
- Cc: Kingsley Idehen <kidehen@openlinksw.com>, public-lod@w3.org
On 23 Jun 2011, at 13:13, Michael Brunnbauer wrote:
>
> On Thu, Jun 23, 2011 at 11:32:43AM +0100, Kingsley Idehen wrote:
>>> config = {
>>> 'Googlebot':['googlebot.com'],
>>> 'Mediapartners-Google':['googlebot.com'],
>>> 'msnbot':['live.com','msn.com','bing.com'],
>>> 'bingbot':['live.com','msn.com','bing.com'],
>>> 'Yahoo! Slurp':['yahoo.com','yahoo.net']
>>> }
>> How does that deal with a DoS query inadvertently or deliberately
>> generated by a SPARQL user agent?
>
> It's part of the solution. It prevents countermeasures from hitting the crawlers
> that are welcome.
>
> How does WebID deal with it - except that it allows more fine-grained ACLs per
> person/agent instead of per DNS domain? WebID is a cool thing and maybe crawlers
> will use it in the future, but Martin needs solutions right now.
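(As an aside, a whitelist like the one above is usually enforced with a double reverse-DNS check, so that a spoofed User-Agent header alone is not enough. A minimal Python sketch of how I assume such a config gets applied - the lookup logic here is my own illustration:)

import socket

def is_genuine_crawler(user_agent, ip, config):
    # find the whitelisted domains for this User-Agent token
    domains = next((d for token, d in config.items() if token in user_agent), None)
    if domains is None:
        return False
    try:
        # reverse lookup: the PTR name must end in a whitelisted domain
        host = socket.gethostbyaddr(ip)[0]
    except socket.herror:
        return False
    if not any(host == d or host.endswith('.' + d) for d in domains):
        return False
    try:
        # forward-confirm: the name must resolve back to the same IP
        return ip in socket.gethostbyname_ex(host)[2]
    except socket.gaierror:
        return False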
I'd emphasize Michael's point: WebID allows *much* more fine-grained ACLs. It's the difference between a police force that would throw all gypsies in jail because it had some information suggesting that one gypsy had stolen something, and a police force that would find the guilty person and put only him in jail.
Not only does it allow finer-grained ACLs, it also lets agents identify themselves, say as crawlers or as end users. A crawler could quickly be guided to the relevant dump file or RSS feeds, so that it does not waste resources on the server. It also lets the user/crawler tie into linked data, which means we are applying linked data recursively to solve a linked data problem. That's the neat bit :-)
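To make that concrete, here is a sketch of what a server-side handler might do once the TLS layer has handed it the client's certificate. verify_webid, fetch_profile, is_crawler and the dump URL are illustrative names for this sketch, not an existing API:

def handle(request):
    # WebID-TLS: the URI in the certificate's subjectAltName must point
    # to a profile that publishes the same public key
    webid = verify_webid(request.client_cert)   # illustrative helper
    if webid is None:
        return serve_public_view(request)       # anonymous agent
    profile = fetch_profile(webid)              # dereference the WebID
    if is_crawler(profile):
        # guide the crawler straight to the dump rather than letting it
        # fetch every single page
        return redirect('/data/dump.nt.gz')     # illustrative URL
    return serve_with_acl(request, webid)       # fine-grained per-agent ACL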
Henry
Social Web Architect
http://bblfish.net/
Received on Thursday, 23 June 2011 11:21:46 UTC