- From: Dieter Fensel <dieter.fensel@sti2.at>
- Date: Tue, 21 Jun 2011 20:22:16 +0200
- To: Andreas Harth <harth@kit.edu>, public-lod@w3.org
1. Obviously it is not useful to kill the web server of small shops due to academic experiments.

At 02:29 PM 6/21/2011, Andreas Harth wrote:
>Dear Martin,
>
>I agree with you in that software accessing large portions of the web
>should adhere to basic principles (such as robots.txt).
>
>However, I wonder why you publish large datasets and then complain when
>people actually use the data.
>
>If you provide a site with millions of triples, your infrastructure should
>scale beyond "I have clicked on a few links and the server seems to be
>doing something". You should set the HTTP Expires header to leverage the
>widely deployed HTTP caches. You should have stable URIs. Also, you should
>configure your servers to shield them from both mad crawlers and DOS
>attacks (see e.g., [1]).
>
>Publishing millions of triples is slightly more complex than publishing
>your personal homepage.
>
>Best regards,
>Andreas.
>
>[1] http://code.google.com/p/ldspider/wiki/ServerConfig

-- 
Dieter Fensel
Director STI Innsbruck, University of Innsbruck, Austria
http://www.sti-innsbruck.at/
phone: +43-512-507-6488/5, fax: +43-512-507-9872
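[Editorial note] As a concrete illustration of the measures Andreas lists, here is a minimal Python sketch of both sides of the exchange: a crawler that consults robots.txt (including any Crawl-delay) before fetching, and a publisher-side handler that sets Expires/Cache-Control headers so widely deployed HTTP caches can absorb repeat requests. The host name, user-agent string, and class names are illustrative placeholders, not taken from this thread or from LDSpider's actual configuration.

import email.utils
import time
import urllib.request
import urllib.robotparser
from http.server import BaseHTTPRequestHandler

USER_AGENT = "ExampleCrawler/0.1"  # placeholder user agent

def polite_fetch(urls, robots_url="http://example.org/robots.txt"):
    """Fetch only what robots.txt allows, honouring any Crawl-delay."""
    rp = urllib.robotparser.RobotFileParser()
    rp.set_url(robots_url)
    rp.read()
    delay = rp.crawl_delay(USER_AGENT) or 1  # fall back to 1 s between requests
    for url in urls:
        if not rp.can_fetch(USER_AGENT, url):
            continue  # disallowed by robots.txt: skip it
        req = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
        with urllib.request.urlopen(req) as resp:
            resp.read()
        time.sleep(delay)  # spread requests instead of hammering the server

class CacheFriendlyHandler(BaseHTTPRequestHandler):
    """Publisher side: mark responses cacheable so HTTP caches take the load."""
    def do_GET(self):
        body = b"# RDF payload would go here\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/turtle")
        self.send_header("Cache-Control", "public, max-age=86400")
        self.send_header("Expires",
                         email.utils.formatdate(time.time() + 86400, usegmt=True))
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

In production the caching side is usually handled in the front-end server itself (e.g. Apache's or nginx's expires and rate-limiting modules) rather than in application code.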
Received on Tuesday, 21 June 2011 18:23:50 UTC