- From: Dave Beckett <dave.beckett@bristol.ac.uk>
- Date: Wed, 15 Jan 2003 17:00:56 +0000
- To: James Michael DuPont <mdupont777@yahoo.com>
- cc: www-rdf-interest@w3.org
>>>James Michael DuPont said:
>
> --- Dave Beckett <dave.beckett@bristol.ac.uk> wrote:
> > BTW, I've already updated that on the chat logs and rdfig chump
> > page too.

I actually meant to say that I updated the chat logs area, and Edd Dumbill, who hosts the http://rdfig.xmlhack.com/ site for the RDF interest group, did the chump pages.

> I will look into setting up a daily snapshot of the rdf logs in the
> next months. At least to keep a weeks worth of rdf and rotate it should
> be fine.

I'm puzzled why you need to do that. The RDF/XML logs are live on the web. If you want to copy them, take the one new file each day and do something with it. In fact, I'd rather you didn't try to make an alternative site.

> to be honest, i think the best would be to push/mirror them onto a
> server that does not have this restriction of robots, i dont understand
> why it works at home but not at work. (to be honest, i dont understand
> this error message at all)

Because the site was hit by an aggressive web robot client that tried to download all the chat logs every hour, at several requests per second. When it reached 70% of our entire site's requests and refused to respect robots.txt, I banned the client and its IP addresses.

We are installing a new server, but I'm sure I don't want it sucked up with these requests either. There is only one new file each day in each of RDF, text and HTML. Mirroring would take a few seconds.

Dave
Received on Wednesday, 15 January 2003 12:03:51 UTC