Re: Robot crash

From: John Punin <puninj@cs.rpi.edu>
Date: Fri, 30 Jul 1999 14:00:19 -0400 (EDT)
Message-Id: <199907301800.OAA05760@dishwasher.cs.rpi.edu>
To: guy.ferran@ardentsoftware.fr (Guy Ferran)
Cc: puninj@cs.rpi.edu (John Punin), www-lib@w3.org
> 
> 
> John,
> 
> Your proposal does not really solve my problem, since my goal is to
> visit more than the prefixed sites you suggest.
> 

Hi Guy

I'm sorry that my suggestions didn't help you. In my view, webbot was designed
to visit a small number of websites; I use it to visit just one. I ran webbot
against xmltree and didn't have any problems. I'm using Solaris.
 
> I wonder if it is really due to the number of sites i visited, since the
> crash occurs rather rapidly.

I totally agree with you. It shouldn't crash that quickly. Could you let me know
approximately how long webbot runs before it crashes?

> 
> By the way, does it mean that webbot has been designed just to support a
> small set of visiting sites?
>

Yes and no. To quote Henrik:
"Be careful - this is a robot and hence can be used to traverse many links - it
should be used with care and is not designed to be let loose on the Internet at
large. Its primary design goal was to be able to test HTTP/1.1 pipelining
features."
 
> Is it just a matter of memory consumption, which could then be solved by
> a kind of memory-map mechanism to swap the memory to a file, or is the
> problem more fundamental ?

I'm afraid there is a bug in the library, and we should kill it :-).

> 
> Besides, I do not understand your suggestion about "robots.txt". I
> thought "robots.txt" can only be defined on server sites, and thus webbot,
> which acts as a client, can only rely on its presence. That's why I've
> put these sites explicitly in the -exclude clause.

I thought that you had access to the xmltree.com website and could create a
robots.txt file for that site. The /History and /team directories belong to
www.w3.org, so you are completely right to exclude them on the webbot command
line.
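For reference (this is a generic illustration, not a file from either site): if
you did control a server, a robots.txt placed at its document root could tell
well-behaved robots to skip those directories, e.g.:

```
# robots.txt - served from the web server's document root
# "User-agent: *" means the rules below apply to all robots.
User-agent: *
Disallow: /History/
Disallow: /team/
```

Since you don't control www.w3.org, the client-side -exclude option is the
right way to get the same effect.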
  
> 
> Thanks,
> 
> Guy.
> 
> PS: I tried "purify" to check memory at runtime, but it seems the
> version I have from Rational (Solaris 2.7) does not support the
> libraries generated by gcc.
>
 
I have the same problem; I cannot use Purify any more. :-(

If I find a solution to your problem, I will let you know. 

Best
John
Received on Friday, 30 July 1999 14:00:26 UTC

This archive was generated by hypermail 2.3.1 : Wednesday, 5 February 2014 07:15:17 UTC