W3C home > Mailing lists > Public > www-lib@w3.org > July to September 1998

Re: Feeding the webbot with a list of addresses to check

From: Bob Racko <bobr@dprc.net>
Date: Tue, 29 Sep 1998 21:57:56 -0400
Message-Id: <3.0.3.32.19980929215756.035c7270@shell14.ba.best.com>
To: www-lib@w3.org
Cc: "Lola Garcia Santiago" <mdolo@platon.ugr.es>

Mostly, when I want the robot to check
multiple URIs, I run it multiple times
from a script or bat file.
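That loop is trivial to script. Here is a minimal sketch, assuming a file uris.txt with one URI per line; the fetch function is a stand-in for the real webbot invocation (replace its body with something like `webbot "$uri" ...` for your build):

```shell
# Stand-in for the actual webbot command line (hypothetical wrapper).
fetch() { echo "would run: webbot $1"; }

# Assumed input: one URI per line (created here just for illustration).
printf '%s\n' "http://www.w3.org/" "http://www.w3.org/Library/" > uris.txt

# One robot run per listed URI.
while IFS= read -r uri; do
    fetch "$uri"
done < uris.txt
```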

I suppose we could change the interpretation
of the second and subsequent arguments to mean
additional URIs. Right now they mean keywords to search for.

I personally would prefer (yet another) command-line
option along the lines of -addURI *
which would add the named link to the list of initial documents to load.
(The same as finding an <A href="*" > in a document, except this link
would name an alternate root or starting point.)

This raises the question of return status. What does it mean to
try to fetch one or more documents and have them fail to fetch
(or fail to parse, or fail to ...)?

When only one document root is given (as it is now),
the return status reflects whether the root and all children
came through OK. [At least that is what _I_ find most useful
when building web spiders.]

Would it be better to return "OK" only if ALL of them fetch and
parse [AND], or OK if any one of them comes out OK [OR]?
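The two policies are easy to express over the per-fetch exit codes. A sketch, assuming each fetch reported 0 (ok) or nonzero (fail), using the usual shell convention that 0 means success:

```shell
# Stand-in exit codes from three fetches (second one failed).
results="0 1 0"

all_ok=0   # AND policy: 0 until any fetch fails
any_ok=1   # OR policy: nonzero until any fetch succeeds
for r in $results; do
    if [ "$r" -ne 0 ]; then all_ok=1; fi   # one failure breaks AND
    if [ "$r" -eq 0 ]; then any_ok=0; fi   # one success satisfies OR
done
echo "AND=$all_ok OR=$any_ok"
```

With the sample results above, the AND policy reports failure and the OR policy reports success, which matches the "any one visible" vs. "all healthy" distinction below.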

For instance, I have a growing list of web sites, and pages within sites,
that I want to monitor to see when they come online.
So I want to construct a robot that tells me when
any one of them is "visible" (ping is not enough).
I also want a robot that tells me when all of them are healthy.

The inverse is also useful. Suppose I want to know
when a page I am interested in is no longer available.
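That inverse check is the same loop with the test negated. A sketch, where probe is a stand-in for a real fetch (a webbot run, or an HTTP HEAD) and is rigged here to fail for the second URI:

```shell
# Hypothetical availability probe; replace with a real fetch.
# Rigged so any URI containing "gone" reports unavailable.
probe() { case "$1" in *gone*) return 1 ;; *) return 0 ;; esac; }

for uri in "http://example.org/alive" "http://example.org/gone"; do
    if ! probe "$uri"; then
        echo "NO LONGER AVAILABLE: $uri"
    fi
done
```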


At 06:09 AM 9/29/98 -0400, you wrote:
>At 08:10 9/26/98 -0400, Lola Garcia Santiago wrote:
>>Dear Sirs,	
>>
>>	I've looked at some information about Webbot. I know that Webbot
>>works with one URI, but I don't know whether it
>>works using a pre-generated list of URIs. In case Webbot can't use such a
>>list, do you know of any robot or mapper that can check websites using a
>>prepared list of URIs?
>>
>>		Thank you for your answer,
>>
>>Lola Garcia Santiago
>>Doctoral Student
>>Universidad de Granada
>>Spain	
>>e-mail: mdolo@platon.ugr.es 
>
>This actually sounds like a good idea. Bob Racko and John Punin are doing
>some wonderful work on the webbot now but I think we should put this on the
>wish list.
>
>What do you think guys?
>
>Henrik
>
>--
>Henrik Frystyk Nielsen,
>World Wide Web Consortium
>http://www.w3.org/People/Frystyk
>


{-----}
bobr@dprc.net
Received on Tuesday, 29 September 1998 22:10:22 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50 : Monday, 23 April 2007 18:18:28 GMT