
Re: Downloading with the ROBOT

From: olivier lefebvre <lefebvreolivier@hotmail.com>
Date: Wed, 02 Feb 2000 13:21:49 AST
Message-ID: <20000202172149.15952.qmail@hotmail.com>
To: rking@aignetplex.com, www-lib@w3.org
Hi,

I had the same problem, but an even worse one: not every file was
complete. Some had zero size, others did not.

In the end I did not manage to download each file browsed with webbot. Another
member of the list told me to do it with an after filter that runs on
the HT_OK state of each request, but I have not tried that yet.
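
One side issue from my original message quoted below: every download went to the same name "file", which is why I was asked about overwriting on the second LoadToFile. A possible workaround, which I have not tested against libwww itself, would be to derive a separate local file name from each URL and pass that instead of "file". The helper below is only a sketch in plain C; url_to_filename is a hypothetical name, not a libwww function, and it does nothing about the zero-size files.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (NOT part of libwww): build a local file name from a
 * URL. The scheme ("http://") is skipped if present; letters, digits, '.',
 * '-' and '_' are kept, and every other character becomes '_'.
 * The result is truncated if it does not fit in `out`.
 * Returns 0 on success, -1 if an argument is invalid.
 */
static int url_to_filename(const char *url, char *out, size_t outlen)
{
    const char *p;
    size_t i, j = 0;

    /* Need room for at least one character plus the terminating NUL. */
    if (url == NULL || out == NULL || outlen < 2)
        return -1;

    /* Skip the scheme separator if there is one. */
    p = strstr(url, "://");
    if (p != NULL)
        url = p + 3;

    for (i = 0; url[i] != '\0' && j + 1 < outlen; i++) {
        unsigned char c = (unsigned char) url[i];
        if (isalnum(c) || c == '.' || c == '-' || c == '_')
            out[j++] = (char) c;
        else
            out[j++] = '_';
    }

    /* Never return an empty name. */
    if (j == 0)
        out[j++] = '_';

    out[j] = '\0';
    return 0;
}
```

With this, "http://www.w3.org/Library/User/index.html" would become "www.w3.org_Library_User_index.html", and that buffer could be passed as the third argument of HTLoadToFile in place of "file". Two different URLs can still map to the same name (for example "a/b" and "a_b"), so this is a starting point rather than a complete answer.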

Does anyone else have an idea?

Thanks in advance,
Olivier

>From: "Ron King" <rking@aignetplex.com>
>To: "olivier lefebvre" <lefebvreolivier@hotmail.com>
>Subject: Re:  Downloading with the ROBOT
>Date: Wed, 26 Jan 2000 07:18:44 -0800
>
>"olivier lefebvre" <lefebvreolivie-@hotmail.com> wrote:
>original article:http://www.egroups.com/group/www-lib/?start=2118
> > Hi,
> >
> > I wanted to use the Robot program to download each file it accesses.
> > I added my code to the function RHText_foundLink, but the only
> > result I get is a core dump (I'm using gcc under Sun Solaris 2.7).
> > This is the only thing I added to try it out, and I have not
> > managed to do it any other way.
> >
> > Could someone explain to me how to do that? In the example I
> > save into the file "file". After the second LoadToFile I am asked
> > whether to overwrite or not, but even after the first write I get
> > a file of size 0.
> >
> > Thanks in advance.
> >
> > Olivier
> >
> > P.S. Here is the function I modified, attached below.
> > ______________________________________________________
> > Get Your Private, Free Email at http://www.hotmail.com
> >
> > PRIVATE void RHText_foundLink (HText * text,
> > 			       int element_number, int attribute_number,
> > 			       HTChildAnchor * anchor,
> > 			       const BOOL * present, const char ** value)
> > {
> >     if (text && anchor) {
> > 	Finger * finger = (Finger *) HTRequest_context(text->request);
> > 	Robot * mr = finger->robot;
> >
> > #ifdef DEV_SAVETODISK
> > 	/* Which URL to save to file */
> >         char * uri_tosave = HTAnchor_address(
> >             HTAnchor_followMainLink((HTAnchor *)anchor));
> >
> > 	/* Creation of my new Finger */
> >         Finger * finger_tosave = Finger_new(mr, (HTParentAnchor*)anchor,
> >                                             METHOD_GET);
> >
> > 	/* Downloading of my URL content into the file "file" */
> >         if(!HTLoadToFile(uri_tosave, finger_tosave->request , "file"))
> >           HTPrint(" Error while loading to file");
> >
> >         HT_FREE(uri_tosave);
> > #endif // DEV_SAVETODISK
> >
> > 	if (SHOW_QUIET(mr))
> > 	    HTPrint("Robot....... Received element %d, attribute %d with anchor %p\n",
> > 		    element_number, attribute_number, anchor);
> > 	if ((element_number==HTML_IMG && attribute_number==HTML_IMG_SRC) ||
> > 	    (element_number==HTML_BODY && attribute_number==HTML_BODY_BACKGROUND) ||
> > 	    (element_number==HTML_INPUT && attribute_number==HTML_INPUT_SRC))
> > 	    RHText_foundImage(text, anchor, NULL, NULL, NO);
> > 	else
> > 	    RHText_foundAnchor(text, anchor);
> >     }
> > }
>
>Hello Olivier,
>
>Did you ever figure this out? I want to do this too,
>and I get 0 file sizes! Please let me know how you
>solved this!
>
>Regards,
>
>Ron
>
>roncking@home.com or rking@aignetplex.com
>
>
>

Received on Wednesday, 2 February 2000 12:22:21 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50 : Monday, 23 April 2007 18:18:35 GMT