Bug - but where? from Allen Comer on 1999-06-17 (www-lib@w3.org from April to June 1999)

From: Allen Comer <allen.comer@entropic.com>
Date: Wed, 16 Jun 1999 22:06:24 -0400 (EDT)
To: "'www-lib@w3.org'" <www-lib@w3.org>
Message-ID: <F18326290662D211921B00105A29876781DC@sheik.entropic.com>

Hi all,

I'm trying to grab some web-based information.  Getting at the data is a
multi-link operation involving permanent and temporary cookie handling.
My basic approach is:

(1) load the home page;
(2) look for appropriate links (using the HTML parser findLink()
callback) and, when found, launch a new request;
(3) use other components of the HTML parser to extract the data elements
of interest;
(4) repeat (2) and (3) until all of the info has been obtained.
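
In outline, the loop in steps (1) through (4) looks something like the
sketch below.  This is not my actual libwww code: fetch(), find_link(),
find_cell(), crawl() and the canned SITE table are all hypothetical
stand-ins, no cookies are handled, and the "parser" is a plain substring
search used only to show the control flow.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the network: maps a URL to canned HTML. */
struct page { const char *url; const char *html; };

static const struct page SITE[] = {
    { "/home",  "<a href=\"/step1\">next</a>" },
    { "/step1", "<a href=\"/data\">data</a>" },
    { "/data",  "<table border=\"1\"><tr><td>42</td></tr></table>" },
};

/* Step (1): "load" a page.  Returns NULL for an unknown URL. */
static const char *fetch(const char *url)
{
    for (size_t i = 0; i < sizeof SITE / sizeof SITE[0]; i++)
        if (strcmp(SITE[i].url, url) == 0)
            return SITE[i].html;
    return NULL;
}

/* Copy the text between the first `open` and the next `close` into out.
 * Returns 1 on success, 0 if not found or if out is too small. */
static int copy_between(const char *html, const char *open,
                        const char *close, char *out, size_t outlen)
{
    const char *p = strstr(html, open);
    if (p == NULL)
        return 0;
    p += strlen(open);
    const char *q = strstr(p, close);
    if (q == NULL || (size_t)(q - p) >= outlen)
        return 0;
    memcpy(out, p, (size_t)(q - p));
    out[q - p] = '\0';
    return 1;
}

/* Step (2): find the first link target on the page. */
static int find_link(const char *html, char *out, size_t outlen)
{
    return copy_between(html, "href=\"", "\"", out, outlen);
}

/* Step (3): extract the data element (here, the first table cell). */
static int find_cell(const char *html, char *out, size_t outlen)
{
    return copy_between(html, "<td>", "</td>", out, outlen);
}

/* Step (4): follow links from `start` until a page yields data.
 * Returns the number of pages loaded, or -1 on failure. */
int crawl(const char *start, char *data, size_t datalen)
{
    char url[256];
    int loaded = 0;

    assert(start != NULL && data != NULL && datalen > 0);
    if (strlen(start) >= sizeof url)
        return -1;
    strcpy(url, start);

    while (loaded < 16) {               /* guard against link cycles */
        const char *html = fetch(url);
        if (html == NULL)
            return -1;
        loaded++;
        if (find_cell(html, data, datalen))
            return loaded;              /* data found, stop here */
        if (!find_link(html, url, sizeof url))
            return -1;                  /* dead end: no data, no link */
    }
    return -1;
}
```

In the real program the fetch is a libwww request and the link and data
extraction happen in the HTML parser callbacks, but the order of
operations is the same.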

The problem I run into is that, for one particular data element, the HTML
that comes back is corrupted: the data I'm interested in (simple text)
ends up interleaved with the attributes and values of a <table> element.
I've tried different arguments to HTRequest_setOutputFormat(), but the
returned data is corrupted regardless.

If I save the page source from Internet Explorer and parse the local
file, I have no problems.

Any suggestions?

Thanks,

Allen C.

Received on Thursday, 17 June 1999 01:38:19 UTC