- From: Allen Comer <allen.comer@entropic.com>
- Date: Wed, 16 Jun 1999 22:06:24 -0400 (EDT)
- To: "'www-lib@w3.org'" <www-lib@w3.org>
Hi all,

I'm trying to grab some web-based information. Getting at the data is a multi-link operation involving permanent and temporary cookie handling. My basic approach is:

(1) load the home page;
(2) look for appropriate links (using the HTML parser findLink() callback) and, when found, launch a new request;
(3) use other components of the HTML parser to extract the data elements of interest;
(4) repeat (2) and (3) until all of the info has been obtained.

The problem I run into is that for one particular data element, the HTML that's returned is corrupted in that the data in which I'm interested (simple text) ends up scrambled with the attributes and values of a <table> element. I've tried different arguments to the HTRequest_setOutputFormat() function but the returned data is corrupted regardless. If I grab the source from the Internet Explorer browser and parse the local file, I have no problems.

Any suggestions?

Thanks,
Allen C.
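For readers following along, the control flow of steps (1) through (4) can be sketched in plain C. This is only an illustration under stated assumptions: it does not use libwww, it skips cookie handling entirely, and every name in it (`SITE`, `fetch`, `find_link`, `extract_value`, `crawl`, the `/data` link pattern and the `value=` marker) is hypothetical and stands in for whatever the real pages and parser callbacks provide.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical in-memory stand-in for the remote site: url -> HTML. */
struct page { const char *url; const char *html; };

static const struct page SITE[] = {
    { "/home",  "<a href=\"/about\">About</a> <a href=\"/data1\">Data</a>" },
    { "/data1", "<p>value=42</p> <a href=\"/data2\">next</a>" },
    { "/data2", "<p>value=7</p>" },
};

/* Stand-in for a network fetch; returns NULL for unknown URLs. */
static const char *fetch(const char *url) {
    for (size_t i = 0; i < sizeof SITE / sizeof SITE[0]; i++)
        if (strcmp(SITE[i].url, url) == 0) return SITE[i].html;
    return NULL;
}

/* Step (2): copy the first href target containing `want` into `out`.
 * Returns 1 if a matching link was found, 0 otherwise. */
static int find_link(const char *html, const char *want,
                     char *out, size_t outsz) {
    const char *p = html;
    while ((p = strstr(p, "href=\"")) != NULL) {
        p += 6;                              /* skip past href=" */
        const char *end = strchr(p, '"');
        if (end == NULL) break;              /* unterminated attribute */
        size_t len = (size_t)(end - p);
        if (len < outsz) {
            memcpy(out, p, len);
            out[len] = '\0';
            if (strstr(out, want) != NULL) return 1;
        }
        p = end + 1;
    }
    if (outsz > 0) out[0] = '\0';
    return 0;
}

/* Step (3): pull one data element out of the page, if present. */
static int extract_value(const char *html, int *value) {
    const char *p = strstr(html, "value=");
    return p != NULL && sscanf(p, "value=%d", value) == 1;
}

/* Steps (1) and (4): start at `start`, then keep following matching
 * links until none is left. The hop limit guards against link cycles.
 * Returns the number of values collected. */
static int crawl(const char *start, int *values, int max_values) {
    char url[256], next[256];
    int n = 0;
    snprintf(url, sizeof url, "%s", start);
    for (int hop = 0; hop < 16 && n < max_values; hop++) {
        const char *html = fetch(url);
        if (html == NULL) break;
        int v;
        if (extract_value(html, &v)) values[n++] = v;
        if (!find_link(html, "/data", next, sizeof next)) break;
        snprintf(url, sizeof url, "%s", next);
    }
    return n;
}
```

Calling `crawl("/home", values, 8)` on the toy site walks from the home page to `/data1` and then `/data2`, collecting one value from each. In a real client the `fetch` stand-in would be replaced by an actual request, and the link and data matching would come from the HTML parser callbacks rather than from string searches.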
Received on Thursday, 17 June 1999 01:38:19 UTC