Simple showtext question from Sajit Prabhakaran on 2000-11-20 (www-lib@w3.org from October to December 2000)

From: Sajit Prabhakaran <sajitk@home.com>
Date: Sun, 19 Nov 2000 22:37:18 -0500 (EST)
To: <www-lib@w3.org>
Message-ID: <000a01c052a3$a3d0f060$b458b218@moline1.il.home.com>

Hi!

I'm a newbie to libwww. 

My requirements are basic: I need to download HTML files and extract the plain text from them. My past attempts at this met with only partial success, so libwww looked like a godsend.

However, my experiments with the "showtext" sample have not had much success. It always trips over <script> ... </script> elements and emits the script source as part of the text.

(Sites I tried: www.msn.com, www.cnn.com, www.rens.com). 
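
To make the behaviour I am after concrete, here is a minimal sketch of it. This is not libwww code and says nothing about how "showtext" works internally. It uses Python's standard html.parser module purely as an illustration, and the names TextExtractor and html_to_text are my own, made up for this example. The idea is simply to keep text nodes and drop everything between <script> (or <style>) and its closing tag.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, ignoring anything inside <script> or <style>."""

    # Elements whose content should never appear in the plain-text output.
    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0   # greater than 0 while inside a skipped element
        self._chunks = []      # text fragments collected so far

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside <script> or <style>.
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join fragments with spaces, then collapse runs of whitespace.
        return " ".join(" ".join(self._chunks).split())


def html_to_text(html):
    """Return the visible text of an HTML string (hypothetical helper)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

For example, html_to_text('<p>Hello</p><script>var x = 1;</script><p>world</p>') gives "Hello world", with no trace of the script. What I am looking for is the equivalent of that skip step in libwww.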

Any help would be sincerely appreciated. In anticipation of blissful parsing...

Sajit Prabhakaran

Received on Monday, 20 November 2000 02:48:05 UTC