Simple showtext question

From: Sajit Prabhakaran <sajitk@home.com>
Date: Sun, 19 Nov 2000 22:37:18 -0500 (EST)
Message-ID: <000a01c052a3$a3d0f060$b458b218@moline1.il.home.com>
To: <www-lib@w3.org>

I'm a newbie to libwww. 

My requirements are basic: I need to download HTML files and extract the plain text. My past attempts at this met with only partial success, so libwww looked like a godsend to me.

However, my experiments with the "showtext" sample have not been very successful. It always trips over <script ...> </script> tags and emits the script source as part of the text.

(Sites I tried: www.msn.com, www.cnn.com, www.rens.com). 
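
To make the expected behavior concrete, here is a minimal sketch of the output I am after. It is written in Python with the standard html.parser module, not libwww, so it says nothing about how showtext works internally. The names TextExtractor and html_to_text are made up for illustration, and treating <style> the same way as <script> is my own assumption.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text content, ignoring anything inside <script> or <style>."""

    # Elements whose contents should never appear in the plain-text output.
    # Including "style" is an assumption; the problem reported is with "script".
    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0   # > 0 while inside a skipped element
        self._chunks = []      # text fragments seen outside skipped elements

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside <script> or <style>.
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join the fragments and collapse runs of whitespace to single spaces.
        return " ".join(" ".join(self._chunks).split())


def html_to_text(html):
    """Return the visible text of an HTML string (hypothetical helper)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

Calling html_to_text on a page whose <head> contains a script block should return only the body text, with the script source left out.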

Any help would be sincerely appreciated... in anticipation of blissful parsing.

Sajit Prabhakaran

Received on Monday, 20 November 2000 02:48:05 UTC