- From: Howard Rubin <hrubin@nyx.net>
- Date: Fri, 13 Nov 1998 16:31:23 -0700 (MST)
- To: hrubin@disc.com, www-amaya@w3.org
I need to extract text from HTML documents, and I need to do it from a platform-portable C program. I've been all over the web (DejaNews, Yahoo, etc.), and the closest thing I've found is libwww. However, the libwww documentation (http://www.w3.org/Library/User/Start.html) says that libwww isn't recommended for use as an HTML parser; it recommends Amaya as a full HTML parser instead.

Is there some part of the Amaya source code that would be suitable for extracting the text from HTML documents from a C program?

Any tips, hints, etc., would be greatly appreciated.

TIA,
Howard Rubin
Received on Friday, 13 November 1998 18:31:31 UTC