Is Amaya suitable for use as an HTML parsing library? from Howard Rubin on 1998-11-13 (www-amaya@w3.org from October to December 1998)

From: Howard Rubin <hrubin@nyx.net>
Date: Fri, 13 Nov 1998 16:31:23 -0700 (MST)
To: hrubin@disc.com, www-amaya@w3.org
Message-Id: <199811132331.QAA16147@nyx10.nyx.net>

I need to extract text from HTML documents, and to do so from
a platform-portable C program. I've been all over the web
(DejaNews, Yahoo, etc.), and the closest thing I've found is libwww.
However, the libwww pages (http://www.w3.org/Library/User/Start.html)
note that libwww isn't recommended for use as an HTML parser; they
recommend Amaya instead as a full HTML parser.

Is there some part of the Amaya source code that would be suitable
for extracting the text from HTML documents from a C program?
Any tips, hints, etc. would be greatly appreciated.

TIA, Howard Rubin

Received on Friday, 13 November 1998 18:31:31 UTC