Amaya HTML parser. from Thaddeus L. Olczyk on 2004-06-30 (www-amaya-dev@w3.org from June 2004)

From: Thaddeus L. Olczyk <olczyk@interaccess.com>
Date: Wed, 30 Jun 2004 05:59:15 -0500
To: www-amaya-dev@w3.org
Message-id: <1q65e09cphfhnl3uaounm89pbucs06o5f5@4ax.com>

Hi.
I've been going nuts looking for a non-Perl HTML parser
which handles "real world" HTML. On the libwww page,
it says that their parser is primitive and if you are looking
for a robust HTML parser, look at Amaya.

So I've gotten Amaya. I've skimmed through the documentation.
It seems rather vague on where the parser is and what its API
is.

So three questions.
For a person for whom expat, libxml and libwww, used with (or without)
HTML Tidy, are not good enough, will the parser in Amaya be sufficient?

Is the Amaya code modularised enough to extract the parser?

In terms of the code, where would I start with that procedure?

Thank You
--
Thaddeus L. Olczyk
-----------------------
Think twice, code once.

Received on Wednesday, 30 June 2004 07:02:10 UTC