Re: HELP!!

At 09:53 PM 7/24/00 +0100, Christian Stone wrote:
>Does anybody out there in the ether have any suggestions about where I
>can get some information on how to use the HTML parser in JAVA.
>
>I am trying to parse an HTML page and then be able to iterate over the
>parse tree to extract all the <a tags to create a table of links.

I don't know how much documentation is included, but David Brownell has a
tool that lets you use the Java Swing HTML parser to generate
XML-parser-like SAX events, which would at least get you into a
well-documented parsing environment.

See:
http://home.pacbell.net/david-b/xml/

It's in the SAX2 Utilities package.

Information on the SAX2 API is at:
http://www.megginson.com/SAX/

You could collect all the <a> elements and their attributes in the
startElement method of your ContentHandler.
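
As a rough sketch of that idea, a handler along these lines would gather
the href of every <a> element as the SAX events arrive. It is only an
illustration under some assumptions: the class name LinkCollector and its
collect helper are made up for this example, and it feeds well-formed
markup to the JDK's stock JAXP SAX parser rather than the Swing-based
driver mentioned above, so real-world HTML would still need that driver
(or some other HTML-to-SAX bridge) set up as the XMLReader.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class: collects href values from <a> start tags.
public class LinkCollector extends DefaultHandler {
    private final List<String> hrefs = new ArrayList<>();

    // Called by the parser once for every start tag it encounters.
    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes atts) {
        // The default JAXP parser is not namespace-aware, so match on qName.
        if ("a".equalsIgnoreCase(qName)) {
            String href = atts.getValue("href");
            if (href != null) {   // skip named anchors that have no href
                hrefs.add(href);
            }
        }
    }

    public List<String> getHrefs() {
        return hrefs;
    }

    // Assumption: the input is well-formed XML/XHTML. For ordinary HTML,
    // swap in an XMLReader that can cope with tag soup.
    public static List<String> collect(String markup) throws Exception {
        XMLReader reader =
            SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        LinkCollector handler = new LinkCollector();
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(markup)));
        return handler.getHrefs();
    }

    public static void main(String[] args) throws Exception {
        String page =
            "<html><body>"
            + "<a href=\"http://www.megginson.com/SAX/\">SAX</a>"
            + "<p><a href=\"http://www.simonstl.com\">essays</a></p>"
            + "</body></html>";
        for (String href : collect(page)) {
            System.out.println(href);
        }
    }
}
```

From the collected list it is a short step to writing out the table of
links, one row per entry.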

I hope that helps...
Simon St.Laurent
XML Elements of Style / XML: A Primer, 2nd Ed.
http://www.simonstl.com - XML essays and books

Received on Monday, 24 July 2000 17:00:02 UTC