At 09:53 PM 7/24/00 +0100, Christian Stone wrote:
>Does anybody out there in the ether have any suggestions about where I
>can get some information on how to use the HTML parser in JAVA.
>
>I am trying to parse an HTML page and then be able to iterate over the
>parse tree to extract all the <a tags to create a table of links.

I don't know how much documentation is included, but David Brownell has a
tool that lets you use the Java Swing HTML parser to generate
XML-parser-like SAX events, which would at least get you into a
well-documented parsing environment. See:

http://home.pacbell.net/david-b/xml/

It's in the SAX2 Utilities package. Information on the SAX2 API is at:

http://www.megginson.com/SAX/

You could collect all the a elements and their attributes in the
startElement method of your ContentHandler.

I hope that helps...

Simon St.Laurent
XML Elements of Style / XML: A Primer, 2nd Ed.
http://www.simonstl.com - XML essays and books

Received on Monday, 24 July 2000 17:00:02 GMT
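
For illustration, here is a minimal sketch of the ContentHandler approach suggested above. It does not use David Brownell's utility, whose API is not described in this message; as a stand-in it assumes the page is well-formed markup and feeds it to the JDK's own SAX parser. The class name LinkCollector and its collect method are hypothetical names chosen for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: records the href of every <a> element it sees.
class LinkCollector extends DefaultHandler {
    private final List<String> links = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes atts) {
        // The default JDK parser is not namespace-aware, so match on qName.
        if ("a".equalsIgnoreCase(qName)) {
            String href = atts.getValue("href");
            if (href != null) {
                links.add(href); // skip anchors that carry no href
            }
        }
    }

    // Assumption: 'markup' is well-formed (XHTML-like); real-world HTML
    // would need an HTML-to-SAX front end in place of this parser.
    static List<String> collect(String markup) throws Exception {
        XMLReader reader =
            SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        LinkCollector handler = new LinkCollector();
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(markup)));
        return handler.links;
    }

    public static void main(String[] args) throws Exception {
        String page = "<html><body>"
            + "<a href=\"http://www.megginson.com/SAX/\">SAX</a>"
            + "<a name=\"top\">no link here</a>"
            + "</body></html>";
        for (String link : collect(page)) {
            System.out.println(link);
        }
    }
}
```

Any XMLReader that emits SAX2 events could be substituted in collect; the handler itself only depends on the startElement callback.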