You could use the DOM interface of Java Tidy and traverse the parse tree, collecting the TEXT_NODE nodes, or use getElementsByTagName("p") to pull out just the paragraph elements.

Andy Quick

----- Original Message -----
From: Spencer Marks <smarks@digisolutions.com>
To: <html-tidy@w3.org>
Sent: March 18, 2000 1:41 PM
Subject: HTML 2 Text

> Hi, I was wondering if there's a way to use Tidy to remove all HTML
> from a page and just get the text.

Received on Sunday, 19 March 2000 14:33:29 GMT
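
The tree traversal suggested in the reply might look roughly like the sketch below. It uses only the standard org.w3c.dom interfaces, so it should work on any DOM Document. The class name TextExtractor and the helpers collectText, extractText and parseXhtml are made up for this illustration. To keep the example self-contained, the demo parses a small well-formed XHTML string with the JDK's own XML parser; with Java Tidy the Document would presumably come from Tidy's parseDOM method instead, which is an assumption not exercised here.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class for illustration; not part of Tidy.
class TextExtractor {

    /** Recursively appends the value of every TEXT_NODE found under node. */
    static void collectText(Node node, StringBuilder out) {
        if (node.getNodeType() == Node.TEXT_NODE) {
            out.append(node.getNodeValue());
            return;
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            collectText(child, out);
        }
    }

    /** Returns the text below root with runs of whitespace collapsed to one space. */
    static String extractText(Node root) {
        StringBuilder out = new StringBuilder();
        collectText(root, out);
        return out.toString().replaceAll("\\s+", " ").trim();
    }

    /**
     * Stand-in for the Tidy step (assumption): parses well-formed XHTML with
     * the JDK parser. Real-world HTML would need Tidy to build the tree.
     */
    static Document parseXhtml(String markup) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(markup)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse markup", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parseXhtml(
                "<html><body><h1>Title</h1>\n<p>Hello <b>world</b>.</p></body></html>");

        // All text in the document.
        System.out.println(extractText(doc));

        // Only the text inside the first <p> element.
        System.out.println(extractText(doc.getElementsByTagName("p").item(0)));
    }
}
```

Note that a plain walk like this also picks up the contents of script and style elements, so those would probably need to be skipped by tag name when extracting text from real pages.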