
Re: HTML 2 Text

From: Andy Quick <ac.quick@sympatico.ca>
Date: Fri, 24 Mar 2000 11:47:56 -0600
To: <html-tidy@w3.org>
Message-ID: <OFB0C49D05.532588F7-ON862568A7.006B8006@rfdinc.com>

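You could use the DOM interface of the Java port of Tidy (JTidy):
parse the page, then either traverse the parse tree and collect the
value of every TEXT_NODE, or call getElementsByTagName("p") if you
only want the text of the paragraphs.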
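
A rough sketch of both approaches, using only the standard org.w3c.dom
interfaces. The class and method names (TextExtractor, collectText,
allText, paragraphText, parseXhtml) are made up for illustration. To keep
the example runnable without extra jars, parseXhtml uses the JDK's XML
parser on a well-formed snippet; with JTidy you would instead obtain the
Document from Tidy.parseDOM, as noted in the comment.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class TextExtractor {

    // Recursively append the value of every TEXT_NODE at or below `node`.
    public static void collectText(Node node, StringBuilder out) {
        if (node.getNodeType() == Node.TEXT_NODE) {
            out.append(node.getNodeValue());
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            collectText(child, out);
        }
    }

    // Approach 1: all text in the document, in document order.
    public static String allText(Document doc) {
        StringBuilder out = new StringBuilder();
        collectText(doc, out);
        return out.toString();
    }

    // Approach 2: text inside <p> elements only, one paragraph per line.
    public static String paragraphText(Document doc) {
        StringBuilder out = new StringBuilder();
        NodeList paragraphs = doc.getElementsByTagName("p");
        for (int i = 0; i < paragraphs.getLength(); i++) {
            collectText(paragraphs.item(i), out);
            out.append('\n');
        }
        return out.toString();
    }

    // Stand-in parser for this demo (assumption: input is well-formed XHTML).
    // With JTidy on the classpath, the Document would come from something like
    //   new org.w3c.tidy.Tidy().parseDOM(inputStream, null)
    public static Document parseXhtml(String xhtml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xhtml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse input", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parseXhtml(
                "<html><body><h1>Title</h1><p>Hello <b>world</b></p></body></html>");
        System.out.println(allText(doc));
        System.out.print(paragraphText(doc));
    }
}
```

Note that the full traversal also picks up text inside elements such as
script or style, so you may want to skip those by node name.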

Andy Quick

----- Original Message -----
From: Spencer Marks <smarks@digisolutions.com>
To: <html-tidy@w3.org>
Sent: March 18, 2000 1:41 PM
Subject: HTML 2 Text


>
> Hi, I was wondering if there's a way to use Tidy to remove all HTML
> from a page and just get the text.
>
Received on Friday, 24 March 2000 14:13:28 GMT
