Re: Using TidyLib as an HTML parser

From: Arnaud Desitter <arnaud02@users.sourceforge.net>
Date: Tue, 22 Jan 2008 12:59:05 +0000
Message-ID: <a240ddd00801220459x307112b8n10303cb1cd854ff9@mail.gmail.com>
To: "John Snelson" <john.snelson@oracle.com>
Cc: html-tidy@w3.org

On 22/01/2008, John Snelson <john.snelson@oracle.com> wrote:
>
> Hi,
>
> I'm trying to use TidyLib as an HTML parser, and would like to generate
> SAX events from the TidyDoc representation of the document. However,
> there doesn't seem to be a way to get the unescaped value of a text
> node, or the unserialized value of a comment or processing instruction.
> I have been using the tidyNodeGetText() method to get the value of these
> node types.
>
> Is there a better way to do what I want? I would be quite happy to
> implement a new API method to do this if that's required - does anyone
> else think this would be useful?

Please refer to http://tidy.sf.net/issue/1636028.
Your contribution to a new API would be welcome. Please post it using the
tidy patch tracker.

Regards,

>
> John
>
> --
> John Snelson, Oracle Corporation            http://snelson.org.uk/john
> Berkeley DB XML:        http://www.oracle.com/database/berkeley-db/xml
> XQilla:                                  http://xqilla.sourceforge.net
Received on Tuesday, 22 January 2008 13:00:35 GMT