Re: Using TidyLib as an HTML parser

On 22/01/2008, John Snelson <john.snelson@oracle.com> wrote:
>
> Hi,
>
> I'm trying to use TidyLib as an HTML parser, and would like to generate
> SAX events from the TidyDoc representation of the document. However,
> there doesn't seem to be a way to get the unescaped value of a text
> node, or the unserialized value of a comment or processing instruction.
> I have been using the tidyNodeGetText() method to get the value of these
> node types.
>
> Is there a better way to do what I want? I would be quite happy to
> implement a new API method to do this if that's required - does anyone
> else think this would be useful?

Please refer to http://tidy.sf.net/issue/1636028.
A contribution adding a new API for this would be welcome. Please submit it
through the Tidy patch tracker.
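
Until such an API exists, one possible stopgap is to post-process the
serialized text that tidyNodeGetText() returns. The sketch below is only an
illustration under stated assumptions: unescape_markup() is a hypothetical
helper, not part of TidyLib, and it decodes only the five predefined XML
entities. Named HTML entities, numeric character references, and the
delimiters around comments and processing instructions are left untouched,
and what tidyNodeGetText() actually emits may depend on the Tidy output
options in effect.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, NOT part of TidyLib.
 * Decodes the five predefined XML entities found in `src` and writes the
 * result to `dst` (at most dst_size - 1 bytes plus a terminating NUL).
 * Everything else, including numeric references such as &#160;, is copied
 * through unchanged. Returns the number of bytes written, excluding NUL. */
static size_t unescape_markup(const char *src, char *dst, size_t dst_size)
{
    static const struct { const char *name; char ch; } entities[] = {
        { "&lt;", '<' }, { "&gt;", '>' }, { "&amp;", '&' },
        { "&quot;", '"' }, { "&apos;", '\'' }
    };
    size_t out = 0;

    assert(src != NULL && dst != NULL && dst_size > 0);

    while (*src != '\0' && out + 1 < dst_size) {
        size_t i;
        int matched = 0;

        if (*src == '&') {
            /* Try each known entity at the current position. */
            for (i = 0; i < sizeof entities / sizeof entities[0]; i++) {
                size_t len = strlen(entities[i].name);
                if (strncmp(src, entities[i].name, len) == 0) {
                    dst[out++] = entities[i].ch;
                    src += len;
                    matched = 1;
                    break;
                }
            }
        }
        if (!matched)
            dst[out++] = *src++;   /* ordinary byte or unknown entity */
    }
    dst[out] = '\0';
    return out;
}
```

The input would be the buffer contents filled in by tidyNodeGetText() for a
text node, and the output could then be passed to a SAX characters callback.
Decoding is single-pass, so a doubly escaped sequence such as "&amp;amp;"
comes out as "&amp;" rather than "&".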

Regards,

>
> John
>
> --
> John Snelson, Oracle Corporation            http://snelson.org.uk/john
> Berkeley DB XML:        http://www.oracle.com/database/berkeley-db/xml
> XQilla:                                  http://xqilla.sourceforge.net
>
>

Received on Tuesday, 22 January 2008 13:00:35 UTC