Getting a SAX stream from JTidy

From: Shahed Moolji <shahed@enoor.com>
Date: Wed, 8 Jan 2003 11:55:52 -0600
Message-ID: <003f01c2b73f$2ff84e60$8c05a8c0@teleformix.tfmx.com>
To: <html-tidy@w3.org>

Hello,

Is it possible to derive a SAX stream from JTidy?
I want to further process the output of parseDOM, and I think SAX-style
processing of the tree would help.
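
One possible route, sketched here rather than confirmed against JTidy, is to
replay an existing DOM tree as SAX events with a JAXP identity transformer
(DOMSource in, SAXResult out). The class name DomToSax and its helper methods
are hypothetical names for illustration. The sketch builds its DOM with the
standard JAXP parser; whether the Document returned by JTidy's parseDOM
implements enough of the DOM interfaces for the transformer to walk it is an
assumption that would need testing.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.sax.SAXResult;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class: replays a DOM tree as a stream of SAX events.
class DomToSax {

    /** Walks the DOM and fires the matching SAX callbacks on the handler. */
    static void replay(Document doc, DefaultHandler handler) throws Exception {
        // A transformer created without a stylesheet performs an identity copy.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.transform(new DOMSource(doc), new SAXResult(handler));
    }

    /** Demo handler: collects element names in document order. */
    static List<String> elementNames(Document doc) throws Exception {
        final List<String> names = new ArrayList<String>();
        replay(doc, new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                names.add(qName);
            }
        });
        return names;
    }

    /** Builds a DOM with the standard JAXP parser (stand-in for JTidy here). */
    static Document parseXml(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }
}
```

With JTidy, the Document passed to replay would come from parseDOM instead of
parseXml, assuming the transformer accepts that tree.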

Alternatively, I tried using JAXP to parse the output of a *clean* document
tree generated by JTidy, but that did not work. I guess it's because JTidy is
more forgiving than other parsers when it builds its DOM tree.


What I really want to do is extract the text block from the example
below:

<SCRIPT>

    <!-- MARK -->
        Text block to be parsed
    <!-- /MARK -->

</SCRIPT>

The DOM tree does not give me offsets into the file for nodes, so I cannot
determine at what position to cut the text from.
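
A minimal sketch of one way around the missing offsets, assuming the page
source (or the text content of the SCRIPT element) is available as a string:
match the two marker comments with a regular expression and take whatever lies
between them. The class name MarkExtractor and the method extract are
hypothetical, and the pattern assumes the markers appear exactly as in the
example above, apart from whitespace.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: pulls out the text between the MARK comments.
class MarkExtractor {

    // DOTALL lets '.' span line breaks; '.*?' stops at the first closing marker.
    private static final Pattern MARKED = Pattern.compile(
            "<!--\\s*MARK\\s*-->(.*?)<!--\\s*/MARK\\s*-->", Pattern.DOTALL);

    /** Returns the trimmed text between the markers, or null if none is found. */
    static String extract(String text) {
        Matcher m = MARKED.matcher(text);
        return m.find() ? m.group(1).trim() : null;
    }
}
```

Because the match runs on a string, no node offsets are needed. If the
positions themselves matter, Matcher.start(1) and Matcher.end(1) report where
the block sits in the input.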


Thanks
Shahed
Received on Wednesday, 8 January 2003 12:40:56 GMT