- From: Charles Reitzel <creitzel@rcn.com>
- Date: Tue, 20 Aug 2002 12:19:39 -0400
- To: "Martin Jericho" <mart1041@yahoo.com.au>
- Cc: <html-tidy@w3.org>
I think the simple answer is that the parsed text is not null-terminated, and care must be taken when accessing these fields. Further, Tidy (and JTidy) stores characters internally as UTF-8 (most of the time), so you'll probably need to unmangle the text before you can use it.

It's probably easier to use Tidy to emit clean XHTML and then use SAX or DOM tools to manipulate the tree.

take it easy,
Charlie

At 05:33 PM 8/20/2002 +1000, Martin Jericho wrote:
>Is there any reason why the fields start, end, content, next, etc. of Node
>are protected, and without getter methods?
>I would like to parse an HTML file, and simply get the start and end
>positions of certain elements. How can I do this when the start and end
>fields are protected?
>Thanks for any help
>Martin
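
A minimal sketch of the SAX route suggested in the reply, assuming Tidy has already been run and its XHTML output is available as a string. It uses only the JDK's SAX parser and a Locator to record where each element's start tag is reported. The class name ElementPositions and the method locate are made up for illustration and are not part of Tidy or JTidy. Note that the positions refer to the tidied XHTML, not to the original HTML file, and that a SAX Locator reports the position just after the text of each event (here, the end of the start tag), not where the tag begins.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of Tidy or JTidy.
class ElementPositions {

    // Returns one "name@line:column" entry per start tag, in document order.
    // Line and column are whatever the SAX Locator reports when the
    // startElement event fires, i.e. the end of the start tag.
    static List<String> locate(String xhtml) throws Exception {
        final List<String> found = new ArrayList<>();

        DefaultHandler handler = new DefaultHandler() {
            private Locator locator;

            @Override
            public void setDocumentLocator(Locator locator) {
                // The parser hands us the locator before any other event.
                this.locator = locator;
            }

            @Override
            public InputSource resolveEntity(String publicId, String systemId) {
                // Assumption: we do not want to fetch the XHTML DTD over the
                // network, so any external entity resolves to empty content.
                // Named entities such as &nbsp; would then be undefined.
                return new InputSource(new StringReader(""));
            }

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                int line = (locator == null) ? -1 : locator.getLineNumber();
                int col = (locator == null) ? -1 : locator.getColumnNumber();
                found.add(qName + "@" + line + ":" + col);
            }
        };

        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xhtml)), handler);
        return found;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for Tidy's XHTML output.
        String xhtml = "<html>\n<body>\n<p>hello</p>\n</body>\n</html>";
        for (String entry : locate(xhtml)) {
            System.out.println(entry);
        }
    }
}
```

To find only certain elements, filter on qName inside startElement; an endElement override can record the matching end position in the same way.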
Received on Tuesday, 20 August 2002 12:11:38 UTC