Re: [JTidy] Why are the all fields of org.w3c.tidy.Node protected?

From: Charles Reitzel <creitzel@rcn.com>
Date: Tue, 20 Aug 2002 12:19:39 -0400
Message-Id: <4.3.2.7.2.20020820121512.029f6b70@pop.rcn.com>
To: "Martin Jericho" <mart1041@yahoo.com.au>
Cc: <html-tidy@w3.org>

I think the simple answer is that the parsed text is not null-terminated,
so care must be taken when accessing these fields.  Further, Tidy (and
JTidy) stores text internally as UTF-8 (most of the time), so you'll
probably need to decode it before you can use it.  It's probably easier
to use Tidy to emit clean XHTML and then use SAX or DOM tools to
manipulate the tree.
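
To make the first point concrete, here is a rough sketch of what reading such a field involves.  It assumes the text sits in a shared byte buffer and that start and end are byte offsets into it, with end exclusive.  The class name, the method name, and the buffer layout are my own illustration, not JTidy's actual API.

```java
import java.nio.charset.StandardCharsets;

public class NodeTextSketch {

    /**
     * Decodes the bytes in [start, end) as UTF-8.
     * Hypothetical helper: the buffer is not null-terminated, so the
     * bounds are the only thing that says where the text stops.
     */
    public static String decodeSlice(byte[] buf, int start, int end) {
        if (buf == null || start < 0 || end < start || end > buf.length) {
            throw new IllegalArgumentException("bad slice bounds");
        }
        // The length is counted in bytes, not characters.
        return new String(buf, start, end - start, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Stand-in for a parser's internal buffer (an assumption).
        byte[] buf = "<p>caf\u00e9</p>".getBytes(StandardCharsets.UTF_8);
        // "<p>" takes 3 bytes; the word after it takes 5 bytes because
        // the accented letter is 2 bytes in UTF-8.
        System.out.println(decodeSlice(buf, 3, 8));
    }
}
```

Note that the offsets count bytes in the buffer, so they will not line up with character positions once the text contains anything outside ASCII.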

take it easy,
Charlie

At 05:33 PM 8/20/2002 +1000, Martin Jericho wrote:
>Is there any reason why the fields start, end, content, next, etc of Node 
>are protected, and without getter methods?
>I would like to parse an HTML file, and simply get the start and end 
>positions of certain elements.  How can I do this when the start and end 
>fields are protected?
>Thanks for any help
>Martin
Received on Tuesday, 20 August 2002 12:11:38 GMT