Re: JavaScript and Java HTML Tidy

On Thu, 13 Apr 2000, jorge gutierrez wrote:

> We are building an applet that needs to parse some HTML response
> pages from a server. These pages are dirty, so we are planning to
> use Java Tidy to clean up the code the applet receives (we can't
> modify the application that generates these pages).
> 
> There is lots of JavaScript code inside the comments in these pages
> (with plenty of errors), and Tidy seems to inspect the comments, so it
> stops parsing and doesn't even generate the DOM tree.
> 
> Is there any way to disable parsing inside comments/scripts?

The ANSI C version of Tidy treats the contents of comments
and scripts as CDATA and preserves the text as is. Comments are
treated as special nodes in the tree, while script and style are
treated as regular CDATA elements.
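
To make the idea concrete, here is a rough Python sketch of what
"treating the contents as CDATA" means. It is not Tidy's code, and the
names (split_opaque, _OPAQUE) are made up for this illustration. It
only shows a scanner that copies the text of comments and of
script/style elements through verbatim instead of parsing it:

```python
import re

# Hypothetical illustration, not Tidy's implementation.
# Comments and <script>/<style> bodies are matched as opaque text,
# so markup-like characters inside them are never interpreted.
_OPAQUE = re.compile(
    r"<!--(?P<comment>.*?)-->"
    r"|<(?P<tag>script|style)\b[^>]*>(?P<body>.*?)</(?P=tag)\s*>",
    re.IGNORECASE | re.DOTALL,
)


def split_opaque(html):
    """Split html into (kind, text) pairs.

    kind is 'markup', 'comment', 'script' or 'style'. For comments and
    script/style elements, text is the untouched inner content; the
    delimiters and start/end tags themselves are dropped.
    """
    nodes = []
    pos = 0
    for m in _OPAQUE.finditer(html):
        if m.start() > pos:
            # Ordinary markup between two opaque regions.
            nodes.append(("markup", html[pos:m.start()]))
        if m.group("comment") is not None:
            nodes.append(("comment", m.group("comment")))
        else:
            nodes.append((m.group("tag").lower(), m.group("body")))
        pos = m.end()
    if pos < len(html):
        nodes.append(("markup", html[pos:]))
    return nodes
```

With this approach, broken JavaScript inside a comment or script
element cannot upset the rest of the parse, because only the 'markup'
pieces would ever be handed on to an HTML parser.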

I guess I am misunderstanding what is going wrong for you. Perhaps
you could send me some examples to make it clearer. Note that I
only maintain the C version and not the Java version of Tidy.


Regards,

-- Dave Raggett <dsr@w3.org> http://www.w3.org/People/Raggett
tel/fax: +44 122 578 3011 (or 2521) +44 778 532 0444 (mobile)
World Wide Web Consortium (on assignment from HP Labs)

Received on Thursday, 20 April 2000 14:21:53 UTC