- From: Dave Raggett <dsr@w3.org>
- Date: Thu, 20 Apr 2000 19:21:41 +0100 (GMT Daylight Time)
- To: jgutierrez@siconet.es
- cc: html-tidy@w3.org, gerald@w3.org
On Thu, 13 Apr 2000, jorge gutierrez wrote: > We are building an applet that needs to parse some html response > pages from a server. This pages are dirty so we are plannig to > use Java Tidy to clean up the code the applet receives (we can't > modify the application that generates this pages). > > There is lodsa javascript code into the comments in this pages > (plenty of errors) and Tidy seems to inspect the comments so it > stops at the parsing and doesn't even generate the DOM tree. > > Is there no any way to disable the parsing into comments/scripts? The ANSI C version of Tidy treats the contents of comments and scripts as CDATA and preserves the text as is. Comments are treated as special nodes in the tree, while script and style are treated as regular CDATA elements. I guess I am misunderstanding what is going wrong for you. Perhaps you could send me some examples to make it clearer. Note that I only maintain the C version and not the Java version of Tidy. Regards, -- Dave Raggett <dsr@w3.org> http://www.w3.org/People/Raggett tel/fax: +44 122 578 3011 (or 2521) +44 778 532 0444 (mobile) World Wide Web Consortium (on assignment from HP Labs)
Received on Thursday, 20 April 2000 14:21:53 UTC