- From: hoenicka_markus <hoenicka_markus@compuserve.com>
- Date: Sat, 21 Oct 2000 01:36:03 -0400 (EDT)
- To: html-tidy list <html-tidy@w3.org>
Hi, I came across an odd behaviour of tidy which might qualify as a bug. The following HTML file looks unusual, but is valid HTML in my understanding (ok, it lacks the type attribute, but that's not the point here): <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"> <html ><head ><title >Test</title ><script language="JScript" ></script ></head ><body ></body ></html> If I run this through tidy, I see a boatload of warnings: Tidy (vers 4th August 2000) Parsing "test.html" line 6 column 2 - Warning: <script> lacks "type" attribute line 8 column -1 - Warning: '<' + '/' + letter not allowed here line 8 column 1 - Warning: '<' + '/' + letter not allowed here line 9 column -1 - Warning: '<' + '/' + letter not allowed here line 10 column 1 - Warning: '<' + '/' + letter not allowed here line 11 column -1 - Warning: '<' + '/' + letter not allowed here line 11 column 1 - Warning: '<' + '/' + letter not allowed here line 11 column 6 - Warning: '<' + '/' + letter not allowed here line 11 column 6 - Warning: missing </script> If I shift one chevron up one line (the line "></script" becomes "></script>"): <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"> <html ><head ><title >Test</title ><script language="JScript" ></script> </head ><body ></body ></html> then all those warnings are gone (except the type attribute). I have seen some comments in the archive about the special treatment of the </script> end tag, but this should not apply here. To me it looks like tidy accepts the script CDATA as correctly terminated when it finds "</script>". I think it should accept it when it finds "</script" (zero or more whitespace) ">". This would be consistent with the termination of all other elements. The background of this somewhat odd question is (Open)Jade. This tool can generate HTML output from SGML sources. It creates linebreaks like in the examples above, as these are the only locations for linebreaks where Jade can be sure not to screw up the contents. I am currently writing DSSSL stylesheets that insert JavaScript/JScript stuff into the generated HTML pages. Both Lynx and tidy cannot handle the automatically generated HTML properly, while IE and Netscape are fine. Please let me know if I'm off track here. regards, Markus hoenicka_markus@compuserve.com
Received on Saturday, 21 October 2000 01:36:41 UTC