Re: tidy and scripts

"Jany Quintard" <jany.quintard@fr.ibm.com> wrote:
	It has worked fine since months (years ?), but I have now to include some
	
	Here is an input (very simple):
	<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Frameset//EN">
	<html
	><head
	><title
	>Titre</title
	></head
	><body
	><script
	>document.writeln ("</frameset></frameset></frameset>" ) ;
                            ^^

The <script> element ends there.  There was a rather nasty design botch
in SGML, which HTML inherited.  Elements whose content is CDATA or
RCDATA end at the first "</", even if what follows it is not the
element's own end tag.
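
To make that concrete, here is a rough sketch of the rule as I have
just described it.  It is my own illustration, not anything taken from
Tidy's source; the function name cdataContentEnd and the cut-down input
string are made up for the example.

```javascript
// Illustration only: models the rule as stated above, not a real parser.
// contentStart is the index just past the '>' of the <script> start tag.
function cdataContentEnd(html, contentStart) {
  // The content ends at the first "</", whatever follows it.
  const end = html.indexOf("</", contentStart);
  return end === -1 ? html.length : end;
}

// A cut-down version of the input from the original message.
const input = '<script>document.writeln ("</frameset>" ) ;</script>';
const start = input.indexOf(">") + 1;
const scriptContent = input.slice(start, cdataContentEnd(input, start));

// Shows how much of the line actually belongs to the script.
console.log(scriptContent);
```

Under this rule the script content stops just after the opening quote,
so the rest of the line is no longer script at all.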

	</script
	></body
	></html
	>
	and the output (I run tidy with options -i -raw)
	<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Frameset//EN">
	<html>
	  <head>
	    <meta name="generator" content="HTML Tidy, see www.w3.org">
	    <title>Titre</title>
	  </head>
	
	  <body>
	<script type="text/javascript">
	document.writeln ("<\/frameset><\/frameset><\/frameset>" ) ;
	<\/script
	><\/body
	><\/html
	>
	</script>
	  </body>
	</html>
	
	So, the <\/frameset> are not what I want.

They may not be what you WANT, but they are most certainly what you NEED.

	And the last tags are strangely
	mangled and appear twice.

Basically caused by the same problem, I believe.
Fix the script so that it does not contain a literal "</" anywhere,
and see what you get.
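
For what it is worth, here are two ways of spelling the string that keep
"</" out of the script source.  This is only a sketch; the variable
names are mine, and in the page itself you would pass either value to
document.writeln as before.

```javascript
// Two spellings that keep a literal end-tag-open out of the source.

// 1. Backslash escape: in a JavaScript string literal "\/" is just "/".
const escaped = "<\/frameset><\/frameset><\/frameset>";

// 2. Concatenation: the "<" and the "/" never sit next to each other.
const split = "<" + "/frameset>" + "<" + "/frameset>" + "<" + "/frameset>";

// Both produce the same string at run time.
console.log(escaped === split); // true
console.log(escaped);
```

The first spelling is exactly what Tidy wrote into the output, which is
why the script still writes the same three end tags when it runs.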

Received on Sunday, 9 September 2001 20:14:21 UTC