Re: HTML4 + <script><![CDATA[ </ENDTAG> ]]></script>

From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
Date: Fri, 5 Feb 2010 04:56:32 +0100
To: David Dorward <david@dorward.me.uk>
Cc: www-validator@w3.org
Message-ID: <20100205045632351611.eebe4fde@xn--mlform-iua.no>
David Dorward, Mon, 1 Feb 2010 10:51:16 +0000:
 
> ... or you could just use a JS comment rather than depending on hacks 
> designed to avoid having Netscape 2 and friends render JS as text 
> (which would break if the script was placed in an external file).

The Netscape hack comes in handy, to save HTML4 from itself! The two 
examples below validate both as XHTML and as HTML4 - and at the same 
time they eliminate the need for end tag escaping in HTML4. 

That we do not need to escape the end tags also means that we can - 
very easily - place SVG elements and other HTML4 foreign content inside 
the <script> element without being bothered by annoying validation 
error messages.

<script type=""><![CDATA[</><!--
... Variant 1: the script = a comment for HTML4 & CDATA for XHTML
--><script type="">]]></script>

<script type=""><!--</><![CDATA[
... Variant 2: the script = CDATA for HTML4 & a comment for XHTML
]]><script type="">--></script>

The key is to first send a signal to XHTML (<!-- or <![CDATA[) about 
whether the <script> element contains a comment or CDATA - HTML4 
doesn't perceive this signal.  Then we close the <script> element, from 
HTML4's point of view, by inserting the simple and esoteric '</>' tag. 
Thereafter we send the opposite signal to HTML4: If XHTML was told 
that the script contains CDATA, then we serve HTML4 a comment - and 
vice versa.

The last line in each variant begins with the end marker from HTML4's 
point of view. Then a new script start tag is inserted, on behalf of 
HTML4 - XHTML doesn't see this start tag as a start tag. Inside the new 
script element we place the end marker which only XHTML sees (since the 
only thing HTML4 looks for inside <script> is end tags). Voilà!
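
The XHTML half of this can be checked with any XML parser. Below is a 
minimal sketch in Python, assuming only the standard library's 
xml.etree.ElementTree; the helper name xml_view is made up for this 
illustration. It only shows what an XML parser sees, and says nothing 
about how an SGML parser or a real browser treats the '</>' tag.

```python
import xml.etree.ElementTree as ET

# The two variants from this message, copied verbatim.
VARIANT_1 = (
    '<script type=""><![CDATA[</><!--\n'
    '... Variant 1: the script = a comment for HTML4 & CDATA for XHTML\n'
    '--><script type="">]]></script>'
)
VARIANT_2 = (
    '<script type=""><!--</><![CDATA[\n'
    '... Variant 2: the script = CDATA for HTML4 & a comment for XHTML\n'
    ']]><script type="">--></script>'
)


def xml_view(markup):
    """Hypothetical helper: return (character data, child element count)
    of the <script> element as an XML parser reports them."""
    script = ET.fromstring(markup)
    # ElementTree drops comments by default, so a script whose whole
    # content is a comment has no text at all.
    return (script.text or ""), len(script)


text_1, children_1 = xml_view(VARIANT_1)
text_2, children_2 = xml_view(VARIANT_2)

print(repr(text_1), children_1)
print(repr(text_2), children_2)
```

In Variant 1 the whole content, including the '</>' and the second 
<script type=""> start tag, arrives as plain character data. In 
Variant 2 the parser sees nothing but a comment, so the element has no 
text and no child elements.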

In general, I think I will recommend the first variant. The reason is 
that in HTML4, once we have used the </>, we are syntactically outside 
the <script> element - even if User Agents don't see it like that. And 
since <![CDATA[ ]]> is not permitted to appear as a direct child of 
<body>, Variant 2 carries the risk that the CDATA section makes your 
document invalid as an HTML4 document.

Of course, to function as actual scripts, one may also have to escape 
the first and last line of each script using script comments - e.g. 
like this, for JavaScript:

<script type="">//<![CDATA[</><!--
document.write('<p>Variant 1</p>');
//--><script type="">]]></script>

<script type="">//<!--</><![CDATA[
document.write('<p>Variant 2</p>');
//]]><script type="">--></script>
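
As a rough check of the JavaScript form of Variant 1, the sketch below 
(Python again, standard library only, XML view only) extracts the text 
that an XML parser would hand over as the script's content. The 
variable names are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

# JavaScript form of Variant 1, copied verbatim from this message.
JS_VARIANT_1 = (
    '<script type="">//<![CDATA[</><!--\n'
    "document.write('<p>Variant 1</p>');\n"
    '//--><script type="">]]></script>'
)

# The leading '//' is ordinary text and the CDATA section follows it,
# so the parser reports both together as the element's text.
script_text = ET.fromstring(JS_VARIANT_1).text
lines = script_text.split("\n")

for line in lines:
    print(line)
```

The first and last of the three lines begin with '//', so only the 
document.write line in between is left as executable JavaScript.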
-- 
leif halvard silli
Received on Friday, 5 February 2010 05:46:22 GMT
