Re: HTML4 + <script><![CDATA[ </ENDTAG> ]]></script>

From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
Date: Mon, 1 Feb 2010 11:03:03 +0100
To: David Dorward <david@dorward.me.uk>
Cc: www-validator@w3.org
Message-ID: <20100201110303507439.fb44bb0f@xn--mlform-iua.no>
David Dorward, Mon, 1 Feb 2010 07:56:15 +0000:
> On 1 Feb 2010, at 06:19, Leif Halvard Silli wrote:
> 
>> The validator doesn't consider the following code as valid HTML4 (HTML 
>> four):
>> 
>> <script type="text/javascript">//<![CDATA[
>>   document.write("<aa><bb></bb></aa>");
>> //]]></script>
> 
> Since <script> elements are defined as containing CDATA, I assume the 
> <![CDATA marker is (supposed to be) treated as character data and not 
> markup. The </ of </bb> is then considered to be an end tag which 
> fails to match the opening <script> tag.

So you say that placing "<![CDATA[" there is valid, but has no 
effect on the escaping needs ... 

>> There are 3 reasons why this bug is important to fix:
>> 
>> (1) That the validator wrongly stamps the first example as invalid 
>> creates the impression that it is very difficult to embed javascript in 
>> a way that is valid both inside XHTML and inside HTML4. 
> 
> It is difficult. The HTML compatibility guidelines for XHTML 
> recommend using external scripts.

You meant: "It _is_ difficult", I presume. ;-)

But nevertheless: this means that the HTML4 parser compatibility 
guidelines are incomplete.

http://www.w3.org/TR/xhtml-media-types/
http://www.w3.org/TR/xhtml1/guidelines.html

Or, why don't the compatibility guidelines mention that, for 
embedding, one should use BOTH "\/" and "<![CDATA[ ...]]>" 
simultaneously? Instead they jump straight to saying that you should 
use external scripts?!

So: If you develop a script to be embedded freely in both HTML4 
documents and XHTML documents, then you must escape both the 
HTML4 way and the XHTML1 way:

<script type="text/javascript"><![CDATA[
<abc><\/abc>
]]></script>
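
As a rough sketch of the double escaping described above, the helper 
below applies both steps to a script body. The name wrapInlineScript 
is hypothetical and not taken from any specification or library. It 
assumes that "</" only occurs inside string literals (a blind 
replacement would break code such as a regular expression directly 
after "<"), and it borrows the "//" prefix on the CDATA markers from 
the first example in this thread, so that the markers are also hidden 
from the javascript interpreter:

```javascript
// Hypothetical helper: prepares a script body for inline use in both
// HTML4 and XHTML. Naive sketch; assumes "</" only appears in strings.
function wrapInlineScript(source) {
  // In XHTML, "]]>" would close the CDATA section too early.
  if (source.includes("]]>")) {
    throw new Error('source contains "]]>", which would end the CDATA section');
  }
  // HTML4 way: hide every end-tag opener "</" from the SGML parser.
  const escaped = source.replace(/<\//g, "<\\/");
  // XHTML way: wrap in a CDATA section, with the markers placed
  // behind "//" so that javascript reads them as comments.
  return [
    '<script type="text/javascript">//<![CDATA[',
    escaped,
    "//]]></script>",
  ].join("\n");
}

const html = wrapInlineScript('document.write("<aa><bb></bb></aa>");');
console.log(html);
```

The middle line of the result no longer contains any "</", so an 
HTML4 parser only sees an end tag at the real </script>.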

And, in addition, you should also take care of the javascript 
interpreter. Instead of recommending not to use HTML comments at all, 
as the guidelines do, I would recommend this:

<script type="text/javascript"><!---><![CDATA[
document.write('<abc>abc<\/abc>');
<!---->]]></script>

This works because javascript interpreters require no more than that 
the first line begins with "<!--" in order to accept that the line is 
a comment. If the code also ends with a "-->", then it can as well be 
interpreted as a valid HTML comment.
 
>> (2) In addition, it is also useful within HTML4! Because: the HTML4 
>> specification (as well as the validator) requires that end tags inside 
>> the <script> element are escaped - in order to be valid SGML. The HTML4 
>> spec gives the following example as example of _one_ way that one can 
>> escape the code so that the code is valid SGML both before and after 
>> script execution: "<\/b>".
> 
> Yes

See above.
 
>> *However*, the <![CDATA[ ... ]]> syntax for 
>> marking up a section where escaping is not necessary is documented in 
>> the HTML4 specification as well.
> 
> But overruled, I believe, by: "Although the STYLE and SCRIPT elements 
> use CDATA for their data model, for these elements, CDATA must be 
> handled differently by user agents. Markup and entities must be 
> treated as raw text" 
> <http://www.w3.org/TR/html4/types.html#type-cdata>

OK, thanks David. Much appreciated. It would really be helpful if the 
HTML4 validator recommended escaping with "\/" instead of (only) telling 
us to use external script files. Or instead of the cryptic message that 
the HTML4 validation service now gives us. In addition, there should be 
some *XHTML* compatibility guidelines for *HTML4*. ;-)
-- 
leif halvard silli
Received on Monday, 1 February 2010 10:03:38 GMT