
[Bug 8852] New: HTML4 validator doesn't accept <![CDATA[ <tag></tag> ]]> inside the <script> element

From: <bugzilla@wiggum.w3.org>
Date: Mon, 01 Feb 2010 04:48:29 +0000
To: www-validator-cvs@w3.org
Message-ID: <bug-8852-169@http.www.w3.org/Bugs/Public/>
http://www.w3.org/Bugs/Public/show_bug.cgi?id=8852

           Summary: HTML4 validator doesn't accept <![CDATA[ <tag></tag> ]]>
                    inside the <script> element
           Product: Validator
           Version: HEAD
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: major
          Priority: P1
         Component: Parser
        AssignedTo: dave.null@w3.org
        ReportedBy: xn--mlform-iua@xn--mlform-iua.no
         QAContact: www-validator-cvs@w3.org


The HTML4 validator does not accept the following <script> example as valid
HTML4 (whereas Validator.nu accepts it as valid HTML5):

<script type="text/javascript"><![CDATA[
    document.write("<aa><bb></bb></aa>");
]]></script>

In comparison, the following *does* validate as HTML4 (while Validator.nu
*doesn't* accept it as HTML5):

<p><![CDATA[
<aa><bb></bb></aa>
]]></p>

Both code examples should be accepted as valid HTML4!

According to the spec ( http://www.w3.org/TR/html4/appendix/notes.html#h-B.3.5
), HTML4 does include support for marked sections with CDATA content. CDATA
marked sections let authors write tags literally, without escaping them:
<![CDATA[ <tag>marked section</tag> ]]>.

If one wants to create polyglot JavaScript scripts for use in both HTML4 and
XHTML documents, it is crucial that the validator gives correct information
about the validity of <![CDATA[ ... ]]>, since <![CDATA[ ... ]]> sections are
needed in order to embed scripts in XHTML. The current validator bug makes it
seem unnecessarily difficult to create "polyglot scripts".

Some background to explain a possible counterargument:

Section 18.2.4 of HTML4
(http://www.w3.org/TR/html4/interact/scripts.html#h-18.2.4) gives a scripting
example where the end tag "</b>" inside the script element has been escaped
with the backslash character: "<\/b>". This is done so that the code does not
break the SGML parsing rules, under which the first occurrence of "</" inside
the script element would be treated as the end of the element's content.
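
To illustrate why the backslash escaping is harmless on the JavaScript side,
here is a minimal sketch (the variable names are mine, not from the spec). It
assumes only that "\/" in a JavaScript string literal denotes the same
character as "/", so the escape changes the source text seen by the SGML
parser but not the string the script produces:

```javascript
// In a JavaScript string literal, "\/" denotes the same character as "/".
const escaped = "<b>bold<\/b>";   // source text avoids the "</" sequence
const plain = "<b>bold</b>";      // source text contains "</"

console.log(escaped === plain);   // true
```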

The code example in section 18.2.4 is supposed to exemplify what is said in the
preceding text:

]]
HTML documents are constrained to conform to the HTML DTD both before and after
processing any SCRIPT elements.
[[


However, nowhere is it stated that one cannot use the <![CDATA[ ... ]]>
construct in HTML4 documents! In XHTML it is customary to precede the start
and end "tags" of the CDATA section with a JavaScript comment marker ("//"), so
that the code is valid both from the XHTML angle and from the JavaScript
angle. For example:

<script type="text/javascript">//<![CDATA[
    document.write("<aa><bb></bb></aa>");
//]]></script>

And this should clearly not be flagged as invalid HTML4 either!
(Sidenote: the script does not appear to run in Internet Explorer (version 6 at
least), nor in WebKit, if the first content of the script element is
"<![CDATA[ ", so there are several reasons to do the escaping.)
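
As a rough sketch of the commented-CDATA pattern above, the following
hypothetical helper (wrapInCommentedCdata is my own name, not part of any
validator or library) wraps script source in the "//<![CDATA[" and "//]]>"
markers. It assumes the source must not itself contain "]]>", since that
sequence would close the CDATA section early when the document is parsed as
XML:

```javascript
// Hypothetical helper, for illustration only.
function wrapInCommentedCdata(source) {
  // "]]>" inside the script would end the CDATA section prematurely in XML.
  if (source.includes("]]>")) {
    throw new Error('script source must not contain "]]>"');
  }
  // The "//" prefixes hide the CDATA markers from the JavaScript engine.
  return "//<![CDATA[\n" + source + "\n//]]>";
}

const body = wrapInCommentedCdata('document.write("<aa><bb></bb></aa>");');
console.log('<script type="text/javascript">' + body + "</script>");
```

This only produces the markup; whether the validator accepts the result as
HTML4 is exactly the question raised in this bug.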


Received on Monday, 1 February 2010 04:48:31 UTC
