Re: CDATA, Script, and Style

From: Simon Pieters <simonp@opera.com>
Date: Wed, 25 Mar 2009 15:24:32 +0100
To: "Jonas Sicking" <jonas@sicking.cc>, "Doug Schepers" <schepers@w3.org>
Cc: "HTML WG" <public-html@w3.org>, www-svg@w3.org
Message-ID: <op.urcqa61cidj3kv@zcorpandell.linkoping.osa>

On Thu, 19 Mar 2009 18:52:25 +0100, Jonas Sicking <jonas@sicking.cc> wrote:

> My feelings on 1 vs. 2 are:
> Problems with 1:
> Parsing <![CDATA[]]> inside a CDATA element "feels" weird. Parsing for
> CDATA has remained largely the same since the dawn of human kind
> (well, the particular branch of human kind that supports SGML). But
> the bigger problem with supporting <![CDATA[]]> inside <script> is that
> it'd break existing HTML content like:
> <script>
> x = "<res><![CDATA[if a < b < c then they are sorted]]></res>";
> var parser = new DOMParser();
> var doc = parser.parseFromString(x, "text/xml");
> xhr = new XMLHttpRequest();
> xhr.open("POST", uri);
> xhr.send(doc);
> </script>
> Problems with 2:
> Just stripping a leading and trailing "<![CDATA[" / "]]>" would break
> markup like:
> <style>
> <![CDATA[
> rect { fill: yellow; }
> ]]>
> <![CDATA[
> circle { fill: blue; }
> ]]>
> </style>
> which probably happens occasionally due to copy-n-pasting.
> So neither solution is perfect. Though I'm thinking that 2 will
> probably cause less trouble with existing content.

(3) Have a "dirty" flag that's initially false and is set to true when you see any non-whitespace other than the string "<![CDATA[". When "<![CDATA[" is seen while the flag is still false, it is stripped and sets the insertion mode to "in CDATA section in CDATA element". That mode eats the next "]]>", switches back to the previous insertion mode, and resets the dirty flag.
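
A rough sketch of (3), just to make the flag handling concrete. It runs over the text already collected for the element rather than inside a real tokenizer, the name stripCdataMarkers is made up for illustration, and it assumes that "<![CDATA[" is only recognised while the dirty flag is false:

```javascript
// Sketch of option (3): strip CDATA markers from the raw text of a CDATA
// element (e.g. <script> or <style>). stripCdataMarkers is a hypothetical
// name, not part of any spec or browser API.
function stripCdataMarkers(text) {
  const OPEN = "<![CDATA[";
  const CLOSE = "]]>";
  let dirty = false;      // true once non-whitespace other than OPEN was seen
  let inSection = false;  // stands in for "in CDATA section in CDATA element"
  let out = "";
  let i = 0;

  while (i < text.length) {
    if (inSection) {
      if (text.startsWith(CLOSE, i)) {
        // Eat the next "]]>", return to the previous mode, reset the flag.
        i += CLOSE.length;
        inSection = false;
        dirty = false;
        continue;
      }
    } else if (!dirty && text.startsWith(OPEN, i)) {
      // Assumption: the marker is only recognised while the flag is false.
      i += OPEN.length;
      inSection = true;
      continue;
    } else if (!/\s/.test(text[i])) {
      dirty = true;
    }
    out += text[i];
    i += 1;
  }
  return out;
}

// Jonas's two examples: both <style> sections are unwrapped, while the
// <script> text is left alone because "x" sets the dirty flag first.
const style = "\n<![CDATA[\nrect { fill: yellow; }\n]]>\n<![CDATA[\ncircle { fill: blue; }\n]]>\n";
const script = 'x = "<res><![CDATA[if a < b < c then they are sorted]]></res>";';
console.log(stripCdataMarkers(style));
console.log(stripCdataMarkers(script) === script); // true
```

Because the flag is reset after each "]]>", the copy-pasted <style> case with two sections works, and because any earlier non-whitespace sets it, a "<![CDATA[" inside a script string is left untouched.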

However, this still wouldn't handle stuff like <script><![CDATA[x = "<res><script></script></res>"]]></script>, where the inner </script> would still end the element (the <script><!--...--></script> syntax supports this case).

Also, should we support CDATA sections in RCDATA elements? If so, should they prevent entities from being expanded there?

(4) Make "<![CDATA[" and "]]>" equivalent to "<!--" and "-->" in (R)CDATA elements, except that they are stripped.

I think in general you'd be pretty lucky if you didn't have to modify scripts in SVG when pasted into text/html, so requiring authors to remove the CDATA strings or prepend them with // isn't too much to ask for, IMHO. (Therefore I continue to think that it's ok to not support CDATA sections anywhere in text/html -- assuming that SVG <script> becomes a CDATA element, that is.)

Simon Pieters
Opera Software
Received on Wednesday, 25 March 2009 14:25:15 UTC
