Re: CDATA, Script, and Style

From: Jonas Sicking <jonas@sicking.cc>
Date: Tue, 31 Mar 2009 15:08:56 -0700
Message-ID: <63df84f0903311508v12135da8p95ebc81dd1d58dae@mail.gmail.com>
To: Henri Sivonen <hsivonen@iki.fi>
Cc: Simon Pieters <simonp@opera.com>, Doug Schepers <schepers@w3.org>, HTML WG <public-html@w3.org>, www-svg@w3.org
On Tue, Mar 31, 2009 at 2:30 AM, Henri Sivonen <hsivonen@iki.fi> wrote:
> On Mar 25, 2009, at 16:24, Simon Pieters wrote:
>> On Thu, 19 Mar 2009 18:52:25 +0100, Jonas Sicking <jonas@sicking.cc>
>> wrote:
>>> My feelings on 1 vs. 2 are:
>>> Problems with 1:
>>> Parsing <![CDATA[]]> inside a CDATA element "feels" weird.
> I agree that it feels weird.
> I think the biggest problem with this entire issue is that the difference
> between HTML <script> and <script> in XML is surprising and unintuitive, so
> we will have a surprise boundary somewhere no matter what. It seems on the
> general level we have the following options:
>  1) Have the surprise boundary between text/html and XML. (The situation
> before SVG-in-text/html)
>  2) Have the surprise boundary between HTML <script> in text/html and
> everything else. (The situation with SVG-in-text/html as drafted)
>  3) Have graded surprises with two boundaries:
>    a) Have a surprise boundary between HTML <script> and SVG-in-text/html
> <script> and another between SVG-in-text/html <script> and XML.
>    b) Have a surprise boundary between pre-HTML5 <script> and HTML5
> text/html <script>s and another between text/html and XML.
> I'm worried about escaping surprises in general having seen the RSS <title>
> epic fail.

I'm a little unclear as to what the behaviors in 3 are. I.e. which
parsing/processing algorithms would lead to the two scenarios you

I'm also unclear as to what behavior you are proposing. How do you
feel about my proposal in


It would result in a graded surprise where there's some change in
HTML <script> parsing between HTML4 and HTML5, and some surprise at
the boundary between SVG-in-HTML and SVG-in-XML.

>>> Problems with 2:
>>> Just stripping a leading and trailing "<![CDATA[" / "]]>" would break
>>> markup like:
>>> <style>
>>> <![CDATA[
>>> rect { fill: yellow; }
>>> ]]>
>>> <![CDATA[
>>> circle { fill: blue; }
>>> ]]>
>>> </style>
>>> which probably happens occasionally due to copy-n-pasting.
> I don't like this, because it requires going back and modifying buffers that
> had been already built instead of just tweaking forward-only tokenizer state
> transitions, and it doesn't even work in the case where there are multiple
> CDATA sections as shown above. If we end up doing something other than
> what's currently in the draft, I'd much rather have what Simon proposes
> as #4.

The stripping doesn't happen at a tokenizer stage. It happens after
all parsing is done when the inline data is taken from the DOM and
passed to the serializer. See the details in the link above.
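
To make the multiple-section problem quoted above concrete, here is a rough sketch of what stripping one leading and one trailing marker does to text that has already been read out of the DOM. The function name and the choice to ignore surrounding whitespace are my assumptions for illustration only; this is not the exact algorithm from the proposal.

```python
CDATA_OPEN = "<![CDATA["
CDATA_CLOSE = "]]>"


def strip_outer_cdata(text: str) -> str:
    """Drop one leading "<![CDATA[" and one trailing "]]>".

    Hypothetical helper: it works on the finished text content of a
    <style> or <script> element, not on tokenizer state. Whitespace
    around the markers is ignored (an assumption).
    """
    stripped = text.strip()
    if stripped.startswith(CDATA_OPEN) and stripped.endswith(CDATA_CLOSE):
        return stripped[len(CDATA_OPEN):-len(CDATA_CLOSE)]
    # No outer markers found: hand the text back unchanged.
    return text


# One CDATA section: the markers disappear and only CSS is left.
single = "\n<![CDATA[\nrect { fill: yellow; }\n]]>\n"

# Two CDATA sections, as in the quoted <style> example: the inner
# "]]>" and "<![CDATA[" survive and end up in the middle of the CSS.
double = (
    "\n<![CDATA[\nrect { fill: yellow; }\n]]>\n"
    "<![CDATA[\ncircle { fill: blue; }\n]]>\n"
)

if __name__ == "__main__":
    print(repr(strip_outer_cdata(single)))
    print(repr(strip_outer_cdata(double)))
```

With a single section the result is clean CSS; with two sections the inner markers are left behind, which is the breakage described for option 2.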

/ Jonas
Received on Tuesday, 31 March 2009 22:09:57 UTC
