- From: Henri Sivonen <hsivonen@iki.fi>
- Date: Mon, 6 Apr 2009 21:54:47 +0300
- To: Jonas Sicking <jonas@sicking.cc>
- Cc: Simon Pieters <simonp@opera.com>, Doug Schepers <schepers@w3.org>, HTML WG <public-html@w3.org>, "www-svg@w3.org" <www-svg@w3.org>
On Apr 1, 2009, at 10:37, Jonas Sicking wrote:
>>>>>> Problems with 2:
>>>>>> Just stripping a leading and trailing "<![CDATA[" / "]]>" would
>>>>>> break markup like:
>>>>>> <style>
>>>>>> <![CDATA[
>>>>>> rect { fill: yellow; }
>>>>>> ]]>
>>>>>> <![CDATA[
>>>>>> circle { fill: blue; }
>>>>>> ]]>
>>>>>> </style>
>>>>>>
>>>>>> which probably happens occasionally due to copy-n-pasting.
>>>>
>>>> I don't like this, because it requires going back and modifying
>>>> buffers that had already been built instead of just tweaking
>>>> forward-only tokenizer state transitions, and it doesn't even work
>>>> in the case where there are multiple CDATA sections as shown above.
>>>> If we end up doing something other than what's currently in the
>>>> draft, I'd much rather have what Simon proposes as #4.
>>>
>>> The stripping doesn't happen at a tokenizer stage. It happens after
>>> all parsing is done when the inline data is taken from the DOM and
>>> passed to the serializer.
>>
>> Do you mean passed to the script engine?
>
> Yes, thanks.
>
>> So the string "<![CDATA[" would appear in the content of the text
>> node in the DOM?
>
> Yes

If "<![CDATA[" ends up in the DOM, I think the end result could be
made more robust if the operation of handing DOM data to the CSS or JS
parser didn't try to drop "<![CDATA[" and "]]>" but instead the JS and
CSS parsers were changed to treat those strings as comments, i.e. like
"/* */". This way, they wouldn't be dropped from within any string
literals that happen to contain them.

This approach would cause notable leakage of the SVG-in-text/html
feature into other parts of a browser engine, though, which isn't very
nice.
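
To make the difference concrete, here is a rough Python sketch contrasting
the two options. It is only an illustration: the function names are made
up, and the "scanner" is a toy that knows about quoted string literals and
nothing else (no real comments, no regex literals), so it is not how any
actual JS or CSS parser works.

```python
CDATA_OPEN = "<![CDATA["
CDATA_CLOSE = "]]>"


def strip_everywhere(source: str) -> str:
    """Naive option: drop the markers wherever they occur in the text."""
    return source.replace(CDATA_OPEN, "").replace(CDATA_CLOSE, "")


def skip_outside_strings(source: str) -> str:
    """Toy scanner (hypothetical): ignore the markers the way a parser
    ignores a comment, but leave string literals untouched."""
    out = []
    i = 0
    quote = None  # quote character of the string literal we are inside, if any
    while i < len(source):
        ch = source[i]
        if quote is not None:
            out.append(ch)
            if ch == "\\" and i + 1 < len(source):
                # Keep the escaped character so an escaped quote
                # does not end the literal.
                out.append(source[i + 1])
                i += 2
                continue
            if ch == quote:
                quote = None
            i += 1
        elif ch in "\"'":
            quote = ch
            out.append(ch)
            i += 1
        elif source.startswith(CDATA_OPEN, i):
            i += len(CDATA_OPEN)  # skipped like a comment
        elif source.startswith(CDATA_CLOSE, i):
            i += len(CDATA_CLOSE)  # skipped like a comment
        else:
            out.append(ch)
            i += 1
    return "".join(out)


sample = 'var s = "<![CDATA[";\n<![CDATA[\nrun(s);\n]]>'
print(strip_everywhere(sample))      # the string literal loses its content
print(skip_outside_strings(sample))  # the string literal is preserved
```

With the sample input, the naive version turns the string literal into an
empty string, while the comment-like handling only discards the markers
that sit outside the literal.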

Also, I'm a bit concerned that letting "<![CDATA[" and "]]>" reach the
DOM would result in those strings being escaped as "&lt;![CDATA[" and
"]]&gt;" if serialized to XML, so going back and forth a couple of
times through a real serializer and via copying and pasting would
result in some ugly cruft.

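
As a small illustration of that cruft, the following Python sketch uses
the standard library's xml.sax.saxutils.escape as a stand-in for an XML
serializer's text escaping. The helper name serialize_text is made up,
and the second pass assumes the already-escaped markup gets pasted back
in as literal text before being serialized again.

```python
from xml.sax.saxutils import escape


def serialize_text(text: str) -> str:
    """Stand-in for how an XML serializer escapes a text node (assumption:
    only &, < and > are escaped, which is what escape() does by default)."""
    return escape(text)


# Text node content as it would sit in the DOM if the markers got through.
text_node = "<![CDATA[\nrect { fill: yellow; }\n]]>"

once = serialize_text(text_node)
# Assumed round trip: the escaped output is pasted as literal text
# and then serialized a second time.
twice = serialize_text(once)

print(once)
print(twice)
```

The first pass yields "&lt;![CDATA[" and "]]&gt;"; the second pass escapes
the ampersands as well, giving "&amp;lt;![CDATA[" and "]]&amp;gt;".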
>> What about <![CDATA[ in SVG subtrees outside <script> and <style>?
>> It's useful for graceful degradation but still involves feedback to
>> the tokenizer unless supported anywhere outside foreign content as
>> well.
>
> I think that is mostly an orthogonal issue. But I would like
> <![CDATA[ ]]> to be parsed as in XML both in foreign content mode
> and in normal mode, to keep things consistent.

I think it's relevant in two ways:

1) If the syntax behaves as in XML outside <script> and <style> but
not as in XML inside <script> and <style>, the result may be confusing.

2) Having CDATA sections that behave like XML CDATA sections in HTML5
parsers but like bogus comments in earlier browsers is useful for
hiding SVG text from old browsers for graceful degradation. However,
if this syntax causes feedback from the tree builder to the tokenizer,
we haven't managed to completely eliminate the (non-trivial) feedback
to the tokenizer, meaning the other efforts to do so wouldn't be very
useful.

--
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
Received on Monday, 6 April 2009 18:55:38 UTC