- From: Jonas Sicking <jonas@sicking.cc>
- Date: Wed, 1 Apr 2009 00:37:34 -0700
- To: Henri Sivonen <hsivonen@iki.fi>
- Cc: Simon Pieters <simonp@opera.com>, Doug Schepers <schepers@w3.org>, HTML WG <public-html@w3.org>, "www-svg@w3.org" <www-svg@w3.org>
On Apr 1, 2009, at 0:17, Henri Sivonen <hsivonen@iki.fi> wrote:
>> How do you
>> feel about my proposal in
>>
>> http://lists.w3.org/Archives/Public/public-html/2009Mar/0634.html
>>
>> It would result in a graded surprise where there's some change in
>> HTML <script> parsing between HTML4 and HTML5, and some surprise at
>> the boundary between SVG-in-HTML and SVG-in-XML.
>
> If this happened in the parser, it would result in <![CDATA[ ... ]]>
> in text/html parsing differently from both XML and previous
> text/html behavior. I think that could be confusing to authors who try
> to form a coherent mental model of the languages they are working with.
No, no changes are intended on the parser side for HTML.
> However, if <![CDATA[ ... ]]> remains in the DOM and is only
> stripped from the data in the JavaScript parser or the CSS parser, I
> suppose that model could count as coherent with the current <!-- -->
> treatment model for script and style in text/html.
Yes, that is the idea.
>>>>> Problems with 2:
>>>>> Just stripping a leading and trailing "<![CDATA[" / "]]>" would
>>>>> break markup like:
>>>>> <style>
>>>>> <![CDATA[
>>>>> rect { fill: yellow; }
>>>>> ]]>
>>>>> <![CDATA[
>>>>> circle { fill: blue; }
>>>>> ]]>
>>>>> </style>
>>>>>
>>>>> which probably happens occasionally due to copy-n-pasting.
>>>
>>> I don't like this, because it requires going back and modifying
>>> buffers that had already been built instead of just tweaking
>>> forward-only tokenizer state transitions, and it doesn't even work
>>> in the case where there are multiple CDATA sections as shown above.
>>> If we end up doing something other than what's currently in the
>>> draft, I'd much rather have what Simon proposes as #4.
>>
>> The stripping doesn't happen at a tokenizer stage. It happens after
>> all parsing is done when the inline data is taken from the DOM and
>> passed to the serializer.
>
> Do you mean passed to the script engine?
Yes, thanks.
> So the string "<![CDATA[" would appear in the content of the text
> node in the DOM?
Yes
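To make the model concrete, here is a minimal sketch in Python, purely for illustration. The names strip_outer_cdata and source_for_engine are hypothetical, and the whitespace handling is an assumption that nothing in this thread pins down. It shows the text node data being left untouched while only the copy handed to the script engine or CSS parser is stripped, and it also shows the multiple-section problem quoted above.

```python
CDATA_OPEN = "<![CDATA["
CDATA_CLOSE = "]]>"


def strip_outer_cdata(text_data: str) -> str:
    """Strip one leading "<![CDATA[" and one trailing "]]>".

    Assumption: whitespace around the markers is ignored. If the
    markers are not both present, the data is returned unchanged.
    """
    stripped = text_data.strip()
    if stripped.startswith(CDATA_OPEN) and stripped.endswith(CDATA_CLOSE):
        return stripped[len(CDATA_OPEN):-len(CDATA_CLOSE)]
    return text_data


def source_for_engine(dom_text: str) -> str:
    """Return the string passed to the script engine or CSS parser.

    The DOM text (dom_text) is not modified; only the returned copy
    has the outer markers removed.
    """
    return strip_outer_cdata(dom_text)


# One CDATA section: the markers are removed from the engine's copy.
single = "\n<![CDATA[\nrect { fill: yellow; }\n]]>\n"
print(repr(source_for_engine(single)))

# Two CDATA sections: the inner "]]>" and "<![CDATA[" are left behind,
# so the CSS parser would still see stray markers.
double = (
    "<![CDATA[\nrect { fill: yellow; }\n]]>\n"
    "<![CDATA[\ncircle { fill: blue; }\n]]>"
)
print(repr(source_for_engine(double)))
```

The second call is the case from the quoted objection: removing only the outermost pair does not help when two sections have been pasted one after the other.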
> I initially thought you meant removing "<![CDATA[" and "]]>" in the
> tree builder.
No
> What about <![CDATA[ in SVG subtrees outside <script> and <style>?
> It's useful for graceful degradation but still involves feedback to
> the tokenizer unless supported anywhere outside foreign content as
> well.
I think that is mostly an orthogonal issue. But I would like
<![CDATA[ ]]> to be parsed as in XML both in foreign content mode
and in normal mode, to keep things consistent.
Opera has experimented with supporting <![CDATA[ ]]> in HTML, and
it seems it does not "break the web".
/ Jonas
Received on Wednesday, 1 April 2009 07:37:58 UTC