- From: Jonas Sicking <jonas@sicking.cc>
- Date: Wed, 1 Apr 2009 00:37:34 -0700
- To: Henri Sivonen <hsivonen@iki.fi>
- Cc: Simon Pieters <simonp@opera.com>, Doug Schepers <schepers@w3.org>, HTML WG <public-html@w3.org>, "www-svg@w3.org" <www-svg@w3.org>
On Apr 1, 2009, at 0:17, Henri Sivonen <hsivonen@iki.fi> wrote:

>> How do you feel about my proposal in
>>
>> http://lists.w3.org/Archives/Public/public-html/2009Mar/0634.html
>>
>> It would result in a graded surprise where there's some change
>> between HTML <script> parsing between HTML4 and HTML5, and some
>> surprise in the boundary between SVG-in-HTML and SVG-in-XML.
>
> If this happened in the parser, it would result in <![CDATA[ ... ]]>
> in text/html parsing differently from both XML and previous text/html
> behavior. I think that could be confusing to authors who try to form
> a coherent mental model of the languages they are working with.

No, no changes are intended on the parser side for HTML.

> However, if <![CDATA[ ... ]]> remains in the DOM and is only stripped
> from the data in the JavaScript parser or the CSS parser, I suppose
> that model could count as coherent with the current <!-- -->
> treatment model for script and style in text/html.

Yes, that is the idea.

>>>>> Problems with 2:
>>>>> Just stripping a heading and trailing "<![CDATA[" / "]]>" would
>>>>> break markup like:
>>>>> <style>
>>>>> <![CDATA[
>>>>> rect { fill: yellow; }
>>>>> ]]>
>>>>> <![CDATA[
>>>>> circle { fill: blue; }
>>>>> ]]>
>>>>> </style>
>>>>>
>>>>> which probably happens occasionally due to copy-n-pasting.
>>>
>>> I don't like this, because it requires going back and modifying
>>> buffers that had been already built instead of just tweaking
>>> forward-only tokenizer state transitions, and it doesn't even work
>>> in the case where there are multiple CDATA sections as shown above.
>>> If we end up doing something other than what's currently in the
>>> draft, I'd much rather have what Simon proposes as #4.
>>
>> The stripping doesn't happen at a tokenizer stage. It happens after
>> all parsing is done when the inline data is taken from the DOM and
>> passed to the serializer.
>
> Do you mean passed to the script engine?

Yes, thanks.

> So the string "<![CDATA[" would appear in the content of the text
> node in the DOM?

Yes.

> I initially thought you meant removing "<![CDATA[" and "]]>" in the
> tree builder.

No.

> What about <![CDATA[ in SVG subtrees outside <script> and <style>?
> It's useful for graceful degradation but still involves feedback to
> the tokenizer unless supported anywhere outside foreign content as
> well.

I think that is mostly an orthogonal issue. But I would like
<![CDATA[ ]]> to be parsed as in XML both in foreign content mode and
in normal mode, to keep things consistent. Opera has experimented with
supporting <![CDATA[ ]]> in HTML, and it seems it does not "break the
web".

/ Jonas
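
To make the model discussed above concrete, here is a rough sketch in Python. It is not the algorithm from the linked proposal or from any browser. The function name strip_outer_cdata and the whitespace handling are assumptions made up for illustration. It shows the text node in the DOM keeping its "<![CDATA[" / "]]>" markers while only the copy handed to the CSS parser (or script engine) is stripped, and it shows why stripping just one leading and one trailing marker falls short on the two-section <style> example quoted above.

```python
CDATA_OPEN = "<![CDATA["
CDATA_CLOSE = "]]>"


def strip_outer_cdata(text: str) -> str:
    """Hypothetical helper: drop one leading "<![CDATA[" and one trailing
    "]]>" (ignoring surrounding whitespace) from inline script/style data.

    Returns the input unchanged if it is not wrapped in both markers.
    """
    stripped = text.strip()
    if stripped.startswith(CDATA_OPEN) and stripped.endswith(CDATA_CLOSE):
        return stripped[len(CDATA_OPEN):-len(CDATA_CLOSE)]
    return text


# Text content of the <style> element as it would sit in the DOM,
# markers included (the two-section example from the quoted message).
dom_text = """
<![CDATA[
rect { fill: yellow; }
]]>
<![CDATA[
circle { fill: blue; }
]]>
"""

# Only the copy passed on to the CSS parser is stripped; dom_text itself
# is left as it was.
for_css_parser = strip_outer_cdata(dom_text)

# The outer markers are gone, but the inner "]]>" and "<![CDATA[" between
# the two rules are still present in what the CSS parser would receive.
print(for_css_parser)
```

With a single CDATA section the helper returns just the style rules. With two sections the markers in the middle survive, which is the breakage described in "Problems with 2" above.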
Received on Wednesday, 1 April 2009 07:37:57 UTC