Fwd: [Bug 10901] New: Use same parsing for HTML <script> and SVG <script>

FYI.

-------- Original Message --------
Subject: [Bug 10901] New: Use same parsing for HTML <script> and SVG <script>
Date: Thu, 30 Sep 2010 20:43:50 +0000
From: bugzilla@jessica.w3.org
To: schepers@w3.org

http://www.w3.org/Bugs/Public/show_bug.cgi?id=10901

            Summary: Use same parsing for HTML <script> and SVG <script>
            Product: HTML WG
            Version: unspecified
           Platform: PC
         OS/Version: All
             Status: NEW
           Severity: normal
           Priority: P2
          Component: HTML5 spec (editor: Ian Hickson)
         AssignedTo: ian@hixie.ch
         ReportedBy: jonas@sicking.cc
          QAContact: public-html-bugzilla@w3.org
                 CC: schepers@w3.org, mike@w3.org,
                     public-html-wg-issue-tracking@w3.org,
                     public-html@w3.org, jwatt@jwatt.org


This is a topic that I raised a long time ago, but I have some new
information that makes it worth reconsidering.

The basic problem is this:
Currently a <script> element in non-"foreign content" (i.e. inside normal
HTML) is parsed significantly differently from a <script> element in
"foreign content" mode (for example inside <svg>).

This makes it harder to work with pages that contain a mix of "foreign
content" and non-"foreign content". If a <script> element is moved from
inside <svg> to the html <head>, it is likely to stop parsing correctly.
Similarly, moving or copying a small snippet of <script> from elsewhere in
the page to inside an <svg> will likely not work as the author intended.

This is despite the fact that, apart from parsing, <script> has basically
the same processing model in both cases. In fact, I argue that we should
work to make the processing models line up even more over time, for example
by adding 'defer' and 'async' to svg-script.


When I initially raised this request, it was rejected because Hixie had
heard from the SVG people that they wanted to strictly ensure that all SVG
content could be copied and pasted directly into HTML and be guaranteed to
work. However, aligning the parsing of <script> for "foreign content" and
non-"foreign content" would break this in a few rare edge cases.

Since then I have raised the question directly with the SVG WG at an F2F,
and it was agreed that the edge cases were likely rare enough that the
benefits of aligning the parsing models outweigh the disadvantages.


Here is what I propose:

When in the "Script data state" in the tokenizer, if the string "<![CDATA["
is found, transition to a new "Script data cdata state".

When in "Script data cdata state" in the tokenizer, emit all characters as
character tokens until EOF or the string "]]>" is found. If "]]>" is found,
switch to the "Script data state".

When in "in foreign content" insertion mode, when seeing a start tag token
named "script", put the tokenizer in "Script data state".
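
To make the two proposed states concrete, here is a rough sketch in Python.
It is only an illustration, not the spec's tokenizer: the function name and
return shape are made up for this example, the existing escaped states for
"<!--" are left out, and the end tag check is reduced to a case-insensitive
match on "</script". It also assumes that the "<![CDATA[" and "]]>"
delimiters themselves are consumed rather than emitted, which the proposal
above does not spell out.

```python
from enum import Enum, auto


class State(Enum):
    SCRIPT_DATA = auto()        # the existing "Script data state"
    SCRIPT_DATA_CDATA = auto()  # the proposed "Script data cdata state"


CDATA_OPEN = "<![CDATA["
CDATA_CLOSE = "]]>"
END_TAG = "</script"


def tokenize_script_data(text):
    """Tokenize the text that follows a <script> start tag.

    Returns (emitted_characters, unconsumed_rest). Hypothetical helper,
    heavily simplified compared to the real tokenizer.
    """
    state = State.SCRIPT_DATA
    emitted = []
    i = 0
    while i < len(text):
        if state is State.SCRIPT_DATA:
            if text.startswith(CDATA_OPEN, i):
                # Proposed transition into the new cdata state.
                # Assumption: the delimiter is dropped, not emitted.
                state = State.SCRIPT_DATA_CDATA
                i += len(CDATA_OPEN)
            elif text[i:i + len(END_TAG)].lower() == END_TAG:
                # Simplified end tag detection; stop and hand back the rest.
                break
            else:
                emitted.append(text[i])
                i += 1
        else:
            if text.startswith(CDATA_CLOSE, i):
                # "]]>" switches back to the script data state.
                state = State.SCRIPT_DATA
                i += len(CDATA_CLOSE)
            else:
                # Everything else is emitted as-is until "]]>" or EOF,
                # including anything that looks like an end tag.
                emitted.append(text[i])
                i += 1
    return "".join(emitted), text[i:]
```

With this sketch, the same script body tokenizes the same way whether the
<script> start tag was seen in HTML or "in foreign content", since both
would start the tokenizer in the "Script data state".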



There are probably some bugs in the above, but I hope you get the basic idea?


Received on Thursday, 30 September 2010 21:36:35 UTC