[Bug 14041] inconsistent definitions of safe content for scripts.

http://www.w3.org/Bugs/Public/show_bug.cgi?id=14041

Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |xn--mlform-iua@xn--mlform-i
                   |                            |ua.no

--- Comment #1 from Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no> 2011-09-06 16:56:43 UTC ---
(In reply to comment #0)

I agree w.r.t. '--'. The situations in which '--' (and '-->') inside
<script>/<style> is potentially harmful are already considered non-conforming
by HTML5 itself. Hence '--' is "unsafe" (in some sense) even in plain HTML5.
Therefore I agree that it does not make sense to mention '--' in *this*
definition of "unsafe". But I think 'unsafe' is perhaps not the most telling
word. How about simply 'not polyglot'?

   ...snip...
> As a definition of "safe content" I think
> 
> Content is not "safe" if it contains (after any xml or html entity or character
> references are expanded) the characters < or & or the substring ]]>

The phrase "after any xml or html entity or character references are expanded"
is quite confusing. It is clear that XML's "expansionism" is the reason why
there is a problem. However, it sounds, for instance, as if you are saying
that ]]&gt; is dangerous ... And it sounds as if it were somehow possible to
avoid expansion in XML - is it? I would like to propose the following as more
hands-on and correct:

   NEW DEFINITION PROPOSAL:
"""
   A <script> or <style> is not considered polyglot (that is:
   the XML interpretation will differ from the HTML
   interpretation) if it contains:
      1) any <  (this would begin a tag in XML only)
      2) any &  (this would begin a reference/entity in XML only)
      3) any ]]> (this would be seen as a CDATA end in XML only)
   NOTE:
      * Point 1) means that '<!--' and '<![CDATA[' inside
        script and style are not polyglot.
      * Point 2) means that HTML entities, XML entities and
        character references inside script and style are not
        considered polyglot.
"""

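As a rough illustration of the proposed definition (a sketch only, not part
of the proposal itself), the three checks could be written in Python as
below. The function name polyglot_violations and the reason strings are made
up for this example, and it assumes the input is the raw text between the
<script> or <style> tags exactly as it appears in the source, before any
reference expansion.

```python
from typing import List


def polyglot_violations(content: str) -> List[str]:
    """Return the reasons why raw <script>/<style> content is not polyglot.

    `content` is assumed to be the source text between the start and end
    tags, with no entity or character references expanded.
    An empty list means none of the three proposed rules is violated.
    """
    reasons = []
    # Point 1: '<' would begin a tag (or comment, or CDATA section) in XML only.
    if "<" in content:
        reasons.append("contains '<'")
    # Point 2: '&' would begin an entity or character reference in XML only.
    if "&" in content:
        reasons.append("contains '&'")
    # Point 3: ']]>' would be seen as a CDATA end in XML only.
    if "]]>" in content:
        reasons.append("contains ']]>'")
    return reasons


if __name__ == "__main__":
    for sample in ["var x = 1;", "if (a < b) {}", "a ]]> b", "]]&gt;"]:
        print(repr(sample), polyglot_violations(sample))
```

With this reading, ]]&gt; is reported because of its '&' (point 2), not
because of point 3, since the check looks at the source text rather than at
the expanded text.
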

Received on Tuesday, 6 September 2011 16:56:53 UTC