[Bug 20201] polyglot markup and extensions via <script> (and <style>) from bugzilla@jessica.w3.org on 2013-05-24 (public-html-bugzilla@w3.org from May 2013)

From: <bugzilla@jessica.w3.org>
Date: Fri, 24 May 2013 14:35:42 +0000
To: public-html-bugzilla@w3.org
Message-ID: <bug-20201-2486-341eZSrXBD@http.www.w3.org/Bugs/Public/>
https://www.w3.org/Bugs/Public/show_bug.cgi?id=20201

Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
         Resolution|---                         |WORKSFORME

--- Comment #10 from Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no> ---
(In reply to comment #9)
> We still haven't seen strong use-case for allowing CDATA which will
> overweight complex CDATA escaping machinery.


EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are
satisfied with this response, please change the state of this bug to CLOSED. If
you have additional information and would like the Editor to reconsider, please
reopen this bug. If you would like to escalate the issue to the full HTML
Working Group, please add the TrackerRequest keyword to this bug, and suggest
title and text for the Tracker Issue; or you may create a Tracker Issue
yourself, if you are able to do so. For more details, see this document:

   http://dev.w3.org/html5/decision-policy/decision-policy-v3.html

Status: rejected

Rationale: In comment #8, you bring in use cases versus what you call 'complex
CDATA escaping machinery'. This is new information, to which I will hereby
reply:

The intent of allowing CDATA is 

  A) to *decrease* the need to perform complex escaping 
  B) to allow syntax that is forbidden/impossible without CDATA

    **WHY CDATA IS A SIMPLIFICATION**

CDATA adds *mental* complexity, but coding-wise it is a simplification. Within
the CDATA section, authors can code according to the exact same rules that
apply when they create "monoglot" text/HTML pages.

Without CDATA, one must use e.g. JavaScript character escapes instead of HTML
character escapes (character entities) - or not use escapes at all (which is
best - and which actually is a justification for allowing CDATA!). Thus,
without CDATA, authors of polyglot markup are *forced* to handle two escape
mechanisms, whereas with CDATA they can often deal with just one. And this, in
my view, refutes your complaint about complex CDATA escaping.
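
As a rough illustration of the extra escaping step that authors face without
CDATA, here is a minimal sketch. The helper name toJsEscapes is hypothetical
and not taken from any specification. It rewrites '<' and '&' as JavaScript
\uXXXX escapes, which is only meaningful inside string literals; operators
such as < or && cannot be escaped this way at all.

```javascript
// Hypothetical helper (an assumption for illustration): rewrite the two
// characters that are problematic in polyglot <script> content as
// JavaScript \uXXXX escapes. Only valid for text that will end up inside
// a JavaScript string literal.
function toJsEscapes(text) {
  return text.replace(/[<&]/g, function (ch) {
    // Build a four-digit uppercase hex code, e.g. "<" becomes "\u003C".
    const hex = ch.charCodeAt(0).toString(16).toUpperCase().padStart(4, "0");
    return "\\u" + hex;
  });
}
```

Inside a CDATA section, the same string could be written unescaped, exactly as
in a monoglot text/HTML page.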

David Carlisle has suggested that one could hide illegal scripts as data URIs
inside the @src attribute of the <script> element. And, true, this could work
- it can be a useful trick that one should not forget. But it is a much more
complex form of escaping machinery than providing the script inside CDATA.
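
To make the comparison concrete, the data URI trick could be sketched roughly
as follows. The function name scriptToDataUri and the chosen media type and
charset are assumptions for illustration, and support for data URIs in @src
may vary between user agents.

```javascript
// Hypothetical helper (an assumption for illustration): percent-encode a
// script so that it can be placed in the @src attribute as a data URI.
// encodeURIComponent turns "<" into %3C and "&" into %26, so the result
// contains neither of the characters that polyglot markup forbids.
function scriptToDataUri(source) {
  return "data:text/javascript;charset=utf-8," + encodeURIComponent(source);
}
```

The encoded form is no longer readable as JavaScript, which is the added
complexity referred to above.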


    **WHY CDATA HAS USE CASES THAT ARE IMPORTANT ENOUGH**

If one agrees that CDATA actually offers simplification, then the bar (that
is: the question of whether the use cases are important) for permitting CDATA
should obviously be lowered.

And when it comes to use cases, of greater concern than the simplification is
that, without CDATA, it is impossible to express all JavaScript within
<script>. E.g. strings like these - which are common in JavaScript - would be
impossible and thus forbidden:

    x && y
    x < y
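
A small sketch of how one might detect such content is given here. The helper
name findUnsafeChars is hypothetical, and the check assumes that '<' and '&'
are the only characters of concern.

```javascript
// Hypothetical helper (an assumption for illustration): list which of the
// characters "<" and "&" occur in a script. A non-empty result means the
// script cannot be placed inline in polyglot markup without CDATA.
function findUnsafeChars(source) {
  const found = [];
  for (const ch of source) {
    if ((ch === "<" || ch === "&") && !found.includes(ch)) {
      found.push(ch);
    }
  }
  return found;
}
```

Both of the strings above yield a non-empty result, whereas a rewritten
comparison such as "y > x" does not.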

I see no reason to believe that these native JavaScript strings are less
necessary in Polyglot than in monoglots. Secondly, the script and style
elements are extension points of HTML. And so for instance the AmpleSDK
(http://www.amplesdk.com) defines its own markup, such as
'application/ample+xml', which it inserts into <script>. This is taken from the
Hello World example of AmpleSDK:
        <script type="application/ample+xml">
            <b onclick="alertHelloWorld(event)">Hello, World!</b>
        </script>
And John Resig of jQuery fame uses the same technique for what he calls 'micro
templating', which he describes as a "super-simple templating function that
is fast, caches quickly, and is easy to use."
http://ejohn.org/blog/javascript-micro-templating/ The benefits that Resig
describes there are benefits also to users of polyglot markup.

Furthermore, Lachlan Hunt, who objected to the publication of Polyglot Markup,
has pointed out that the restriction against CDATA doesn't permit scripts to be
auto-generated because, as soon as the script contains an illegal character,
the page suddenly doesn't conform to polyglot markup anymore.
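
If CDATA were permitted, a generator could emit arbitrary scripts along these
lines. This is a minimal sketch: the function name wrapScriptForPolyglot is
hypothetical, and the wrapper assumes the common pattern of hiding the CDATA
markers behind JavaScript line comments, so that text/HTML parsers see them
as comments while XML parsers see a CDATA section.

```javascript
// Hypothetical generator step (an assumption for illustration): wrap an
// auto-generated script in a comment-hidden CDATA section.
function wrapScriptForPolyglot(source) {
  // "]]>" would end the CDATA section early in XML. It cannot be split
  // across two sections here, because a text/HTML parser would pass the
  // extra markers to the script engine as code.
  if (source.includes("]]>")) {
    throw new Error("script contains ']]>' and cannot be wrapped");
  }
  // "</script" would end the element early in text/HTML.
  if (/<\/script/i.test(source)) {
    throw new Error("script contains '</script' and cannot be wrapped");
  }
  return "<script>//<![CDATA[\n" + source + "\n//]]></script>";
}
```

With this, '<' and '&' in the generated script need no special treatment; only
the two rare sequences checked above remain to be avoided.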

    **CONCLUSIONS**

Recommending that authors keep scripts off page is good. And polyglot markup
does inherently recommend this, since it forbids certain characters - unless
one includes them inside CDATA. (The need to use CDATA is already often used as
a justification for using off-page scripts.) Monoglot markup does not have this
inherent encouragement.

However, inline stylesheets and scripts do have their use cases, and for
these, CDATA allows users to simplify both their stylesheets (to a small
degree) and their scripts (to a high degree), and it allows constructs that
would be impossible in polyglot markup without it. Thus, for the use case of a
stylesheet or a script that should be kept inside the page (rather than being
linked to in an external file), forbidding CDATA makes things very complicated
or even impossible.

Thus I consider that I have provided use cases and, thus, that your objection
has been answered. That said, if I find, or hear of, ways to explain CDATA more
simply, so that authors do not get the impression that it is a 'complex
machinery', then I will add them promptly. I am also willing to add info about
using data URIs for the escaping.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
Received on Friday, 24 May 2013 14:35:49 UTC