RE: [webcomponents] More backward-compatible templates

These interesting thought experiments have solidified my view that <template>, as currently described in the spec, is the right approach.

From: Adam Barth [mailto:w3c@adambarth.com]
Sent: Thursday, November 1, 2012 5:15 PM
To: Hajime Morrita
Cc: Maciej Stachowiak; Anne van Kesteren; public-webapps@w3.org WG
Subject: Re: [webcomponents] More backward-compatible templates

On Thu, Nov 1, 2012 at 8:42 AM, Hajime Morrita <morrita@google.com> wrote:
A naive proposal: Can we introduce an alias of <script> element, like <scr>, and ask template authors to use <scr> instead of <script> inside <script template>?  Since <scr> is just an alias of <script>, authors can use it even outside <script template>.

The problem is then that you can't have doubly nested templates because <scr> wouldn't be able to nest inside itself.

Adam


I guess it won't confuse either the tokenizer or the existing parser, and it will be polyfill-friendly. It isn't as clean as <template>, but not as ugly as <+script> IMO. Obviously "scr" sounds ugly, so it would be great if there were some unused, sensible name for it.



On Thu, Nov 1, 2012 at 3:14 PM, Adam Barth <w3c@adambarth.com> wrote:


On Thu, Nov 1, 2012 at 6:33 AM, Maciej Stachowiak <mjs@apple.com> wrote:

On Nov 1, 2012, at 1:57 PM, Adam Barth <w3c@adambarth.com> wrote:




(5) The nested template fragment parser operates like the template fragment parser, but with the following additional difference:
     (a) When a close tag named "+script" is encountered which does not match any currently open script tag:

Let me try to understand what you've written here concretely:

1) We need to change the "end tag open" state to somehow recognize "</+script>" as an end tag rather than as a bogus comment.
2) When the tree builder encounters such an end tag in the ???? state(s), we execute the substeps you've outlined below.

The problem with this approach is that nested templates parse differently than top-level templates.  Consider the following example:

<script type=template>
 <b
</script>

In this case, none of the nested template parser modifications apply and we'll parse this as normal for HTML.  That means the contents of the template will be "<b" (let's ignore whitespace for simplicity).

<script type=template>
  <h1>Inbox</h1>
  <script type=template>
    <b
  </+script>
</script>

Unfortunately, the nested template in this example parses differently than it did when it was a top-level template.  The problem is that the characters "</+script>" are not recognized by the tokenizer as an end tag because they are encountered by the nested template fragment parser in the "before attribute name" state.  That means they get treated as some sort of bogus attributes of the <b> tag rather than as an end tag.
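To make the tokenizer behavior described above concrete, here is a rough sketch in JavaScript of the few tokenizer states involved. It is a toy model, not the spec algorithm: the function name tokenizeStartTag is made up for illustration, attribute values are not handled, and parse errors are not reported.

```javascript
// Toy model of the "tag name", "before attribute name", "attribute name" and
// "after attribute name" tokenizer states. Assumption: valueless attributes
// only; "=" and quoted values are out of scope for this sketch.
function tokenizeStartTag(input) {
  const isSpace = (c) =>
    c === " " || c === "\n" || c === "\t" || c === "\f" || c === "\r";
  const endsName = (c) => isSpace(c) || c === "/" || c === ">";

  let i = 1; // skip the leading "<"
  let name = "";
  // "tag name" state: runs until whitespace, "/" or ">"
  while (i < input.length && !endsName(input[i])) {
    name += input[i++].toLowerCase();
  }

  const attributes = [];
  // Everything up to the next ">" is now attribute territory.
  while (i < input.length && input[i] !== ">") {
    if (isSpace(input[i]) || input[i] === "/") {
      i++; // whitespace and a stray "/" do not start an attribute
      continue;
    }
    // "attribute name" state: any other character, including "<",
    // simply becomes part of an attribute name.
    let attr = "";
    while (i < input.length && !endsName(input[i])) {
      attr += input[i++].toLowerCase();
    }
    attributes.push(attr);
  }
  return { name, attributes };
}

// The nested template body from the example above.
const tag = tokenizeStartTag("<b\n  </+script>");
console.log(tag.name, tag.attributes);
// name is "b"; attributes are ["<", "+script"]
```

In this model the characters "</+script>" never reach the "end tag open" state at all. They are consumed as two valueless attributes of the <b> start tag, which is the mismatch with the top-level case.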

OK. Do you believe this to be a serious problem? I feel like inconsistency in the case of a malformed tag is not a very important problem, but perhaps there are cases that would be more obviously problematic, or reasons not obvious to me to be very concerned about cases exactly like this one.

It's going to lead to subtle parsing bugs in web sites, which usually means security vulnerabilities.  :(

Also: can you think of a way to fix this problem? Or alternately, do you believe it's fundamentally not fixable? I've only spent a short amount of time thinking about this approach, and I am not nearly as much an expert on HTML parsing as you are.

I definitely see the appeal of trying to re-use <script> for templates.  Unfortunately, I couldn't figure out how to make it work sensibly with nested templates, which is why I ended up recommending that we use the <template> element.

Another approach we considered was to separate out the "hide from legacy user agents" and the "define a template" operations.  That approach pushes you towards a design like

<xmp>
  <template>
    <h1>Inbox</h1>
    <template>
      <h2>Folder</h2>
    </template>
  </template>
</xmp>

You could do the same thing with <script type=something>, but <xmp> is shorter (and currently unused).  This approach has a bunch of disadvantages, including being verbose and having some unexpected parsing:

<xmp>
  <template>
    <div data-foo="<xmp>bar</xmp>">
      This text is actually outside the template!
    </div>
  </template>
</xmp>

The <script type=template> has similar problems, of course:

<script type=template>
  <div data-foo="<script>bar</script>">
    This text is actually outside the template!
  </div>
</script>

Perhaps developers have a clearer understanding of such problems from having to escape </script> in JavaScript?
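A small JavaScript sketch of why both examples end early: the contents of <script> and <xmp> are scanned for the first matching end tag, and quotes mean nothing to that scan. This is a simplified model under stated assumptions: rawTextContent is a hypothetical helper, the opening tag is assumed to contain no ">" inside an attribute, and the escaped "<!--" script data states are ignored.

```javascript
// Simplified model of raw-text scanning: the element's contents end at the
// first "</tag" that is followed by whitespace, "/" or ">".
// Assumption: the escaped ("<!--") script data states are not modelled.
function rawTextContent(html, tag) {
  const bodyStart = html.indexOf(">") + 1; // just past the opening tag
  const body = html.slice(bodyStart);
  const endTag = new RegExp("</" + tag + "[\\t\\n\\f\\r />]", "i");
  const match = endTag.exec(body);
  return match ? body.slice(0, match.index) : body;
}

const source = [
  "<script type=template>",
  '  <div data-foo="<script>bar</script>">',
  "    This text is actually outside the template!",
  "  </div>",
  "</script>",
].join("\n");

console.log(rawTextContent(source, "script").trim());
// prints: <div data-foo="<script>bar
```

The same helper called with "xmp" on the <xmp> variant stops at the "</xmp>" inside the attribute value, for the same reason.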

All this goofiness eventually convinced me that if we want to support nested templates, we ought to use the usual nesting mechanics of HTML, which leads to a design like <template> that nests like a normal tag.

         (a.i) Consume the token for the close tag named "+script".
         (a.ii) Create a DocumentFragment containing the parsed contents of the fragment.
         (a.iii) [return to the parent template fragment parser] with the result of step (a.ii) with the parent parser to resume after the "+script" close tag.


This is pretty rough and I'm sure I got some details wrong. But I believe it demonstrates the following properties:
(B) Allows for perfect fidelity polyfills, because it will manifestly end the template in the same place that an unaware browser would close the <script> element.
(C) Does not require multiple levels of escaping.
(A) Can be implemented without changes to the core HTML parser (though you'd need to introduce a new fragment parsing mode).

I suspect we're quibbling over "no true Scotsman" semantics here, but you obviously need to modify both the HTML tokenizer and tree builder for this approach to work.

In principle you could create a whole separate tokenizer and tree builder. But obviously that would probably be a poor choice for a native implementation compared to adding some flags and variable behavior. I'm not even necessarily claiming that all the above properties are advantages, I just wanted to show that there need not be a multi-escaping problem nor necessarily scary complicated changes to the tokenizer states for <script>.

I think the biggest advantage to this kind of approach is that it can be polyfilled with full fidelity. But I am in no way wedded to this solution and I am intrigued at the mention of other approaches with this property. The others I know of (external source only, srcdoc like on iframe) seem clearly worse, but there might be other better ones.

The xmp-like wrapper also can be polyfilled, as can approaches based on HTML comments or attributes.  There's a trade-off, however.  In the long view, it's not clear to me how important polyfillability is for a feature.  It certainly makes adoption easier in the short term, but if we constrain ourselves to designing only features that can be polyfilled at each step, we'll end up with a contorted platform.

(D) Can be implemented with near-identical behavior for XHTML, except that you'd need an XML fragment parser.

The downside is that nested templates don't parse the same as top-level templates.

Indeed. That is in addition to the previously conceded downside that the syntax is somewhat less congenial.


Another issue is that you've also introduced the following security risk:

Today, the following line of JavaScript is safe to include in an inline script tag:

var x = "</+script><img onerror=alert(1)>";

Because that line does not contain "</script>", the string "alert(1)" will be treated as the contents of a string.  However, if that line is included in an inline script inside of a template, the modifications to the parser above will mean that alert(1) will execute as JavaScript rather than being treated as a string, introducing an XSS vector.

I don't follow. Can you give a full example of how this would be included in a template and therefore be executed?

<script>
x =  "</+script><img onerror=alert(1)>"; // This is safe, there is no script execution.
</script>

<script type=template id=a>
  <script>
    x =  "</+script><img onerror=alert(1)>"; // This is not safe, the alert(1) executes as script.
  </+script>
</script>

<script>
document.body.appendChild(document.getTemplateById("a").instantiate());
</script>

You should imagine, of course, the string not being written literally by the developer but instead generated on the server side by some code that knows how to escape strings for use in inline script tags (e.g., by escaping "\" and "</script").  It's certainly possible to defend against this XSS vector on the server, but it's one more XSS vector to worry about.

I hope this clarifies the proposal.

Notes:
- Just because it's described this way doesn't mean it has to be implemented this way - implementations could do template parsing in a single pass with HTML parsing if desired. I wrote it this way mainly to demonstrate the desired properties.

I'm not sure how we'd be able to do that without running multiple copies of the tokenizer state machine in parallel.  The tokenizer states for the template fragment parser aren't going to line up in any meaningful way with the top-level tokenizer's search for an appropriate end tag to escape from the script data states.

I'm pretty confident it's *possible* to do a one-pass version of this algorithm, but I am not sure if it is easy, or if it is desirable.

What I actually imagined (knowing much less about HTML parsing than you) was that you'd enter a different tokenizer state after encountering a <script template> than a <script>. But defining that state would be challenging.

Sure, but you're going to need a 17x bigger tokenizer state machine because you'll need to track all 17 script data states for the top-level tokenizer at the same time as you're tracking all the states for the template fragment tokenizer.

Adam




--
morrita

Received on Thursday, 1 November 2012 17:17:47 UTC