- From: Mike Samuel <mikesamuel@gmail.com>
- Date: Fri, 8 Mar 2013 16:28:27 -0500
- To: Adam Barth <w3c@adambarth.com>
- Cc: Anne van Kesteren <annevk@annevk.nl>, Rick Waldron <waldron.rick@gmail.com>, Adam Klein <adamk@chromium.org>, Ojan Vafai <ojan@chromium.org>, Brendan Eich <brendan@secure.meer.net>, Ian Hickson <ian@hixie.ch>, "rafaelw@chromium.org" <rafaelw@chromium.org>, Alex Russell <slightlyoff@chromium.org>, "public-script-coord@w3.org" <public-script-coord@w3.org>, "Mark S. Miller" <erights@google.com>
2013/3/8 Mike Samuel <mikesamuel@gmail.com>:
> 2013/3/8 Adam Barth <w3c@adambarth.com>:
>> tl;dr: No one is disputing that string templates as currently designed
>> are insecure by default and will lead authors to write code filled
>> with XSS vulnerabilities. I recommend removing string templates from
>> the spec until these security issues are resolved.
>
> I oppose this on the grounds that it is better than current ad-hoc
> content creation practices, and can lead to a principled solution in a
> way that AST approaches cannot.
>
>> (Consolidating replies---responses inline.)
>>
>> On Thu, Mar 7, 2013 at 6:36 PM, Rick Waldron <waldron.rick@gmail.com> wrote:
>>> On Thu, Mar 7, 2013 at 9:15 PM, Adam Barth <w3c@adambarth.com> wrote:
>>>> Linking to a thousand-line JavaScript library as evidence that string
>>>> templates can be used securely pretty much proves my point: it's hard
>>>> to use string templates securely. That means that most authors won't
>>>> use them securely and will write code that's full of XSS.
>>>
>>> I'd like to kindly ask that you stop approaching this conversation as though
>>> browsers and the web are the only client of the EcmaScript specification.
>>> The language serves to provide primitives that can be used to compose higher
>>> level abstractions, eg. DOM APIs with whatever level of security the domain
>>> problem requires.
>>
>> That's a nice strawman, but I'm not approaching this conversation as
>> though browsers were the only clients of ECMAScript. What I'm saying
>> is that the current design is insecure when used in browsers and
>> because browsers are a large user of ECMAScript, we shouldn't include
>> a language feature that gives web authors a giant security footgun.
>>
>> On Thu, Mar 7, 2013 at 7:40 PM, Mark S. Miller <erights@google.com> wrote:
>>> Hi Ian, this seems a misunderstanding or non-sequitur.
>>> Mike and Rick's point is not to compromise, it is to do something solid
>>> and general purpose, to avoid injection bugs in a variety of DSL
>>> scenarios, not just HTML. Even in the browser, JS is sometimes used to
>>> compose SQL that is sent to the server. It isn't the browser's business
>>> to understand SQL, but we can provide a mechanism that is as useful for
>>> SQL, again, without compromise.
>>
>> String templates, as currently designed, are bad for constructing SQL
>> statements too. When used in their default mode (which is the most
>> common way that authors will use them), they lead to SQL injection
>> vulnerabilities. Instead, we should use an approach analogous to
>> prepared statements, which are much less likely to lead to SQL
>> injection.
>
>>> Adam, I think you miss the point of Mike's message rather completely. This
>>> thousand line JS library has to be done for HTML once, not once per usage.
>>> It is complicated because HTML is complicated. And the amount of code
>>> compares quite favorably to the browser's HTML implementation, which is much
>>> more security critical than this. In any case, if the HTML quasi-parser is
>>> provided by the browser platform as standard equipment, it can probably
>>> reuse some of the browser's existing mechanisms, to help keep these two HTML
>>> systems in sync.
>>
>> It doesn't matter how many times the library needs to be authored or
>> by whom. If we need a thousand lines of JavaScript to compensate for
>> the by-design insecurity of string templates, then we've failed as
>> language designers. Instead, ECMAScript should have a templating
>> system that is secure-by-design and by default instead of
>> insecure-by-design-and-default-but-can-be-patched-with-a-thousand-line-library.
>
> You simply do not have the power to force people to write secure code by fiat.
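
For illustration, the prepared-statement analogy quoted above can be expressed with a tagged template. This is a minimal sketch, not anything proposed in the thread: the `sql` tag name, the `{ text, params }` result shape, and the PostgreSQL-style `$1` placeholders are all assumptions.

```javascript
// Hypothetical `sql` tag: builds a parameterized query instead of a string.
// Substituted values never become part of the statement text.
function sql(strings, ...values) {
  let text = strings[0];
  for (let i = 0; i < values.length; i++) {
    // Assumption: the database driver accepts $1, $2, ... placeholders.
    text += '$' + (i + 1) + strings[i + 1];
  }
  return { text, params: values };
}

const name = "Robert'); DROP TABLE students;--";
const query = sql`SELECT * FROM students WHERE name = ${name}`;

console.log(query.text);
console.log(query.params);
```

The untagged form of the same template would splice `name` directly into the statement text, which is the default-mode behaviour Adam objects to.
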
>
> JavaScript will have string concatenation via the (+) operator
> regardless of whether string templates are in the language or not and
> whether or not E4H or some other AST approach were specced and ready to
> ship.
>
> A language feature only contributes to security if it's used instead
> of insecure methods.
>
> If we are to make a dent, as I believe we both want to, in the morass of
> XSS produced by the PHP set, then we need
> 1. something that is easy to migrate to from ad-hoc approaches,
> preferably piecemeal.
> 2. something that can provide provable guarantees
> 3. something that is syntactically more attractive (e.g. due to
> expressiveness, succinctness)
>
>
>>> As for whether the output of the HTML quasi-parser is an AST or an encoded
>>> string, that is up to the quasi-parser designer. The quasi-literals in E
>>> generally generated ASTs. Mike convinced me he can generate encoded strings
>>> directly as safely and faster, if the point is to eventually produce an
>>> encoded string. I'm happy either way. Both decisions are perfectly
>>> compatible with the design of quasis, er, template strings, as specced in
>>> draft ES6.
>>
>> That's nice, but the default mode for string templates works for HTML
>> but is insecure. That means authors will write code filled with XSS
>> because they'll just use the default mode.
>
> Unless the default mode can be overridden within a scope via a single
> line of code.
>
> I have experience with changing the semantics of a template language
> with a large extant codebase (Google+) to use contextual
> auto-escaping.
>
>> What you've written in this paragraph is even more scary. You're
>> saying that string templates are so poorly designed that they guided
>> you, a world-renowned security expert, into using an extremely complex
>> (and therefore unlikely to be secure) design. Surely authors who are
>> not world-renowned security experts will fare even worse.
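
To make the "default mode" disagreement concrete, here is a small sketch contrasting an untagged template with a tagged one. The `html` tag and `escapeHtmlText` helper are hypothetical names, and the tag is deliberately naive: it escapes every substitution as HTML text and has no notion of attribute, URL, or script context, which is the part the thousand-line library discussed above is meant to handle.

```javascript
// Hypothetical helper: escapes a value for use as HTML text content.
function escapeHtmlText(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Hypothetical tag: receives the literal chunks and the substituted
// values separately, so it can escape only the values.
function html(strings, ...values) {
  let out = strings[0];
  for (let i = 0; i < values.length; i++) {
    out += escapeHtmlText(values[i]) + strings[i + 1];
  }
  return out;
}

const firstName = '<img src=x onerror=alert(1)>';

// Default (untagged) mode: plain interpolation, markup passes through.
const unsafe = `<h1>Welcome ${firstName}!</h1>`;

// Tagged mode: the attacker-controlled value is escaped.
const safer = html`<h1>Welcome ${firstName}!</h1>`;

console.log(unsafe);
console.log(safer);
```

The two call sites differ only by the tag name, which is roughly what makes piecemeal migration from ad-hoc string building plausible, and also why the untagged form remains easy to reach for.
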
>
> That you have not made an effort to understand the approach does not
> mean that it is "extremely complex" but even if you did engage with
> our actual arguments, this would still be wrong.
> Any AST approach that is going to work is going to similarly require
> encoders if it is to do any of the following:
> 1. Handle embedded content without rewriting the DOM to not use DOMString

I should point out that no AST approach baked into the browser can
comprehensively handle embedded content because of the way translation
hooks ( http://wiki.ecmascript.org/doku.php?id=harmony:module_loaders#translation_semantics )
work, so this idea of baking an AST approach into the browser is either
unsecurable or drastically limits the future evolution of the
language.

> 2. Allow content to be serialized for storage or cross-frame communication.
> 3. Be usable in the absence of a browser.
>
>
>> On Thu, Mar 7, 2013 at 7:57 PM, Jonas Sicking <jonas@sicking.cc> wrote:
>>> On Thu, Mar 7, 2013 at 5:55 PM, Mike Samuel <mikesamuel@gmail.com> wrote:
>>>> That doesn't apply since this is not parsing, it is lexing, and
>>>> regular expressions can be used to lex HTML.
>>>
>>> Actually, no you can't. For example the lexing of contents of <script>
>>> elements is quite complex.
>>
>> It's mathematically impossible. You need a stack to keep track of the
>> foreign content mode (i.e., whether we're tokenizing HTML, SVG, or
>> MathML). Without that information, you can't tell how the tokenizer
>> will parse apparent CDATA sections.
>
>
>> On Thu, Mar 7, 2013 at 8:36 PM, Mike Samuel <mikesamuel@gmail.com> wrote:
>>> I talk about different kinds of developers (library authors,
>>> application authors) writing code and you say things that suggest to
>>> me that you think the bulk of web developers are going to be writing
>>> large amounts of security-critical code.
>>> In your view, who is writing what code with the string templates approach?
>>
>> It doesn't matter who writes the thousand-line library. The fact that
>> you need a thousand-line library to use string templates securely
>> (even assuming that the library is correct!) demonstrates that the
>> design itself is insecure and should not be part of ECMAScript.
>> Instead, we should design a templating system that doesn't need a
>> thousand-line library to be used securely.
>
> Why should we deploy such a secure templating system as part of the
> language and not as a library?
> What if we spec something secure and get it wrong?
>
> What you keep calling a 1k line library is a strawman too. You ignore
> that I am actually advocating a grammar-driven approach. You can
> dismiss that without taking the effort to understand it, but you
> cannot do that and claim that an insecurable AST approach is
> preferable to it.
>
>
>>> Under the AST model, who is writing what code? What portion of an AST
>>> approach needs to involve spec-producing committees?
>>
>> I'm not advocating E4H, but as an example, in E4H no one needs to
>> write a thousand-line library. The spec itself is two printed pages:
>
>> http://www.hixie.ch/specs/e4h/strawman
>>
>> I'm not claiming that E4H is secure in all cases. I'm just claiming
>> that the "hello, world" template is secure by default. For string
>> templates, the "hello, world" template is XSS.
>
> E4H is simple and wrong. It does not deal with embedded languages.
> It also is not a good migration target for existing code.
>
>>> What is your exemplar of the AST model (if not Yesod) and what is your
>>> plan to cause the bulk of web programmers to do things using an AST
>>> approach instead of using ad-hoc string approaches?
>>
>> Personally, my favorite AST-style template system is Haml because the
>> templates themselves are beautiful. I don't think we should include
>> Haml in ECMAScript as such because Haml has a bunch of Ruby-isms (e.g.,
>> self-quoting strings for attribute names).
>> I believe we could come up
>> with something similar to Haml that felt like a natural extension of
>> ECMAScript.
>
>> The above paragraph is somewhat off topic. At the moment, I'm arguing
>> that we should remove string templates from the spec because they are
>> insecure. Once we do that, we can have a discussion about what to
>> replace them with. (I imagine that discussion will take a fair bit of
>> time since there are many details that people will want to bikeshed.)
>
> My argument is that we need to produce minimal language extension
> points to allow experimentation by security researchers.
>
>
>> On Thu, Mar 7, 2013 at 9:59 PM, Jonas Sicking <jonas@sicking.cc> wrote:
>>> I thought that one of the points with quasis was that they would allow the
>>> above to be interpreted such that firstName and lastName were inserted as
>>> text content. I.e. the quasi handler could avoid parsing the contents of
>>> those values as HTML and instead just insert them as text content.
>>
>> In my example, I used string templates (aka quasis) in their default
>> mode, which is insecure, hence my claim that string templates are
>> insecure by default. Mike Samuel claims that he has written a
>> safeHTML quasis handler, which takes about a thousand lines of
>> JavaScript, hence my claim that string templates are difficult to use
>> securely.
>
>
>
>>> This would mean that the HTML quasi would by default be resilient against
>>> HTML-injection.
>>
>> Even if we had a secure HTML quasi handler, the HTML quasi handler
>> would not be the default handler. That means the templating system is
>> insecure by default.
>>
>>> To supplement this behavior we could allow the quasi to take special values
>>> which would be passed to the HTML parser "like normal" and thus be parsed.
>>> I.e.
>>> something like
>>>
>>> HTML`<h1>Welcome ${ asUnsafeHTML(firstName) } ${ lastName }!</h1>`
>>>
>>> In this case the asUnsafeHTML function would return an object which was
>>> recognized by the HTML quasi as "should be parsed" and would contain a
>>> property which holds the string value passed as the first argument.
>>>
>>> Since no parsing would take effect at the asUnsafeHTML callsite, and instead
>>> would happen while the rest of the quasi was parsed, all of the normal
>>> contextual parsing rules would apply.
>>>
>>> This way the quasi should by default be as safe as an AST template system,
>>> while allowing the page to opt in to more featureful, less safe
>>> templating.
>>>
>>> We could even provide functions like asSafeHTML which would trigger the
>>> quasi to parse that piece of content using rules that permit only "safe"
>>> elements.
>>
>> None of the above solves the problem that string templates as
>> currently designed are insecure by default and will lead authors to
>> write code filled with XSS vulnerabilities.
>>
>> Adam
>>
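
A rough sketch of the opt-in marker Jonas describes above, under stated assumptions: the `UnsafeHtml` wrapper class, its `markup` property, and the `escapeText` helper are invented here for illustration, and this `HTML` handler returns a string and only understands text context. The proposal in the thread has the marked value go through the HTML parser with the rest of the quasi, which this sketch does not attempt.

```javascript
// Hypothetical wrapper object that the handler recognizes as "should be parsed".
class UnsafeHtml {
  constructor(markup) {
    this.markup = String(markup);
  }
}

function asUnsafeHTML(markup) {
  return new UnsafeHtml(markup);
}

// Hypothetical helper: escapes a value for use as HTML text content.
function escapeText(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Hypothetical handler: escapes by default, passes marked values through.
function HTML(strings, ...values) {
  let out = strings[0];
  for (let i = 0; i < values.length; i++) {
    const value = values[i];
    out += value instanceof UnsafeHtml ? value.markup : escapeText(value);
    out += strings[i + 1];
  }
  return out;
}

const firstName = '<b>Ann</b>';
const lastName = '<i>Lee</i>';
const greeting = HTML`<h1>Welcome ${ asUnsafeHTML(firstName) } ${ lastName }!</h1>`;

console.log(greeting);
```

Here the unmarked `lastName` is escaped while the explicitly marked `firstName` is not, so the unsafe path is visible at the call site rather than being the default.
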
Received on Friday, 8 March 2013 21:29:00 UTC