Re: Re: E4H and constructing DOMs

On Thu, Mar 7, 2013 at 5:02 PM, Mike Samuel <mikesamuel@gmail.com> wrote:
[...]

> On Thu, 7 Mar 2013 Adam Barth said

[...]

> > var firstName = [...];
> > var lastName = [...];
> > header.innerHTML = `<h1>Welcome ${ firstName } ${ lastName }!</h1>`;
> >
> > If firstName and lastName are user-controlled (i.e., untrusted),
> > the above is an XSS vulnerability.  For example, the attacker can set
> > firstName to "<img onerror='alert(/pwned/)'>".
>
> I strongly agree that safety should be the default.
>
> I would very much like the default to be overridable to be a late
> binding producer of string like values that distinguishes trusted
> substrings so that they can be auto-escaped based on context as
> described at
> http://google-caja.googlecode.com/svn/changes/mikesamuel/string-interpolation-29-Jan-2008/trunk/src/js/com/google/caja/interp/index.html


Unfortunately, the debate went on from this point ignoring this agreement,
and apparently without anyone having followed the link. The crucial section
in that document is "Implementing Late Binding String Interpolation" at
<http://google-caja.googlecode.com/svn/changes/mikesamuel/string-interpolation-29-Jan-2008/trunk/src/js/com/google/caja/interp/index.html#-autogen-id-14>.
Because much else has changed since then, it can be hard to read this
section and see how it applies to current quasis. The basic idea, put in
terms of the rest of the modern proposal, is that the default quasi handler
captures the live quasi-handler call arguments and returns a stringifiable
record:

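class DefaultQuasi {
  constructor(callSiteID, ...substitutions) {
    // Replay the captured call with whichever handler the forcing
    // context supplies.
    this.forceQuasi = function(quasiHandler) {
      return quasiHandler(callSiteID, ...substitutions);
    };
    // When merely used as a string, fall back to plain concatenation.
    this.toString = function() {
      return stringConcatenator(callSiteID, ...substitutions);
    };
  }
}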

where stringConcatenator has the behavior of the currently specified
default quasi handler.
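
To make the record concrete, here is a minimal sketch of how it might
behave. It assumes that callSiteID is simply the array of literal chunks, as
in a modern tagged template call, and that stringConcatenator does plain
interleaving. The quasi tag and the sample names are hypothetical and
expository only.

```javascript
// Assumed behavior of the default handler: interleave the literal
// chunks with the stringified substitutions, with no escaping.
function stringConcatenator(callSiteID, ...substitutions) {
  let out = callSiteID[0];
  for (let i = 0; i < substitutions.length; i++) {
    out += String(substitutions[i]) + callSiteID[i + 1];
  }
  return out;
}

class DefaultQuasi {
  constructor(callSiteID, ...substitutions) {
    this.forceQuasi = (quasiHandler) =>
      quasiHandler(callSiteID, ...substitutions);
    this.toString = () => stringConcatenator(callSiteID, ...substitutions);
  }
}

// Hypothetical tag standing in for the default quasi handler.
const quasi = (callSiteID, ...substitutions) =>
  new DefaultQuasi(callSiteID, ...substitutions);

const firstName = "Ada";
const lastName = "Lovelace";
const q = quasi`<h1>Welcome ${firstName} ${lastName}!</h1>`;

// Used as a string, the record behaves like ordinary concatenation.
const asString = String(q); // "<h1>Welcome Ada Lovelace!</h1>"

// Forced later, the same captured arguments go to another handler.
const upper = q.forceQuasi((chunks, ...subs) =>
  subs.map((s) => String(s).toUpperCase()).join(" ")); // "ADA LOVELACE"
```

The point of the sketch is only that nothing is decided at the call site:
the same record can be stringified or handed to any handler afterwards.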

This delays the actual quasi processing until forced, at which time the
context of forcing provides the knowledge of which micro-language to use.
Many languages, including HTML, have many different parsing contexts, each
with its own escaping conventions, etc. The start symbol in the grammar for
each of these parsing contexts forms what I am here calling a
micro-language. Because the default quasi handler delays quasi processing
this way, and because the quasi handler for each micro-language is obtained
from the quasi handler of the enclosing macro language, the end programmer
is relieved of the need to remember the names of these micro-languages. If
we wish to make the DOM safer, we can, for example, enhance the innerHTML
setter so that, if its argument is an instance of DefaultQuasi (as
determined by instanceof, a [[Class]] check, or whatever), it forces that
argument with the SAFEHTML quasi handler.
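
As a rough illustration of such a setter, the sketch below uses a stand-in
object rather than a real DOM element. FakeElement, safeHtml, escapeHtml and
the quasi tag are all hypothetical names. The safeHtml here only escapes
every substitution as HTML text, whereas a real SAFEHTML handler would have
to be context-sensitive, as Mike's document describes.

```javascript
// Assumed default behavior: plain interleaving with no escaping.
function stringConcatenator(chunks, ...subs) {
  let out = chunks[0];
  for (let i = 0; i < subs.length; i++) {
    out += String(subs[i]) + chunks[i + 1];
  }
  return out;
}

class DefaultQuasi {
  constructor(callSiteID, ...substitutions) {
    this.forceQuasi = (quasiHandler) =>
      quasiHandler(callSiteID, ...substitutions);
    this.toString = () => stringConcatenator(callSiteID, ...substitutions);
  }
}

// Hypothetical tag standing in for the default quasi handler.
const quasi = (chunks, ...subs) => new DefaultQuasi(chunks, ...subs);

// Simplified escaper for HTML text and quoted attribute values.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Simplified stand-in for SAFEHTML: the literal chunks are trusted,
// every substitution is escaped. Not context-sensitive.
function safeHtml(chunks, ...subs) {
  return stringConcatenator(chunks, ...subs.map(escapeHtml));
}

// Stand-in for a DOM element with an enhanced innerHTML setter.
class FakeElement {
  constructor() {
    this._html = "";
  }
  get innerHTML() {
    return this._html;
  }
  set innerHTML(value) {
    // A delayed quasi is forced with the safe handler; anything else
    // keeps the legacy behavior of being stringified as-is.
    this._html = value instanceof DefaultQuasi
      ? value.forceQuasi(safeHtml)
      : String(value);
  }
}

const header = new FakeElement();
const firstName = "<img onerror='alert(/pwned/)'>";
const lastName = "Smith";
header.innerHTML = quasi`<h1>Welcome ${firstName} ${lastName}!</h1>`;
// header.innerHTML is now
// "<h1>Welcome &lt;img onerror=&#39;alert(/pwned/)&#39;&gt; Smith!</h1>"
```

Note that the call site never names a handler. The setter supplies it,
which is what makes the safe behavior the default.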

This is all in Mike's original quasis proposal above. It is still very much
worth reading, even though some translation is needed to apply it to modern
quasis.

All of the names above are expository only. A real proposal would rename
everything into modern terms, e.g., template strings.

Note that the recent discussion of console.log involved a similar delaying
of printing arguments, though for a different reason.

-- 
    Cheers,
    --MarkM

Received on Sunday, 10 March 2013 15:52:37 UTC