
Re: Contextual auto-escaping corner cases

From: Ian Hickson <ian@hixie.ch>
Date: Fri, 15 Mar 2013 15:25:38 -0700
Message-ID: <CAP2znoYuUDSPGsOcWw+DU26+xZrr0pNuuAf5N3sr0x=xFi4GQg@mail.gmail.com>
To: mikesamuel@gmail.com
Cc: "Tab Atkins Jr." <jackalmage@gmail.com>, "public-script-coord@w3.org" <public-script-coord@w3.org>
On Thu, Mar 14, 2013 at 1:46 PM, Mike Samuel <mikesamuel@gmail.com> wrote:

> > What would be autoescaped in something like:
> >
> >    h`<img src="${scheme}://${host}:${port}/${path}/${file}.${ext}"
> >          srcset="${file1} ${w1}w, ${file2} ${w2}w"
> >          alt="${alt}"
> >          data-logger-url="logger?id=${id}&key=1234">`
> >
> > ...? (where h`` is your autoescaper; obviously pretend that part is
> > done however your syntax would really work, and strip newlines if
> > necessary.)
> The parts in the src are all URI encoded.

How? Each part needs a different kind of encoding. How do you know that
${scheme} is supposed to be a scheme? How do you know whether to allow "@"
or ":" in ${host}?

Given that scheme, host, port, path, file, and ext are all already
sanitised and escaped, how do you avoid corrupting the filename,
overescaping "%20" to "%2520"?

Autoescaping on the src="" line here *introduces bugs*.
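
To make the overescaping concrete, here is a minimal sketch. It assumes the autoescaper simply applies JavaScript's built-in encodeURIComponent to every substitution; naiveEscape and the sample values are hypothetical, chosen for illustration, and no real autoescaper is claimed to work exactly this way.

```javascript
// Hypothetical autoescaper step (an assumption): percent-encode every
// substitution blindly, with no knowledge of which URL part it is.
function naiveEscape(value) {
  return encodeURIComponent(String(value));
}

// A file name the author has already percent-encoded.
const file = "my%20photo";
const escapedFile = naiveEscape(file);
console.log(escapedFile); // "my%2520photo" (the "%" itself got encoded)

// A host part in which "@" and ":" are meant to be structural.
const host = "user@example.com:8080";
const escapedHost = naiveEscape(host);
console.log(escapedHost); // "user%40example.com%3A8080"
```

The escaper cannot tell an already-encoded "%20" from a literal percent sign, nor a structural ":" from one that is data, so one policy applied to every hole corrupts at least some of them.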

If you don't autoescape in srcset="" or data-logger-url, then the author
relying on autoescaping means that the code is now vulnerable. All because
autoescaping lulls the author into a false sense of security.

But if you _do_ autoescape in srcset, then how do you know how to do it?
What if it was:

   srcset="${set1} ${set2}"

...where set1 contains "a.png 100w 1x, b.png 100w 2x" and set2 contains
"c.png 1x, d.png 2x"? Do you now escape the spaces, corrupting the data?
What if set1 contains "?" and set2 contains "2x"? Do you escape the "?"?
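
As a rough sketch of the srcset problem, suppose h`` were a tag function that percent-encodes each substitution. The implementation below is an assumption for illustration only, not how any particular autoescaper is known to behave.

```javascript
// Hypothetical stand-in for h``: percent-encodes every substitution.
function h(strings, ...values) {
  let out = strings[0];
  values.forEach((value, i) => {
    out += encodeURIComponent(String(value)) + strings[i + 1];
  });
  return out;
}

const set1 = "a.png 100w 1x, b.png 100w 2x";
const set2 = "c.png 1x, d.png 2x";

// The spaces and commas inside each set are srcset syntax, not URL data,
// but the escaper has no way to know that.
const attr = h`srcset="${set1} ${set2}"`;
console.log(attr);
```

Every space and comma inside the two sets comes out percent-encoded, so the separators between candidates and descriptors are gone and the attribute no longer describes the images the author intended.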

> > Or this:
> >
> >    x`<div style="color: ${colorModeA}"
> >           data-style-mode-a="color: ${colorModeA}"
> >           data-style-mode-b="color: ${colorModeB}"
> >           data-style-mode-c="color: ${colorModeC}"></div>`
> >
> > ...where script switches in the new style="" attribute values dynamically
> > based on e.g. some game state?
> This is no different in principle than the first.  Closure templates
> does not include heuristics for style, since it never showed up in any
> template code -- web devs manipulate the class attribute when they
> want to switch styling.

How is an author supposed to know when something is safe and when it's not,
if the author even thinks about it?

> > How about this:
> >
> >    x`<img width="${width}"
> >           src="${profile.cgi?username=${username}&size=${width}}">
> >      <script>
> >       var x = new Image(${width});
> >       x.src = 'profile.cgi?username=${username}&size=${width}';
> >      </script>`;
> Quite.  We really need an intercession layer for the DOM that lets us
> intercept assignments to sensitive properties and do late-binding of
> escaper to templates.  Yay proxies.

That doesn't answer the question of what happens here.

> > How about:
> >
> >    x`<p>Paste this WLAML command: AB=2%\*2*11*22;GA=${GADATA}*41</p>`
> Social engineering will affect all technical solutions as shown in
> this E4H template
> <>{x}</>
> with
> x = "Paste this into your URL bar : javascript:pwnMe()"

Your response is a non-sequitur. It's an injection attack: the attacker
controls the GADATA part, not the "paste" part.

Suppose WLAML is a language for controlling a CAD/CAM lathe. ${GADATA}
contains the value for a pattern to use in the lathing. It might contain
"*" characters, but they must be escaped with "\" or they'll be
misinterpreted as the speed value, which has been set here to "41" in the
GA parameter (and to 11*22 in the AB parameter).

An author who is used to autoescaping just won't think about the fact that
the autoescaper has no clue what's going on here. So they'll write the code
above, and an attacker can inject "0*99999" as the lathing pattern, and the
unsuspecting victim tries to put that in their lathe, and the lathe spins
so fast that the blade is ejected out of the device and breaks a window.
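
A minimal sketch of the gap, assuming the autoescaper only knows HTML text escaping. WLAML is the made-up language from the example above, and escapeHtml, escapeWlaml, and wlamlCommand are hypothetical helpers. Escaping "\" as well as "*" is my assumption; the example only requires escaping "*".

```javascript
// What an HTML-aware autoescaper would plausibly do in a text context.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Hypothetical WLAML escaper: prefixes "*" and "\" with "\".
function escapeWlaml(pattern) {
  return String(pattern).replace(/[\\*]/g, "\\$&");
}

// Builds the command from the example around an already-escaped pattern.
function wlamlCommand(pattern) {
  return "AB=2%\\*2*11*22;GA=" + pattern + "*41";
}

const GADATA = "0*99999"; // attacker-controlled lathing pattern

// HTML escaping alone leaves "*" untouched, so the injection survives.
const htmlOnly = wlamlCommand(escapeHtml(GADATA));
console.log(htmlOnly); // AB=2%\*2*11*22;GA=0*99999*41

// Only an author who knows about WLAML can add the inner escaping step.
const layered = wlamlCommand(escapeHtml(escapeWlaml(GADATA)));
console.log(layered); // AB=2%\*2*11*22;GA=0\*99999*41
```

The HTML escaper does its job perfectly and the output is still dangerous, because the context that matters here is one it has never heard of.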

I just don't see how any autoescaper can ever get this reliably right often
enough to actually be a net increase in security. IMHO, autoescaping is
actively harmful for security, because it leads authors to rely on
something unreliable, makes it so they think they don't have to understand
what's going on, and makes debugging harder via behind-the-scenes "magic".
Any time you can't tell what's actually going on just by looking at the
source code, you are going to have bugs. This is an area where bugs can
lead to a disaster (such as XSS).

Ian Hickson
Received on Friday, 15 March 2013 22:26:06 UTC
