Re: E4H and constructing DOMs from Ian Hickson on 2013-03-12 (public-script-coord@w3.org from January to March 2013)

From: Ian Hickson <ian@hixie.ch>
Date: Tue, 12 Mar 2013 20:16:07 +0000 (UTC)
To: Mike Samuel <mikesamuel@gmail.com>
cc: "public-script-coord@w3.org" <public-script-coord@w3.org>
Message-ID: <Pine.LNX.4.64.1303121943430.15713@ps20323.dreamhostps.com>

On Mon, 11 Mar 2013, Mike Samuel wrote:
> 2013/3/11 Ian Hickson <ian@hixie.ch>:
> > On Mon, 11 Mar 2013, Mike Samuel wrote:
> >>
> >> Ok.  So it's not a goal of E4H to be safe against XSS by default 
> >> then.
> >
> > Autoescaping isn't safe by default either, by that definition.
> 
> URLs are kind of a large hole, and, yes, contextual auto-escaping is 
> safe by that definition.

What would be autoescaped in something like:

   h`<img src="${scheme}://${host}:${port}/${path}/${file}.${ext}"
         srcset="${file1} ${w1}w ${file2} ${w2}w"
         alt="${alt}"
         data-logger-url="logger?id=${id}&key=1234">`

...? (where h`` is your autoescaper; pretend that part is done however 
your syntax would really work, and strip newlines if necessary.)

Or this:

   x`<div style="color: ${colorModeA}"
          data-style-mode-a="color: ${colorModeA}"
          data-style-mode-b="color: ${colorModeB}"
          data-style-mode-c="color: ${colorModeC}"></div>`

...where script switches in the new style="" attribute values dynamically 
based on e.g. some game state?

How about this:

   x`<img width="${width}"
          src="profile.cgi?username=${username}&size=${width}">
     <script>
      var x = new Image(${width});
      x.src = 'profile.cgi?username=${username}&size=${width}';
     </script>`;

How about:

   x`<p>Paste this WLAML command: AB=2%\*2*11*22;GA=${GADATA}*41</p>`

The utter lack of escaping in the cases above should set off alarm bells, 
but to authors who have been desensitised due to autoescaping, it'll look 
perfectly safe and we'll have a bunch of XSS (or other injection bugs, as 
in the last case) on our hands.
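
As a minimal sketch of the data-style-mode-* case above (escapeHtml is a 
hypothetical helper written for illustration, not part of E4H or of any 
autoescaping proposal, and the hostile value is made up): an escaper that 
only knows the value sits inside an attribute will HTML-escape it, and 
that leaves a CSS payload untouched for script to copy into style="" 
later.

```javascript
// Hypothetical helper: escapes the characters that are significant in
// HTML text and in quoted attribute values. Illustration only.
function escapeHtml(s) {
  return String(s)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// A made-up hostile "colour" value. It contains no HTML metacharacters,
// so HTML-escaping has nothing to change.
const colorModeA = 'red; background: url(//evil.example/track)';

// What an attribute-context escaper would emit for a data-* attribute.
const markup =
  '<div data-style-mode-a="color: ' + escapeHtml(colorModeA) + '"></div>';

// The payload survives intact; once script assigns the attribute value
// to style="", it is parsed as CSS, not as HTML.
console.log(markup);
```

The escaping is correct for the HTML layer and says nothing about the CSS 
layer the value ends up in, which is the part the author still has to 
think about.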


> What do you and Adam mean by "safe" when you say "safe by default"?

I was just using it in the way that you used it. I would be fine with not 
using the term at all.


> > E4H's design goals were:
> >
> >  - to provide compile-time syntax checking for in-script DOM tree creation
> 
> A laudable goal.
> Contextual auto-escapers provide some level of this.

I haven't seen any proposal that requires browsers to fail to compile code 
that contains syntactically incorrect fragments. Do you have an example of 
what you mean? Which proposal does that?


> > [...]
> >     * avoid using the HTML parser
> 
> I understand the first two goals.  The last seems to be confusing a 
> design choice with a design goal since not using an available tool is 
> rarely something of direct benefit to the end user.

The HTML parser is an utter disaster. It's slow, it's big, it's 
ridiculously complicated. It does stuff you'd never guess at without an 
intimate knowledge of the requirements. Using it is not a feature.


> >  - to have good security characteristics:
> >     * provide a model that is conceptually simple
> >     * allow arbitrary strings to be embedded in DOM trees in a way that
> >       does not allow arbitrary elements or attributes to be created
> 
> If even
>     <a href="{...}">
> is a foot gun then I think it fails at this goal.

Which goal does it fail? The model is simple, and you can't create 
arbitrary elements or attributes. Obviously if you're inserting a string 
into a context where it will be parsed, you have to make sure it's valid 
data, but whitelisting like that is elementary, and applies in all cases, 
including many where there's just no way you could autoescape because the 
data/syntax you're inserting into is app-specific.
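
A minimal sketch of that kind of whitelisting for the href case (safeHref 
and the list of schemes are hypothetical, chosen for illustration; it 
assumes a URL constructor is available in the environment):

```javascript
// Hypothetical allowlist of URL schemes; an application would pick its own.
const ALLOWED_PROTOCOLS = new Set(['http:', 'https:', 'mailto:']);

// Hypothetical helper: returns the absolute URL if its scheme is on the
// allowlist, or null if the value is unparseable or uses another scheme.
function safeHref(value, base) {
  let url;
  try {
    url = new URL(value, base);
  } catch (e) {
    return null; // not parseable as a URL at all
  }
  return ALLOWED_PROTOCOLS.has(url.protocol) ? url.href : null;
}

console.log(safeHref('/users/ian', 'https://example.com/'));
console.log(safeHref('javascript:alert(1)', 'https://example.com/'));
```

The check is on the parsed scheme rather than on the raw string, so 
variations in case or leading whitespace do not slip past it. The same 
shape (parse, then compare against a list of known-good values) applies to 
app-specific syntaxes that no generic escaper could know about.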

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Tuesday, 12 March 2013 20:16:32 UTC