Re: Contextual auto-escaping corner cases

From: Tab Atkins Jr. <jackalmage@gmail.com>
Date: Thu, 14 Mar 2013 19:11:25 -0700
Message-ID: <CAAWBYDDJFnZsmvQyg-oEg91djH89xVVQu9ABN4Kj96J4_W-m3w@mail.gmail.com>
To: mikesamuel <mikesamuel@gmail.com>
Cc: "public-script-coord@w3.org" <public-script-coord@w3.org>, Ian Hickson <ian@hixie.ch>

On Thu, Mar 14, 2013 at 1:46 PM, Mike Samuel <mikesamuel@gmail.com> wrote:
> 2013/3/12 Tab Atkins Jr. <jackalmage@gmail.com>:
>> Ian provided several examples of code where it seems like it would be
>> impossible to auto-escape properly, and an author relying on
>> auto-escaping because they've been trained that it works elsewhere
>> could be easily misled and inadvertently cause an XSS vulnerability.
>> Could you go over those and answer how you think your ideas for
>> auto-escaping would address the problems he raised?
>
> 2013/3/12 Ian Hickson <ian@hixie.ch>:
>> What would be autoescaped in something like:
>>
>>    h`<img src="${scheme}://${host}:${port}/${path}/${file}.${ext}"
>>          srcset="${file1} ${w1}w ${file2} ${w2}w"
>>          alt="${alt}"
>>          data-logger-url="logger?id=${id}&key=1234">
>>
>> ...? (where h`` is your autoescaper; obviously pretend that part is
>> done however your syntax would really work, and strip newlines if
>> necessary, obviously.)
>
> The parts in the src are all URI encoded.  Any parts that appear after
> a literal '?' or '#' are encoded so as to prevent parameter splitting.

That implies that it's impossible to put in a url with ? or # in it, right?

It doesn't help the srcset at all, even though the browser knows that
it accepts urls.

Are you claiming that a literal ? or # in the data-logger-url case
causes parameter encoding?  Or were you referring solely to the src
part, and the rest is left completely unescaped?
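
To make the ? / # concern concrete, here is a minimal sketch of the
behavior as I read it.  The tag function `u` is hypothetical and simply
percent-encodes every substitution; it is an assumption for
illustration, not the escaper being proposed.

```javascript
// Hypothetical tag function: percent-encodes every substitution.
// An assumption for illustration, not the proposed escaper.
function u(strings, ...values) {
  let out = strings[0];
  values.forEach((value, i) => {
    out += encodeURIComponent(String(value)) + strings[i + 1];
  });
  return out;
}

// The author deliberately supplies a path with a query and a fragment.
const path = 'search?q=cats#top';
const url = u`http://example.com/${path}`;
// url === 'http://example.com/search%3Fq%3Dcats%23top'
```

Under that reading, the ? and # the author intended as URL syntax come
out as inert data, which is the limitation I'm asking about.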

> In the closure-templates and Go versions, we have heuristics to let us
> determine if custom attributes or data-* attributes are URL content.
> This was based on an inspection of template code prior to the
> introduction of contextual auto-escaping, and since Closure templates
> are compiled statically it allows our pen-testers to keep a list of
> known attributes that pass the heuristic and flush out new
> non-standard attributes that don't.

I doubt we want to put in heuristics for a standard escaper that looks
for attribute values where the literal part "looks like" a url.  That
sounds extremely scary, since a relatively small change in what parts
of the url are contained in the literal segment could potentially make
it stop recognizing the value as a url.
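
A rough sketch of the fragility I mean.  The rule in `looksLikeUrl` is
a made-up stand-in (I don't know the actual Closure or Go heuristic);
it only inspects the literal text that precedes the first substitution.

```javascript
// Hypothetical heuristic: treat an attribute value as a URL when the
// literal text before the first substitution contains '://' or '?'.
// This is an illustrative guess, not the Closure or Go rule.
function looksLikeUrl(literalPrefix) {
  return literalPrefix.includes('://') || literalPrefix.includes('?');
}

// data-logger-url="logger?id=${id}&key=1234"
const before = looksLikeUrl('logger?id=');   // recognized as a URL

// data-logger-url="${endpoint}id=${id}&key=1234"
// Same output, but 'logger?' has been moved into a variable.
const after = looksLikeUrl('');              // no longer recognized
```

The page renders identically in both cases, yet only the first one
would get URL escaping applied to `id`.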

>> How about this:
>>
>>    x`<img width="${width}"
>>           src="${profile.cgi?username=${username}&size=${width}}">
>>      <script>
>>       var x = new Image(${width});
>>       x.src = 'profile.cgi?username=${username}&size=${width}';
>>      </script>`;
>
> Quite.  We really need an intercession layer for the DOM that lets us
> intercept assignments to sensitive properties and do late-binding of
> escaper to templates.  Yay proxies.

I don't think you understand this example properly.  The template
creates the img *and* the script.  There's nothing there to late-bind.
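
For illustration, the same value needs different treatment depending
on where the literal text places it, and all of that text is visible
in the template itself.  The two escapers below are simplified
stand-ins, not a complete or vetted set.

```javascript
// Simplified, illustrative escapers; not a complete or vetted set.
const escapeHtmlAttr = s => String(s)
  .replace(/&/g, '&amp;')
  .replace(/"/g, '&quot;')
  .replace(/</g, '&lt;')
  .replace(/>/g, '&gt;');
const escapeUrlParam = s => encodeURIComponent(String(s));

const username = 'a&b"c';

// As plain attribute text, only HTML metacharacters matter.
const asAttr = escapeHtmlAttr(username);    // 'a&amp;b&quot;c'

// After "?username=" the value is a query parameter first.
const asParam = escapeUrlParam(username);   // 'a%26b%22c'
```

Inside the script's string literal the value would additionally need
JS string escaping on top of the URL encoding.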

>> How about:
>>
>>    x`<p>Paste this WLAML command: AB=2%\*2*11*22;GA=${GADATA}*41</p>`
>
> Social engineering will affect all technical solutions as shown in
> this E4H template
>
> <>{x}</>
>
> with
>
> x = "Paste this into your URL bar : javascript:pwnMe()"

I believe the point here was not social engineering, but to point out
something that is thematically similar to a URL, and that thus might
be expected by engineers to be as "safe" as a url is (not needing
manual escaping), when that is actually insecure.  The "paste" part is
irrelevant - just filler text in the example to introduce why there
might be such a command put into page text.
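
A small sketch of why that is insecure.  I'm assuming, from the shape
of the example only, that the command language uses ; and * as
delimiters; the hostile value is hypothetical.

```javascript
// Ordinary HTML text escaper (simplified).
function escapeHtml(s) {
  return String(s)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Hypothetical hostile value; ';' and '*' are assumed to be
// delimiters in the command language.
const GADATA = '1*41;AB=0';
const html = '<p>Paste this WLAML command: AB=2%\\*2*11*22;GA=' +
  escapeHtml(GADATA) + '*41</p>';
// escapeHtml leaves GADATA untouched, so the injected ';AB=0'
// survives into the command the user is told to paste.
```

The markup is perfectly safe as HTML, but the embedded command has
been rewritten by the substituted value.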

~TJ
Received on Friday, 15 March 2013 02:12:12 UTC
