Re: CSP XML Data with tokens from Michal Zalewski on 2011-01-31 (public-web-security@w3.org from January 2011)

From: Michal Zalewski <lcamtuf@coredump.cx>
Date: Sun, 30 Jan 2011 18:55:20 -0800
To: "sird@rckc.at" <sird@rckc.at>
Cc: Giorgio Maone <g.maone@informaction.com>, Adam Barth <w3c@adambarth.com>, Gareth Heyes <gazheyes@gmail.com>, Devdatta Akhawe <dev.akhawe@gmail.com>, Brandon Sterne <bsterne@mozilla.com>, "public-web-security@w3.org" <public-web-security@w3.org>
Message-ID: <AANLkTi=iqWUiz=k_Aoa3RLmCjzOK2ZwRt9w7CFt3c9xF@mail.gmail.com>

> It is backwards compatible, it's html encoded.. which makes it a no-op
> on old UAs.

In that case, this undermines the premise of this thread (if there is
any ;-): you still need to apply a transformation to the displayed
text. So, this is still not a tool that helps you with the most
rudimentary class of XSS vulnerabilities, which can in principle
already be prevented by proper use of encoding, but just aren't.
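
To make "proper use of encoding" concrete, here is a minimal sketch in
Python (chosen only for illustration). The helper name render_comment
is hypothetical and not something proposed in this thread; html.escape
is the standard library call.

```python
import html


def render_comment(user_text: str) -> str:
    """Hypothetical helper: place untrusted text inside a <p> element."""
    # html.escape rewrites &, < and >; with quote=True (the default) it
    # also rewrites " and ', so the result is safe in attribute values too.
    return "<p>" + html.escape(user_text, quote=True) + "</p>"


print(render_comment("<script>alert(1)</script>"))
# prints: <p>&lt;script&gt;alert(1)&lt;/script&gt;</p>
```

The transformation is trivial; the vulnerabilities come from call sites
where it is forgotten, which is the point being made above.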

> Either way, given that there are viable alternatives, which are
> already defined and were discussed at length at the HTML WG (and were
> changed a few times already to fit more use cases). Is it really worth
> creating yet another one just because UAs may consume more CPU?
> [...]
> Because sandbox iframes are supposed to solve XSS by providing authors
> the tools to sandbox HTML correctly.

My impression is that the purpose of sandboxed IFRAMEs is, as the name
implies, sandboxing HTML documents, so that:

1) They can't navigate the top-level window or mess with it in other
ways when not same origin (useful for gadgets and ads),

2) They are isolated even if same-origin, which removes some of the
burden associated with building HTML sanitizers; although it's worth
noting that this aspect can already be approximated by just using a
separate domain for non-sanitized HTML content, and that it's not
backward compatible with MSIE6.

The niche case of HTML sanitizers aside, I do not see the immediate
applicability of sandboxed frames to straightforward (i.e.,
text-based) XSS prevention for inline contents of the page. While they
can be abused this way (srcdoc or data:), I honestly think this method
is not particularly useful (as I already mentioned on whatwg and
elsewhere). There are three reasons for this:

1) It's slow and probably will remain so for the foreseeable future -
because even if you don't need to benefit from the full JS / DOM
isolation, you can't opt out of it to get an equivalent of <span> that
does not execute scripts or does not load images. This may be improved
in future generations of browsers, but the cost of including tons of
IFRAMEs on a page is likely to be prohibitive. Porting some of the
security features of sandboxed frames to <span> would be interesting,
but is not happening; instead, we are probably getting .safeInnerHTML,
which gives you a JS-only solution.

2) In these most basic uses, the convenience of srcdoc / data: just
isn't there, unless you get strong framework-level support, which is
definitely not a given (and even then, debugging the resulting pages
is painful, and base64 adds size overhead).

3) The likelihood of messing up data: or srcdoc encoding somewhere is
probably about the same as that of forgetting to escape text in the
first place.
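
Points 2 and 3 can be sketched as follows, again in Python purely for
illustration. All function names are hypothetical; this only shows the
two encodings a page author would have to get right, and is not a
recommended implementation.

```python
import base64
import html


def sandboxed_srcdoc(untrusted_html: str) -> str:
    """Hypothetical helper: embed untrusted HTML via the srcdoc attribute."""
    # srcdoc is an ordinary attribute value, so the markup still has to be
    # attribute-escaped; forgetting this step reintroduces the injection.
    escaped = html.escape(untrusted_html, quote=True)
    return '<iframe sandbox srcdoc="' + escaped + '"></iframe>'


def sandboxed_data_uri(untrusted_html: str) -> str:
    """Hypothetical helper: embed untrusted HTML via a base64 data: URL."""
    payload = base64.b64encode(untrusted_html.encode("utf-8")).decode("ascii")
    return ('<iframe sandbox src="data:text/html;charset=utf-8;base64,'
            + payload + '"></iframe>')


def base64_overhead(untrusted_html: str) -> float:
    """Ratio of base64-encoded size to raw UTF-8 size (input must be non-empty)."""
    raw = untrusted_html.encode("utf-8")
    return len(base64.b64encode(raw)) / len(raw)


print(sandboxed_srcdoc('"><script>alert(1)</script>'))
print(sandboxed_data_uri("<b>hi</b>"))
print(base64_overhead("x" * 30))
```

Either way there is still an encoding step to forget, and the base64
form costs roughly a third more bytes (four output characters per
three input bytes) while being unreadable when debugging.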

This is not to say that sandboxed frames are bad; but that yeah, if we
want to make an impact for simple XSS, there are other things that
probably need to be done, and sandboxed frames are unlikely to matter
there. Since XML is not receptive to any inline approach, this is more
of a pipe dream than a meaningful discussion.

The SPDY-like approach of sending a parsed DOM tree to the browser,
instead of serialized HTML, makes sense for performance reasons, and
will probably be eventually seriously researched on these grounds;
getting something in at that point might be a worthy pursuit. But I
don't think it's going to be welcome as a security solution alone.

/mz

Received on Monday, 31 January 2011 02:56:13 UTC