Re: CSP XML Data with tokens from Aryeh Gregor on 2011-02-01 (public-web-security@w3.org from February 2011)

From: Aryeh Gregor <Simetrical+w3c@gmail.com>
Date: Mon, 31 Jan 2011 19:30:20 -0500
To: Michal Zalewski <lcamtuf@coredump.cx>
Cc: "public-web-security@w3.org" <public-web-security@w3.org>
Message-ID: <AANLkTikydL3vm5yK1Ogz1Kwof_1bqxwveC=awmCtpoRM@mail.gmail.com>

On Sun, Jan 30, 2011 at 3:54 PM, Michal Zalewski <lcamtuf@coredump.cx> wrote:
> The whole discussion here started with a discussion over a variety of
> "simple XSS" prevention mechanisms (and I'm not entirely sure why we
> drifted toward sandboxed frames, which I think aren't a very good
> fit). The implicit assumption here - backed with empirical data, by
> the way - is that people can't get "simple" HTML escaping right
> (especially since it gets progressively less simple in cases such as
> JS in inline on* handlers).

No, authors can't get simple HTML escaping right, but that applies to
any case where they have to identify all the specific places where
untrusted content is present.  This applies to replacing untrusted
content by a base64 entity or marking it with special tags or whatever
-- it won't help.  Either you have to improve the authoring tools
(like PHP), or you have to take a different approach (like CSP's
site-wide declarative syntax, which has its own problems).

Actually, I think the best way to fight XSS in practice would probably
be to improve PHP.

On Sun, Jan 30, 2011 at 9:55 PM, Michal Zalewski <lcamtuf@coredump.cx> wrote:
> 1) It's slow and probably will remain so for the foreseeable future -
> because even if you don't need to benefit from the full JS / DOM
> isolation, you can't opt out of it to get an equivalent of <span> that
> does not execute scripts or does not load images. This may be improved
> in future generations of browsers, but the cost of including tons of
> IFRAMEs on a page is likely to be prohibitive. Porting some of the
> security features of sandboxed frames to <span> would be interesting,
> but is not happening; instead, we are probably getting .safeInnerHTML,
> which gives you a JS-only solution.

Why do you think a few dozen iframes with srcdoc will be noticeably
slow?  Have you benchmarked?  A few dozen is enough for forum posts or
blog comments, for instance.

> 2) In these most basic uses, the convenience of srcdoc: / data: just
> isn't there, unless you get strong framework-level support, which is
> definitely not given (and even then, debugging resulting pages is
> painful, there is size overhead for base64).

This doesn't seem like an issue at all with srcdoc.  You just have to
include the content as normal, HTML-escaped.  Syntax highlighters
won't pick it up, but that's about it.  Browsers should even show you
the DOM for the iframe just like normal in their Firebug equivalents.
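
As a rough illustration of "include the content as normal, HTML-escaped",
here is a minimal Python sketch.  The helper name sandboxed_iframe and the
sample comment are hypothetical, and the sketch assumes the server emits
the iframe markup as a plain string rather than through a template engine:

```python
import html


def sandboxed_iframe(untrusted_html: str) -> str:
    """Wrap untrusted markup in an <iframe sandbox srcdoc="...">.

    Hypothetical helper for illustration only. The one step that matters
    is escaping the content once, as an attribute value.
    """
    # quote=True also escapes " and ', so the content cannot close the
    # srcdoc attribute early.
    escaped = html.escape(untrusted_html, quote=True)
    return f'<iframe sandbox srcdoc="{escaped}"></iframe>'


# Sample forum comment containing both harmless markup and a script.
comment = '<b>Nice post!</b> <script>alert("xss")</script>'
print(sandboxed_iframe(comment))
```

The markup is passed through unsanitized; the intent is that the sandbox
attribute, not the server, keeps the script from running, so the author
only has to get one escaping call right.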

> 3) The likelihood of messing up base: or srcdoc encoding somewhere is
> probably about the same as that of forgetting to escape text in the
> first place.

Sure.  But simple escaping doesn't let markup through, like bold or
links.  The likelihood of messing up srcdoc encoding is vastly, vastly
lower than the likelihood of messing up server-side HTML sanitization.
<iframe sandbox> is meant for cases where you want to allow some
markup, so you can't just completely escape the input.

Received on Tuesday, 1 February 2011 00:31:14 UTC