
Re: CSP XML Data with tokens

From: <sird@rckc.at>
Date: Mon, 31 Jan 2011 19:40:11 -0600
Message-ID: <AANLkTimmTYcq9Zk9f+MnmGN868PWLcFKbErOudtruN-X@mail.gmail.com>
To: Michal Zalewski <lcamtuf@coredump.cx>
Cc: Aryeh Gregor <Simetrical+w3c@gmail.com>, "public-web-security@w3.org" <public-web-security@w3.org>
Actually I think the main difference between Michal's point and
sandboxed iframes is that Michal is trying to solve XSS in its
simplest case, while sandboxed iframes try to sandbox HTML.

Michal's point seems to be that

<$untrusted>$user_content</$untrusted>

is easier to get right than

{htmlentities($user_content)}

I don't think I agree with that, since there are several disadvantages
(like getting randomness right, changing HTML just for this, and
backwards compatibility support).

The other case Michal makes is performance. I think it's a valid
point, since an iframe involves creating a new window, a JS scope, and
so on, but I don't think it's such a big problem, or a show stopper
for sandboxed iframes, for these use cases.

For what it's worth, the way to use sandboxed iframes would be (for
stuff with HTML support):
<iframe sandbox="allow-same-origin" seamless
srcdoc="{htmlentities($user_data)}"></iframe>

PHP templates and some others would use:
<iframe sandbox="allow-same-origin" seamless srcdoc="{$user_data |
html}"></iframe>

And this (for html-less content):
{htmlentities($user_data)} / {htmlentities($user_data)}
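As a rough server-side sketch of the two cases above, in Python rather than PHP (html.escape stands in for htmlentities; the function names and the default flag are assumptions for illustration, with the sandbox keyword spelled allow-same-origin as in the HTML specification):

```python
import html


def sandboxed_snippet(user_html, sandbox_flags="allow-same-origin"):
    """Content that may carry markup: ship it in a sandboxed iframe via srcdoc."""
    # Escaping &, <, > and both quote characters keeps the value from
    # breaking out of the double-quoted srcdoc attribute.
    srcdoc = html.escape(user_html, quote=True)
    return f'<iframe sandbox="{sandbox_flags}" seamless srcdoc="{srcdoc}"></iframe>'


def plain_text_snippet(user_text):
    """HTML-less content: plain escaping is enough, no iframe needed."""
    return html.escape(user_text, quote=True)
```

Either way the only thing the template author has to get right is one escaping call per untrusted value; what the content can do once rendered is decided by the sandbox flags, not by server-side sanitization.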

The XML tokens alternative, while it would probably perform better, is
kind of complex (for instance, sandboxed iframes have several flags to
enable/disable certain features, which would have to be emulated in
the XML data tokens case). It is also hard to get right at the
implementation level (I assume that when you add the "untrusted" tag,
the container automatically gets transformed into a CDATA container).
It will also be hard to write the specification. I mean, everything
can be dangerous: a form; style (elements could be made absolute or
fixed to conduct phishing attacks); <base>, <basefont>, <meta>,
<title>, etc. tags would have to be forbidden inside these tokens,
<isindex> as well, and maybe also the <x formaction="..."> attribute.
What about unknown tags? If there's a global namespace on the parent
of the sandboxed content, do its elements apply (e.g. if there's an
xmlns:svg="...", do svg:XXX tags work)? Sorting all this out would add
a considerable level of complexity (which goes away if the content
gets its own document, window and browsing context).
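To give a feel for what such a specification would have to enumerate, here is a small Python sketch of a blocklist check over a fragment. It is not a sanitizer and not part of any proposal: the tag and attribute sets just collect the examples named above (reading "style" as both the element and the attribute is my assumption), and the class and function names are made up.

```python
from html.parser import HTMLParser

# Examples from the discussion above; any real list would be longer.
FORBIDDEN_TAGS = {"base", "basefont", "meta", "title", "isindex", "form", "style"}
FORBIDDEN_ATTRS = {"formaction", "style"}


class BlocklistChecker(HTMLParser):
    """Collects every construct in a fragment that the blocklist rejects."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag in FORBIDDEN_TAGS:
            self.problems.append(f"tag <{tag}>")
        elif ":" in tag:
            # Namespaced tags (svg:XXX) are the open question raised above.
            self.problems.append(f"namespaced tag <{tag}>")
        for name, _value in attrs:
            if name in FORBIDDEN_ATTRS:
                self.problems.append(f"attribute {name} on <{tag}>")


def find_problems(fragment):
    checker = BlocklistChecker()
    checker.feed(fragment)
    checker.close()
    return checker.problems
```

Every new element or attribute that browsers add would need another entry here, which is exactly the maintenance burden that a separate document and browsing context avoids.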

Anyway, if people really think this is the right way to go, I think
the best way to do it would be to use a new namespace (e.g.
<sandbox:TOKENHERE> </sandbox:TOKENHERE>), which would break HTML less
(though it would still be broken :P).

Greetings!!

-- Eduardo




On Mon, Jan 31, 2011 at 6:53 PM, Michal Zalewski <lcamtuf@coredump.cx> wrote:
>> No, authors can't get simple HTML escaping right, but that applies to
>> any case where they have to identify all the specific places where
>> untrusted content is present.
>
> Well, that's sort of speculative; we have given them horrible and
> unintuitive tools to do the job; we are generally not asking them "is
> this untrusted", but "will this contain this type of bad characters".
> It's possible to conclude that this means they can't get it right at
> all, but other explanations are also on the table. That said, I don't
> think there's a whole lot of a point in arguing, especially without
> real data to back any of these gut feelings up ;-)
>
> My only concern is, as noted, that of the three XSS prevention methods
> outlined a few posts ago, I can accept that (1) may not offer benefits
> over (2); but I then don't buy that (3) would, making sandboxed frames
> a no-op from this perspective.
>
>> Why do you think a few dozen iframes with srcdoc will be noticeably
>> slow?  Have you benchmarked?  A few dozen is enough for forum posts or
>> blog comments, for instance.
>
> There are several hundred individual attacker-controlled snippets on
> such a page, typically; punctuated with inline event handlers and so
> forth.
>
>>> 3) The likelihood of messing up base: or srcdoc encoding somewhere is
>>> probably about the same as that of forgetting to escape text in the
>>> first place.
>>
>> Sure.  But simple escaping doesn't let markup through, like bold or
>> links.  The likelihood of messing up srcdoc encoding is vastly, vastly
>> lower than the likelihood of messing up server-side HTML sanitization.
>
> Yes; I never stated that sandboxed frames are useless for this. I
> think it's their strong suit. But it's a very small blip on the XSS
> radar.
>
> /mz
>
>
Received on Tuesday, 1 February 2011 01:41:04 GMT
