Re: HTML imports: new XSS hole?

As with any new feature, there's the risk of introducing new security bugs
in applications that otherwise wouldn't have them. The usual argument goes
as follows:

Browser vendors have a lot of undocumented functionality, and it would be
foolish to take a blacklist approach to content filtering, since you
can't possibly know everything the browser does.

A common counter-argument is that in some cases you can simply ask the
browser vendors "is doing X safe?", and they will be able to say yes.

This is the perfect example of that.

Say you have a website, and you have a whitelist-based content filter. You
want to allow your users to run arbitrary CSS, so you HTML-sanitize their
content, and allow <link> tags. You thoroughly check that the user's
browser is the latest version of Chrome or Firefox, and even, before load,
you do a runtime check to ensure that it is up to date and safe.

Now, CSS nowadays in modern browsers (even Opera) is relatively safe
against JavaScript injection attacks. Sure, there are bugs every once in a
while, but browsers have been killing those features slowly and steadily.

So, this guy (let's call him Mark) goes to BlackHat and finds the "security
guy" from Firefox and the "security guy" from Chrome, and hey, why not,
even a "web security guy" from Internet Explorer. Let's call them Dave,
Chris and Jesse. And he asks them during the Microsoft Party: "hey guys,
I want to make the internet more fun and allow people to run arbitrary CSS.
If I make sure to strip all PII from the document I'm injecting the CSS
into, there shouldn't be any way for the user to attack other parts of my
web app, right???". And they all look at each other, think "what is this
guy doing? and why doesn't he have a wristband?" and eventually say "you
should use seamless iframe sandboxes". And he goes home, and builds a big
company based on that promise.

Now, fast-forward to 2014. Turns out, he used iframe sandbox="allow-scripts
allow-same-origin" because he wanted to attach an event handler to the
sandboxed iframe content from the outer document, and until today, that
would have been safe, because there was no PII to leak from that site
(perhaps.. visited state for links in some browsers?). He also foolishly
assumed that since <link> tags can only *really* be used for CSS, he didn't
have to check for "rel", since, well, you know, CSS was "the worst thing
you could do" from an iframe. He knew to remove secret data from the
document, since he read the existing literature and he learnt that you can
exfiltrate data with CSS, but he saw, as mentioned online many times, that
CSS-based XSS doesn't yield JavaScript execution anymore in modern browsers.

Now, fast-forward to 2015. Some guy, let's call him Mario, documents this
feature on a website, say, html5sec.org, and another guy, let's call him
Alex, is bored one weekend, and decides to, well, go on a rampage and own
Mark's website, bringing it to its knees. Mark, baffled, can't understand what
happened. He literally followed the security advice from 3 of the top
browser vendors. WHYYY!?!?

He would be right to be upset. I think. We can't really expect Mark to know
about all obscure browser features and how the rest of the internet has to
evolve around them.

Well, turns out he is NOT right. Mark made three mistakes:
 1. He went to BlackHat seeking security advice. BlackHat isn't really the
place you go to learn about secure coding practices. Also, you
shouldn't go to a party that requires you to wear a wristband.
 2. He misused iframe@sandbox. allow-same-origin and allow-scripts probably
shouldn't be allowed together.. they make little sense (or if they are
allowed together, they should be making it clearer that all security
benefits went down the toilet).
 3. Finally, and most importantly, he designed a security feature by
himself, and decided not to keep up to date (put another way, he
should have subscribed to some of these mailing lists).
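
To make mistake 2 concrete, here is a minimal Python sketch of a check one
could run on a sandbox attribute value before emitting the iframe. The
function name sandbox_is_effective is made up for illustration; the token
names are the real sandbox keywords. It flags the combination because, when
the framed content shares the embedder's origin, a frame that keeps both
scripting and its real origin can reach into the parent and simply remove
the sandbox attribute.

```python
def sandbox_is_effective(sandbox_value: str) -> bool:
    """Return False when the token set cancels out the sandbox.

    Hypothetical helper, not part of any library.
    """
    # The sandbox attribute is a space-separated token list whose
    # keywords are matched case-insensitively.
    tokens = {token.lower() for token in sandbox_value.split()}
    # Scripts plus the real origin means the framed document can
    # script its way out, so treat that pair as "no sandbox at all".
    return not {"allow-scripts", "allow-same-origin"} <= tokens
```

Mark's value, "allow-scripts allow-same-origin", fails this check, while
"allow-scripts" on its own passes.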

Now, I'm not sure how many have tried to implement an HTML sanitizer. Even
a whitelist-based one has the following problems:
 1. You have to write a parser or you have to use a third-party parser.
  1.1 The problem is that writing your own is a headache, and you will
get it wrong. If you use a third-party parser, it'll most likely try to be
as lenient and flexible as possible, accepting malformed input (for
"compatibility" yay!).
 2. You have to get a serializer.
  2.1 This is way harder than the parser. Even browsers get it wrong (and
the FSM shall bless you if you need to write a serializer for CSS).
 3. You need a sane whitelist.
  3.1 And the whitelist, apparently, needs to be aware of not just
<tag/attribute> pairs, but also <tag/attribute + rel="stylesheet"> geez!

That is to say, no one really has a good HTML sanitizer. Everyone either
over-protects, or has XSS. Possibly a mixture of both. So what can poor
Mark do?

I personally think that HTML imports are a nice feature (and I mean, I'm
legitimately happy about it, they sound pretty cool), and I think they
should launch and get implemented and such.

However, we need to get a solution for the poor Marks of the world. There
are a lot of problems that people out there are trying to solve, and they
sometimes need to push things a bit to the extreme. At the moment the
"Security Guarantees" provided by different APIs are spread across the
specs and mostly live in random people's heads. If we could codify
what *is* meant to be safe as an organization, then next time Mark goes to
a party, he can be told to read the 10 commandments of HTML sanitization
(and yes, I'm saying that "go use HTML Purifier" is the wrong answer).
Let's go one step further, and do this for other things! And whenever we
launch a new feature, we can go see that document. If it breaks the 10
commandments, then, well, it's a sin.

Received on Tuesday, 3 June 2014 03:18:45 UTC