More on XSS mitigation (was Re: XSS mitigation in browsers)

I've renamed the thread to try to disentangle the discussion from the
discussion about clickjacking.  If you're interested in discussing
clickjacking, please use a different thread.

On Thu, Jan 20, 2011 at 7:24 AM, Boris Zbarsky <bzbarsky@mit.edu> wrote:
>> 1) Instead of using HTTP headers, the policy is expressed in HTML.
>
> This leaves the door open for various content-injection attacks that inject
> content before the policy <meta>.  Is the benefit of expressing the policy
> in the same file worth it?

Yes, content injection before the <meta> is definitely a risk of this
approach.  I would certainly encourage authors to put the element as
early as possible in their document.

Various folks I've spoken with have had different opinions on where
to place the policy.  The folks who ask to be able to put the policy
in the document itself seem to fall into two camps.  Either they
don't have control over HTTP headers (e.g., because those are
controlled by their hosting provider or the "ops" team or whatever)
or they're concerned about "header bloat" (e.g., because headers
can't be compressed the way payloads can be compressed).

A third advantage is for settings in which the browser renders HTML
that it receives from a non-HTTP source.  For example, in the Chrome
extension system, authors write HTML, but that HTML is stored in a ZIP
file, not an HTTP server.  In order for these folks to take advantage
of this mechanism, they need a non-HTTP way of expressing the policy.
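To make this concrete, a policy expressed in markup might look
something like the sketch below; the attribute names and directive
syntax here are illustrative, not the proposal's settled syntax:

```html
<!-- Sketch only: attribute names and directive syntax are
     illustrative.  The element should appear as early as possible
     in the document's <head> to narrow the injection window. -->
<meta http-equiv="Content-Security-Policy"
      content="script-src 'self' https://cdn.example.com">
```

The same markup works whether the document arrives over HTTP or is
unpacked from a ZIP file, which is the point of the third advantage
above.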

On Thu, Jan 20, 2011 at 1:47 PM, Brandon Sterne <bsterne@mozilla.com> wrote:
> On 1/19/11 2:42 PM, Adam Barth wrote:
>> I'm not sure if this is the right forum for discussing new browser
>> features that help mitigate cross-site scripting.  If not, please feel
>> free to point me to a better forum.
>
> This is exactly the right forum.  Content Security Policy is the
> top-line deliverable listed in the Web Application Security WG proposed
> charter.
>
>> As I'm sure many of you are aware, various folks from Mozilla have
>> proposed Content Security Policies
>> <https://wiki.mozilla.org/Security/CSP> as a way of improving the
>> security of web pages by including a security policy.  I'm
>> interested in two aspects of CSP:
>>
>> 1) Cross-site scripting mitigation
>> 2) Notification of policy violations
>
> Just to be clear, as Mozilla's CSP proposal hasn't even been discussed
> on this list yet and I want to make sure we miss that opportunity,

Presumably you mean "don't miss"?  :)

> Mozilla is interested in each of the areas outlined in the proposal.
> Apparently, the two areas that Adam is not interested in are:
>
> 3) Non-script resource loading
> 4) Framing restrictions
>
> Since large amounts of thought and discussion have already been invested
> in these areas outside of this mailing list, they should be part of the
> discussion here to decide if they are worthwhile components of Content
> Security Policy.

To be clear, I think those are valuable as well; they're just not at
the top of my priority list.

>> The simplest design I could think of that achieves those goals is
>> described on this wiki page:
>>
>> https://trac.webkit.org/wiki/HTML%20Security%20Policy
>>
>> The design is largely inspired by CSP, but different in a few ways:
>>
>> 1) Instead of using HTTP headers, the policy is expressed in HTML.  Of
>> course, authors will want to place the policy as early as possible in
>> their document, so we're using a meta element, which can be placed in
>> the head of the document.
>
> I don't think the use of HTML tags instead of HTTP headers is
> well-justified.  The obvious drawback to using <meta> tags is that the
> whole model can be subverted by an attacker who manages to inject his
> attack code or bogus policy tag above the site's legitimate policy tag.
>  Mozilla considered the use of <meta> tags as an alternative to the
> header, but we ultimately decided that the risk outlined above outweighs
> the usability gained by allowing the policy to be expressed as a tag.

I certainly agree that there's a trade-off here.  I put some thoughts
on this topic earlier in this message.  I suspect after some amount of
discussion, we could come to agreement on this point one way or the
other.

>> 2) Instead of exposing policy levers for every kind of resource load,
>> this proposal only lets the author control the source scripts.  This
>> focus on scripts is motivated by wanting to prevent the attacker from
>> injecting script into the page.
>
> Providing strong controls around script loading is certainly the primary
> focus of Content Security Policy.  But since we are already here
> discussing the creation of a new security policy framework, perhaps it
> would be beneficial to consider additional policy levers that can help
> sites protect themselves moving forward.
>
> I'm especially interested in future classes of attacks that we haven't
> yet seen or considered which could potentially be mitigated by such
> additional controls.  We can't now claim that these additional levers
> will or won't mitigate the attacks, but if the marginal cost of
> including them is low then we would do well to build them into the model
> from the start.
>
> It is true that the additional controls will lengthen any spec that this
> group eventually produces, but they do not add any significant
> complexity.  Both Mozilla's and Adam's proposals have already introduced
> the concept of trusted domains for script.  This concept is easily
> extended to other types of content.

I think these paragraphs illustrate the core difference in our
perspectives.  I'd like to try for the simplest possible design that
helps mitigate XSS, whereas you're interested in also addressing
"nearby" topics.

> Frame controls, included in Mozilla's CSP proposal as the
> frame-ancestors directive, also seem to have value.  Microsoft would not
> have introduced the X-Frame-Options header if there wasn't a valid use
> case being addressed.  frame-ancestors offers some slight improvements
> over X-Frame-Options by allowing for a list of trusted domains, a more
> granular approach than the DENY or SAMEORIGIN options of X-F-O.
>
> I will say, though, that neither CSP frame-ancestors nor X-F-O fully
> address the clickjacking threat.  They are both improvements over
> script-based framebusting, but they only allow sites to prevent their
> framing.  We have no current solutions for sites that want to be framed
> but don't want to be clickjacked [1].  This is an area I would love to
> see this group delve into.

I certainly agree that there is some amount of value in addressing
these "nearby" topics, but there is also a cost, both in added
complexity and in opportunity cost.

I would like to move forward with the XSS mitigations now and revisit
these other topics in the future.

>> 3) Instead of reporting violations to the server via HTTP, this
>> proposal simply generates a DOM event in the document.  The author of
>> the page can listen for the event and wire it up to whatever analytics
>> the author uses for other kinds of events (e.g., mouse clicks).
>
> Can you explain the motivation for this change?  If sites want to be
> informed of the violations as they occur, won't they set up an event
> handler which sends XHRs back to the server anyway?

Different sites use different mechanisms to understand what's happening
on their pages.  For example, firefox.com uses WebTrends to measure
various events on its web pages.  By generating a JavaScript event for
these security violations, we make it easier for web sites to
integrate these notifications with their existing analytics rather
than having to build a one-off system for this purpose.
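As a sketch of what that wiring might look like: the event name and
field names below are assumptions for illustration, not a settled API.

```javascript
// Hypothetical: the event type ("securityviolation") and its fields
// are illustrative only.  Convert a violation event into a flat
// record that an existing analytics library (WebTrends, etc.) can
// consume alongside mouse clicks and other page events.
function violationToRecord(event) {
  return {
    category: "security-policy-violation",
    blockedURI: event.blockedURI || "(unknown)",
    documentURI: event.documentURI || "(unknown)",
  };
}

// In the page, the author would then listen like for any other event:
//   document.addEventListener("securityviolation", function (e) {
//     analytics.track(violationToRecord(e));
//   });
```

The point is that the site reuses whatever reporting channel it
already has, rather than standing up a one-off violation endpoint.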

>> Let me know if you have any feedback on this proposal.  In general,
>> I'm more interested in feedback that leads to simplification rather
>> than feedback that leads to more complexity.
>
> I agree that unjustified complexity should be avoided.  Additional
> complexity may be warranted, however, if it leads to a more expressive
> and, ultimately, useful model for websites.  Einstein famously said
> "Make things as simple as possible, but not simpler."

Certainly.  I'd like to help web sites mitigate XSS, which means I'd
like to make something as simple as possible for that purpose, but no
simpler.

On Thu, Jan 20, 2011 at 2:49 PM, Michal Zalewski <lcamtuf@coredump.cx> wrote:
> Oh, and I noticed that CSP specifies allowable script locations on
> per-origin basis too.
>
> That would make it vulnerable to the two attacks I mentioned in my
> initial response to Adam, right?

The two are similar in that regard, yes.

[...]

> ...and have my payload execute under CSP and under Adam's proposal. In
> browsers that don't support E4X, this is probably also exploitable in
> many cases, especially with text/plain responses, hosted files, etc -
> just marginally harder.

I think many folks agree that E4X is a mistake.  My understanding is
that Mozilla intends to remove it from Firefox.

> This can be fixed by strictly enforcing Content-Type. But it won't
> help you with another case:
>
> http://allowed_origin/some_unrelated_public_js_api?callback=attacker_selected_function
>
> ...which is intended to be included on third-party sites via <script
> src=...>, and returns application/x-javascript structured this way:
>
> attacker_selected_function('api_provided_values')
>
> Such APIs are common on a variety of sites. Google Maps are probably a
> good example, but I also expect Twitter, Facebook, etc, to have
> something along these lines.
>
> Am I correct on this?

These are certainly issues to worry about.
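To spell out the JSONP concern with a minimal sketch (the endpoint
and function names here are hypothetical): an origin-based whitelist
cannot distinguish a site's legitimate scripts from a JSONP response
whose callback name the attacker chose.

```javascript
// Hypothetical JSONP endpoint on a whitelisted origin: it echoes the
// caller-supplied callback name into an executable script body.
function jsonpResponse(callbackName, data) {
  return callbackName + "(" + JSON.stringify(data) + ");";
}

// A legitimate embedder requests /api?callback=handleData and gets:
//   handleData({"ok":true});
// An attacker who can inject markup instead injects:
//   <script src="https://allowed.example/api?callback=evilFunction">
// and the whitelisted origin itself serves the attacker's call:
//   evilFunction({"ok":true});
```

Because the response arrives from an allowed origin with a JavaScript
content type, a purely per-origin policy treats it as trusted.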

> I honestly think we should be putting a lot more emphasis on
> understanding actual use cases in complex environments for any
> security mechanisms proposed; coming up with unified frameworks,
> rather than disjointed solutions for small subsets of problems (CSP is
> a step in a good direction, but has some shortcomings); and engaging a
> far broader security community... I know this is not a productive
> complaint, and probably not a welcome one, but... :-)

One thing I'd like to get out of having this discussion here is
engagement with a broader security community, for exactly these
reasons.

On Thu, Jan 20, 2011 at 3:26 PM, Michal Zalewski <lcamtuf@coredump.cx> wrote:
>> <http://www.thespanner.co.uk/2009/11/23/bypassing-csp-for-fun-no-profit/>
>
> Yeah, we were also unhappy with E4X for other reasons:
>
> http://code.google.com/p/doctype/wiki/ArticleE4XSecurity
>
> ...but E4X is not the root issue here, it just makes this vector a bit
> more convincing.

Whether or not it's the "root" issue is debatable.  However, E4X
breaks an important security invariant of the web, which is why it's
so useful in constructing attacks.

On Fri, Jan 21, 2011 at 10:02 AM, Lucas Adamski <ladamski@mozilla.com> wrote:
> Yes, excellent point, and that is something we have been doing at
> Mozilla.  We spent a lot of time communicating with as much of the
> community and as many of the larger websites as we could get our
> hands on, and in many respects CSP is a reflection of those
> conversations.  We ended up putting many of the controls in there
> and improving on the reporting functionality as a direct result of
> that feedback.  That said, I think participating in the standards
> process is the next logical step in obtaining more feedback to
> improve the model further.
>
> On a conceptual level, I am not really a believer in the current
> proliferation of orthogonal atomic mechanisms intended to solve very
> specific problems.  Security is a holistic discipline, and so I'm a
> big supporter of investing in an extensible declarative security
> policy mechanism that could evolve as the web and the threats that
> it faces do.  Web developers have a hard enough time with security
> already without being expected to master a potentially large number
> of different security mechanisms, each with their own unique threat
> model, implementation and syntax.  Not to mention trying to figure
> out how they're expected to interact with each other... how to
> manage the gaps and intersections between the models.

Security is a multi-faceted topic that we're unlikely to solve all at
once.  One approach that might work well is to have in mind a broader
vision of where this security mechanism can go, but to make progress
one step at a time.  This general approach has worked well for HTML,
for example.  The first version of HTML didn't have every conceivable
feature under the sun.  It started small and over time grew to be able
to handle more and more use cases.

Adam

Received on Friday, 21 January 2011 22:21:16 UTC