- From: Mike West <mkwst@google.com>
- Date: Tue, 18 Oct 2016 10:05:23 +0200
- To: Artur Janc <aaj@google.com>
- Cc: Devdatta Akhawe <dev.akhawe@gmail.com>, "public-webappsec@w3.org" <public-webappsec@w3.org>, Christoph Kerschbaumer <ckerschbaumer@mozilla.com>, Frederik Braun <fbraun@mozilla.com>, Scott Helme <scotthelme@hotmail.com>, Lukas Weichselbaum <lwe@google.com>, Michele Spagnuolo <mikispag@google.com>, Jochen Eisinger <eisinger@google.com>
- Message-ID: <CAKXHy=f30cUWV1QOFO2VBN3yUUSf+3s6eX9647GR9f4oBtgwZw@mail.gmail.com>
On Tue, Oct 18, 2016 at 1:03 AM, Artur Janc <aaj@google.com> wrote:

> On Mon, Oct 17, 2016 at 7:15 PM, Devdatta Akhawe <dev.akhawe@gmail.com> wrote:
>
>> Hey
>>
>> In the case of a third-party script having an error, what are example leaks you are worried about?

The same kinds of issues that lead us to sanitize script errors for things loaded as CORS cross-origin scripts: https://html.spec.whatwg.org/#muted-errors. If the resource hasn't opted in to being same-origin with you, script errors leak data you wouldn't otherwise have access to.

> Thanks for the summary, Mike! It's a good overview of the issue, but I'd like to expand on the reasoning for why including the prefix of an inline script doesn't sound particularly scary to me.

Thanks for fleshing out the counterpoints, Artur!

> Basically, in order for this to be a concern, all of the following conditions need to be met:
>
> 1. The application has to use untrusted report collection infrastructure. If that is the case, the application is already leaking sensitive data from page/referrer URLs to its collector.

"Trusted" to receive URLs doesn't seem to directly equate to "trusted" to store sensitive data. If you're sure that you don't have sensitive data on your pages, great. But you were also presumably "sure" that you didn't have inline script on your pages, right? :)

> In fact, I'd be much more worried about URLs than script prefixes, because URLs leak on *any* violation (not just for script-src) and URLs frequently contain PII or authorization/capability-bearing tokens, e.g. for password reset functionality.

We've talked a bit about URL leakage in https://github.com/w3c/webappsec-csp/issues/111. I recall that Emily was reluctant to apply referrer policy to the page's URL vis-à-vis the reporting endpoint, but I still think it might make sense.

> 2. The application needs to have a script which includes sensitive user data somewhere in the first N characters. FWIW, in our small-scale analysis of a few hundred thousand reports we saw ~300 inline script samples sent by Firefox (with N=40) and haven't found sensitive tokens in any of the snippets.

Yup. I'm reluctant to draw too many conclusions from that data, given the pretty homogeneous character of the sites we're currently applying CSP to at Google, but I agree with your characterization of the data. Scott might have more data from a wider sampling of sites, written by a wider variety of engineering teams (though it's not clear that the terms of that site would allow any analysis of the data).

> 3. The offending script needs to cause a CSP violation, i.e. not have a valid nonce, meaning that the application is likely broken if the policy is in enforcing mode.

1. Report-only mode exists. 2. Embedded enforcement might make it more likely that XSS on a site could cause policy to be inadvertently applied to itself or its dependencies. We talked about this briefly last week, and I filed https://github.com/w3c/webappsec-csp/issues/126 to ponder it. :)

> As a security engineer, I would consider #1 to be the real security boundary -- a developer should use a CSP collector she trusts because otherwise, even without script-sample, reports contain data that can compromise the application.

That sounds like an argument for reducing the amount of data in reports, not for increasing it. I think it's somewhat rational to believe that reporting endpoints are going to have longer retention times and laxer retention policies than application databases. Data leaking from the latter into the former seems like a real risk. I agree that the URL itself already presents risks, but I don't understand how that's a justification for accepting more risk.
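For concreteness, the record at issue looks roughly like the sketch below (field names follow the standard CSP report format, plus the script-sample field Firefox already sends with the first ~40 characters; every value here is invented):

    {
      "csp-report": {
        "document-uri": "https://example.com/reset-password?token=1a2b3c4d",
        "referrer": "https://mail.example.com/inbox",
        "violated-directive": "script-src 'nonce-r4nd0m'",
        "script-sample": "var user = {name: 'Jane', session: 'e5f6"
      }
    }

The document-uri and referrer already go out to the collector on every violation; the script-sample prefix is the only addition we're debating.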
> I can easily imagine scripts that violate conditions #2 and #3, but at the same time we have not seen many examples of such scripts so far, nor have people complained about the script-sample data already being included by Firefox (AFAIK).

People are generally unlikely to complain about getting more data, especially when the data's helpful and valuable. That can justify pretty much anything, though: lots of people think CORS is pretty restrictive, for instance, and probably wouldn't be sad if we relaxed it in various ways.

> Overall, I don't see the gathering of script samples as qualitatively different to the collection of URLs. However, if we are indeed particularly worried about script snippets, we could make this opt-in and enable the functionality only in the presence of a new keyword (report-uri /foo 'report-script-samples') and add warnings in the spec to explain the pitfalls. This way even if I'm wrong about all of the above we would not expose any data from existing applications.

I suspect that such an option would simply be copy-pasted into new policies, but yes, it seems like a reasonable approach.

> For some background about why we're even talking about this: currently violation reports are all but useless for both debugging and detection of the exploitation of XSS due to the noise generated by browser extensions.

I agree that this is a problem that we should solve. One way of solving it is to add data to the reports. Another is to invest more in cleaning up the reports that you get so that there's less noise. I wish browser vendors (including Chrome) spent more time on the latter, as we're actively harming users by not doing so.

-mike
Received on Tuesday, 18 October 2016 08:06:18 UTC