Re: CSP reports: `script-sample` from Artur Janc on 2016-10-17 (public-webappsec@w3.org from October 2016)

From: Artur Janc <aaj@google.com>
Date: Tue, 18 Oct 2016 01:03:25 +0200
To: Devdatta Akhawe <dev.akhawe@gmail.com>
Cc: Mike West <mkwst@google.com>, "public-webappsec@w3.org" <public-webappsec@w3.org>, Christoph Kerschbaumer <ckerschbaumer@mozilla.com>, Frederik Braun <fbraun@mozilla.com>, Scott Helme <scotthelme@hotmail.com>, Lukas Weichselbaum <lwe@google.com>, Michele Spagnuolo <mikispag@google.com>, Jochen Eisinger <eisinger@google.com>
Message-ID: <CAPYVjqocBc97b1sCde9V1VTfzD_LbSaG2GAkvHkDjATUhfmy2g@mail.gmail.com>

On Mon, Oct 17, 2016 at 7:15 PM, Devdatta Akhawe <dev.akhawe@gmail.com>
wrote:

> Hey
>
> In the case of a third-party script having an error, what are example
> leaks you are worried about?
>
> cheers
> Dev
>
> On 17 October 2016 at 07:37, Mike West <mkwst@google.com> wrote:
>
>> +Lukas, Miki, and Jochen, who I intended to add but didn't.
>>
>> -mike
>>
>> On Mon, Oct 17, 2016 at 4:15 PM, Mike West <mkwst@google.com> wrote:
>>
>>> Hello, webappsec!
>>>
>>> Last week, Artur, Christoph, Freddy, and I had a conversation about
>>> reporting more detail for script-based violations. In particular, Artur and
>>> his team at Google would love for Firefox's `script-sample` report field to
>>> be implemented in Chrome (and extended in Firefox) to help them diagnose
>>> inline event handlers, scripts, etc. The conversation was a bit
>>> contentious, so bringing it to the list might be interesting, as I expect
>>> folks to have opinions. I have a few myself.
>>>
>>> I think there was general agreement on two points:
>>>
>>> 1. It's totally reasonable to provide developers with more information
>>> about violations on their own pages. They can already crawl the DOM via
>>> JavaScript, so the data contained in `<script>` and inline `onX` handlers
>>> is something they could potentially hack their way to reporting via
>>> 'SecurityPolicyViolation' events (by reporting everything on the page, if
>>> nothing else).
>>>
>>> 2. It's not reasonable to provide developers with details of third-party
>>> script, unless that external script has opted into sharing details via
>>> CORS. Though Firefox correctly sanitizes script errors per spec, it doesn't
>>> appear to sanitize `script-sample`; that continues to worry me.
>>>
>>> There was less agreement on this third point:
>>>
>>> 3. I'm worried about sending the contents of inline script blocks to a
>>> third-party, as those blocks seem to me to be the most likely to contain
>>> sensitive information (`var email = 'mike@mikewest.org';`, `var ssn =
>>> '123-45-6789';`, etc). I'm worried about this for two reasons: a) it might
>>> enable abuse, especially in the context of embedded enforcement, and b)
>>> reports seem by their nature to be delivered to logging servers which
>>> probably don't have the same kinds of data retention policies (or
>>> understanding of data sensitivity) as other systems.
>>>
>>> Artur, and others, suggested that this isn't much of a concern because
>>> a) the data is already available to script, and b) any sensitivity is
>>> mitigated by manipulating the data somehow (stripping to ~40 characters
>>> like Firefox does, stripping constants, etc.).
>>>
>>
>>> With those in mind, I'm thinking:
>>>
>>> 1. Firefox should reexamine its behavior for cross-origin script
>>> (perhaps it already has?).
>>>
>>> 2. Inline event handlers are probably fine to report; they're unlikely
>>> to contain sensitive data, are already available to JavaScript that crawls
>>> the DOM, etc.
>>>
>>> 3. Inline scripts worry me, but maybe they shouldn't worry me as much as
>>> they do. It's not clear to me what kind of metric we could use to make a
>>> real decision here, but anecdata from Google's deployment shows that the
>>> `script-sample` data gathered from Firefox hasn't been sensitive. It's
>>> unclear how well that applies to the rest of the web (or, indeed, to
>>> Google's sites that aren't yet using CSP). Perhaps Mozilla folks did some
>>> research when implementing this feature that justifies/explains the 40
>>> character limit as sufficiently safe?
>>>
>>
Thanks for the summary, Mike! It's a good overview of the issue, but I'd
like to expand on the reasoning for why including the prefix of an inline
script doesn't sound particularly scary to me.

Basically, in order for this to be a concern, all of the following
conditions need to be met:

1. The application has to use untrusted report collection infrastructure.
If that is the case, the application is already leaking sensitive data from
page/referrer URLs to its collector. In fact, I'd be much more worried
about URLs than script prefixes, because URLs leak on *any* violation (not
just for script-src) and URLs frequently contain PII or
authorization/capability-bearing tokens, e.g. for password reset
functionality.

2. The application needs to have a script which includes sensitive user
data somewhere in the first N characters. FWIW in our small-scale analysis
of a few hundred thousand reports we saw ~300 inline script samples sent by
Firefox (with N=40) and haven't found sensitive tokens in any of the
snippets.

3. The offending script needs to cause a CSP violation, i.e. not have a
valid nonce, meaning that the application is likely broken if the policy is
in enforcing mode.

As a security engineer, I would consider #1 to be the real security
boundary -- a developer should use a CSP collector she trusts because
otherwise, even without script-sample, reports contain data that can
compromise the application. I can easily imagine scripts that meet
conditions #2 and #3, but at the same time we have not seen many examples
of such scripts so far, nor have people complained about the script-sample
data already being included by Firefox (AFAIK).

Overall, I don't see the gathering of script samples as qualitatively
different to the collection of URLs. However, if we are indeed particularly
worried about script snippets, we could make this opt-in and enable the
functionality only in the presence of a new keyword (report-uri /foo
'report-script-samples') and add warnings in the spec to explain the
pitfalls. This way even if I'm wrong about all of the above we would not
expose any data from existing applications.
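
To make the opt-in idea concrete, here is a minimal Python sketch of the
gating logic a user agent might apply. The keyword name and its placement on
report-uri come from the example above; `parse_policy`,
`script_sample_for_report`, and the 40-character cap (borrowed from Firefox's
current truncation) are assumptions for illustration, not spec text or any
browser's actual implementation.

```python
from typing import Dict, List

# Keyword as proposed in this message; hypothetical, not a shipped CSP keyword.
SAMPLE_KEYWORD = "'report-script-samples'"
# Assumed cap, mirroring the N=40 truncation Firefox applies today.
MAX_SAMPLE_LENGTH = 40


def parse_policy(policy: str) -> Dict[str, List[str]]:
    """Split a serialized policy into {directive-name: [values...]}."""
    directives: Dict[str, List[str]] = {}
    for part in policy.split(";"):
        tokens = part.split()
        if tokens:
            # setdefault keeps the first occurrence of a repeated directive.
            directives.setdefault(tokens[0].lower(), tokens[1:])
    return directives


def script_sample_for_report(policy: str, inline_source: str) -> str:
    """Return the sample to attach to a report, or "" if not opted in."""
    report_uri_values = parse_policy(policy).get("report-uri", [])
    if SAMPLE_KEYWORD not in report_uri_values:
        return ""  # existing policies expose nothing new
    return inline_source[:MAX_SAMPLE_LENGTH]
```

With this shape, a policy that only says `report-uri /foo` yields an empty
sample, so nothing changes for applications that have not opted in.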

For some background about why we're even talking about this: currently
violation reports are all but useless for both debugging and detection of
the exploitation of XSS due to the noise generated by browser extensions.
Popular applications get hundreds of thousands of reports per day even if
they don't have any markup which violates their policy. Gathering
information about the contents of inline scripts and event handlers causing
a violation would let developers distinguish between extension-injected
markup and legitimate parts of the application which are broken by CSP,
allowing reports to finally be useful to developers.
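
As a rough illustration of the triage this would enable, here is a Python
sketch of a collector-side check. It assumes reports arrive as JSON with a
`csp-report` object carrying a `script-sample` field, and that the sample is
exactly the first 40 characters of the blocked inline script; the function
names, the labels, and the idea of keeping a list of the application's own
inline scripts are hypothetical, not a description of our tooling.

```python
import json
from typing import Iterable, Set

SAMPLE_LENGTH = 40  # assumed to match the browser's truncation


def known_prefixes(inline_sources: Iterable[str]) -> Set[str]:
    """Prefixes of inline scripts the application itself is known to emit."""
    return {source[:SAMPLE_LENGTH] for source in inline_sources}


def classify_report(raw_report: str, prefixes: Set[str]) -> str:
    """Label a violation report by comparing its sample to known prefixes."""
    report = json.loads(raw_report).get("csp-report", {})
    sample = report.get("script-sample", "")
    if not sample:
        return "unknown"      # no sample: cannot tell the sources apart
    if sample in prefixes:
        return "application"  # our own markup is being blocked by the policy
    return "not-ours"         # extension-injected markup, or possibly XSS
```

Reports labelled "application" point at markup to fix before enforcing the
policy; "not-ours" reports are the ones to either discard as extension noise
or inspect as potential injection.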

Cheers,
-Artur
Received on Monday, 17 October 2016 23:04:14 UTC