Re: CSP script hashes

I'd like to see us move forward with the hash proposal for allowing inline
scripts and styles.

I do notice that script nonce is in the 1.1 spec as an experimental
feature:
https://dvcs.w3.org/hg/content-security-policy/raw-file/tip/csp-specification.dev.html#script-nonce--experimental
but in looking at this thread, there has been some concern about deploying
nonces, especially on CDNs (some sites do use CDNs to serve their HTML with
short-ish TTLs) and hashes are a better way to accommodate that use case.

So, I'd like to propose that we refocus on hashes, and that support be
added for both inline script and inline styles. The open questions seem to
be:

How should the hashes be expressed in the CSP header?
* What hashing algorithm(s) do we want to support and do we want to allow
servers to choose from a set of algorithms (specifying the chosen algorithm
in the response header)?
* Do we want to require base64-encoding of the digest or should it be up to
the server to choose the encoding? I can't think of any good reason to make
the encoding scheme configurable so I'd propose always using base64.
* Should the server be allowed to choose how much of the digest to use
depending on the security requirements of the response? The SDCH protocol,
for example, uses a partial SHA256 as its identifier: "In communications
between user agent and server, a dictionary is identified by the first 96
bits of the SHA-256 digest [SHA256] of a dictionary's metadata and payload"
(
http://www.blogs.zeenor.com/wp-content/uploads/2011/01/Shared_Dictionary_Compression_over_HTTP.pdf
).

My proposal for the format is: "style-hash sha1:<hash>[ sha256:<hash>];
script-hash sha256:<hash>[ additional hashes]" where the hash function used
on the server is followed by a colon, and the hash can be variable length
up to the full length of the base64-encoded digest. If the hash is shorter
than the full digest, then the transmitted hash should be compared against
the first N characters of the base64-encoded digest computed on the client
side (where N is the length of the transmitted hash). This allows
servers to transmit smaller hashes when they feel that reducing bytes on
network is more important than transmitting full hashes for maximum
security. We could provide recommendations for minimum transmitted hash
length, etc.
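
As a rough sketch of the comparison described above (Python, not a
definitive implementation): the helper names `encode_digest` and `matches`
are hypothetical, and both the UTF-8 encoding of the inline content and the
16-character minimum (96 bits, as in the SDCH example) are assumptions that
are not part of the proposal.

```python
import base64
import hashlib

# Algorithms a server could name before the colon (assumed set).
ALGORITHMS = {"sha1": hashlib.sha1, "sha256": hashlib.sha256}


def encode_digest(algo: str, content: str) -> str:
    """Return the full base64-encoded digest of the inline content.

    Assumption: the content is hashed as UTF-8 bytes.
    """
    digest = ALGORITHMS[algo](content.encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


def matches(token: str, content: str, min_length: int = 16) -> bool:
    """Check one "algo:hash" token against inline content.

    A truncated hash is accepted if it is a prefix of the full
    base64-encoded digest and is at least min_length characters long
    (16 base64 characters carry 96 bits; the minimum is an assumption).
    """
    algo, _, transmitted = token.partition(":")
    if algo not in ALGORITHMS or len(transmitted) < min_length:
        return False
    return encode_digest(algo, content).startswith(transmitted)
```

A server would emit something like "script-hash sha256:" followed by the
first N characters of `encode_digest("sha256", content)`, and the client
would call `matches` for each inline script against each listed token.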


What is the process for computing the hash on the client when validating
the inline scripts and styles? Specifically, how do we identify the string
of characters to compute the hash from in a non-ambiguous way? Is it
sufficient to describe this as all content from the end of the opening
<script>/<style> tag to the beginning of the closing tag? Is there
something in the HTML5 or other spec that we can point at that clearly
defines the algorithm for determining how to identify the string of
characters that the hash should be computed from?
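
To make the question concrete, here is a minimal sketch of one possible
reading of "all content from the end of the opening tag to the beginning of
the closing tag". It uses Python's standard `html.parser` module, which is
only an approximation of a browser's HTML5 tokenizer; the class and function
names are hypothetical, and skipping scripts that carry a `src` attribute is
my assumption about what counts as "inline".

```python
from html.parser import HTMLParser


class InlineContentCollector(HTMLParser):
    """Collect the raw text between <script>/<style> open and close tags."""

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.blocks = []      # list of (tag, content) pairs, in document order
        self._tag = None      # element currently being collected, if any
        self._chunks = []     # text pieces seen inside that element

    def handle_starttag(self, tag, attrs):
        if tag not in ("script", "style"):
            return
        # Assumption: a script with a src attribute is external, not inline.
        if tag == "script" and any(name == "src" for name, _ in attrs):
            return
        self._tag = tag
        self._chunks = []

    def handle_data(self, data):
        # The parser may deliver the content in several pieces.
        if self._tag is not None:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if self._tag is not None and tag == self._tag:
            self.blocks.append((tag, "".join(self._chunks)))
            self._tag = None


def extract_inline_blocks(html: str):
    """Return (tag, content) for each inline script/style in the markup."""
    collector = InlineContentCollector()
    collector.feed(html)
    collector.close()
    return collector.blocks
```

Each returned content string, whitespace included, would be the input to the
hash. Whether whitespace or character encoding should be normalized first is
exactly the kind of detail the spec would need to pin down.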


Presumably, any side effects of the whitelisted scripts should be
whitelisted as well. For instance, if a script that is whitelisted via a
hash being included in response headers performs a document.write, the
contents of that document.write should not have to also match a hash in the
CSP headers.


I'm interested to get your thoughts on these issues and to move forward
with adding support for inline scripts/style hashes in CSP.

Thanks,
Bryan



On Fri, Feb 1, 2013 at 3:31 PM, Yoav Weiss <yoav@yoav.ws> wrote:

> If I understand you correctly, the scenario that scares you about hashes
> is that WebApp would white list small & potentially harmful snippets such
> as `<script>WebApp.delete_account();</script>`.
> Assuming that such a snippet has been white listed, an XSS attack can add
> such a script, resulting in an account deletion.
>
> While it is theoretically possible that some Web application may add
> potentially harmful snippets, I'm not sure that adding such snippets makes
> any sense, regardless of the potential security vulnerability with hashes.
>
>
>
>
> On Fri, Feb 1, 2013 at 9:02 PM, Eric Chen <eric.chen@sv.cmu.edu> wrote:
>
>> Consider that there is an injection vector on the page and the attacker
>> can inject some content. For the case of script-nonce, the attacker cannot
>> inject any inline scripts, but for the case of script hash the attacker can
>> inject <script>a()</script> given this is a legitimately hashed script.
>> Usually this is not a problem, but depending on what you are hashing, the
>> attacker now has the ability to execute any hashed scripts in the page.
>>
>> Other than that I actually like the idea of using hashes, this could
>> solve a lot of deployment issues.
>>
>>
>> On Fri, Feb 1, 2013 at 11:45 AM, Nicholas Green <ngreen@twitter.com> wrote:
>>
>>> Can you clarify "call a script"? Since scripts are tags, not methods,
>>> they are not invokable.  Do you mean inject a script tag into the
>>> page?  To simply execute the script the script's contents must match a
>>> whitelist, so the browser will only run the scripts we specify,
>>> unless there is some very tricky hash-collision or header injection.
>>> The contents of the script tag are invokable regardless of CSP
>>> strategy (nonces or hashes), unless the script is blocked entirely.
>>>
>>> The attacker has to already have javascript execution to invoke
>>> anything, at which point we've failed already, regardless of whether
>>> or not they choose to invoke javascript that we have written, or they
>>> choose to write it themselves.
>>>
>>> The goal of hash-whitelist is: Only execute the static scripts we have
>>> inlined and whitelisted.  I do not see the attack you are suggesting.
>>> FWIW I believe a hashed whitelist of scripts accomplishes the same
>>> thing you suggest accomplishing with <meta> tags here, but with higher
>>> assurance that there was no injection on the initial page load:
>>> http://lists.w3.org/Archives/Public/public-webappsec/2012Nov/0117.html
>>>
>>> Lastly I'm not suggesting we replace nonces, but rather add hashing as
>>> well.
>>>
>>> On Fri, Feb 1, 2013 at 10:40 AM, Eric Chen <eric.chen@sv.cmu.edu> wrote:
>>> >
>>> >> What we would protect against is the invocation of dangerous methods.
>>> >> So is the vector you are suggesting a DOM XSS calling code provided by
>>> >> a hashed inline script?  That seems feasible, but that is possible
>>> >> with nonced scripts as well I think.  Could you elaborate? I think I'm
>>> >> missing something important here.
>>> >
>>> > So in the script-nonce case, each inline script must have a valid nonce
>>> > attribute in order to execute. But in this case, the inline script
>>> > doesn't have to have a secret attached, it just has to be on the
>>> > "whitelist". This means that the attacker can freely call any hashed
>>> > script and, depending on what you are hashing, it can be quite dangerous.
>>> >
>>> >>
>>> >>
>>> >> On Fri, Feb 1, 2013 at 10:19 AM, Eric Chen <eric.chen@sv.cmu.edu> wrote:
>>> >> > One key difference between nonces and hashes is that hashes can't stop
>>> >> > return-to-libc-like attacks (e.g., attacker calling
>>> >> > twitter.delete_my_account() which could be hashed).
>>> >> >
>>> >> >
>>> >> > On Thu, Jan 31, 2013 at 5:32 PM, Nicholas Green <ngreen@twitter.com>
>>> >> > wrote:
>>> >> >>
>>> >> >> Hi folks,
>>> >> >>
>>> >> >>   There has been some discussion around hashes rather than nonces for
>>> >> >> <script>/<style>s recently, and I wanted to support that suggestion.
>>> >> >> My proposal would be we send down a header of script-hashes <hash>
>>> >> >> <hash> ..., that specifies which scripts can run on a page.  This is,
>>> >> >> I think, what ISSUE-36 proposes.
>>> >> >>
>>> >> >>   The reason this is appealing to us is that the only real blockers
>>> >> >> that we have encountered while implementing CSP headers that restrict
>>> >> >> inline scripts and styles are:
>>> >> >>
>>> >> >> 1) Scripts that must be run at a certain time during page load.
>>> >> >> 2) Styles that should be applied from initial page load.
>>> >> >> 3) Scripts and styles that are inlined for performance reasons (i.e.
>>> >> >> to avoid an extra round trip on high latency connections).
>>> >> >>
>>> >> >>   None of these require any dynamic content to be present in the
>>> >> >> scripts or styles, so script hashes that allow us to specify which
>>> >> >> inline scripts may run would be sufficient; they could either
>>> >> >> complement or work independently of script nonces.  Since the content
>>> >> >> is static, these hashes can be calculated at deploy time (light on
>>> >> >> the server) and don't need to be salted with any server side secrets,
>>> >> >> so this should be relatively straightforward.  Of course some details
>>> >> >> (e.g. ignore whitespace?) would have to be specified to ensure
>>> >> >> interoperability.  I realize this will be non-trivial to implement
>>> >> >> for some applications, but think the benefit is worth it.  It
>>> >> >> certainly would be from our perspective.
>>> >> >>
>>> >> >>   One last point: Since assets are often served from CDNs, generating
>>> >> >> random nonces per request may be tricky, but if we just need to change
>>> >> >> headers each time we change assets, I think we dodge the CDN
>>> >> >> difficulties as well as potential caching issues.
>>> >> >>
>>> >> >>   Thoughts?  Implementation hurdles?  Other places this is already
>>> >> >> covered that I should've read?
>>> >> >>
>>> >> >> Thanks,
>>> >> >> Nick
>>> >> >>
>>> >> >>
>>> >> >
>>> >> >
>>> >> >
>>> >> > --
>>> >> > -Eric
>>> >
>>> >
>>> >
>>> >
>>> > --
>>> > -Eric
>>>
>>
>>
>>
>> --
>> -Eric
>>
>
>

Received on Tuesday, 12 February 2013 01:52:26 UTC