Re: CSP script hashes from Jacob Hoffman-Andrews on 2013-02-12 (public-webappsec@w3.org from February 2013)

From: Jacob Hoffman-Andrews <jsha@twitter.com>
Date: Mon, 11 Feb 2013 22:35:20 -0800
To: Bryan McQuade <bmcquade@google.com>
Cc: Yoav Weiss <yoav@yoav.ws>, Eric Chen <eric.chen@sv.cmu.edu>, Nicholas Green <ngreen@twitter.com>, "public-webappsec@w3.org" <public-webappsec@w3.org>
Message-ID: <CADzQPXtrLuA=dKzU3x5CL3wYURYonbfqT6MisE1K+1=YDq_qKA@mail.gmail.com>

On Mon, Feb 11, 2013 at 5:51 PM, Bryan McQuade <bmcquade@google.com> wrote:

> How should the hashes be expressed in the CSP header?
>
> * what hashing algorithm(s) do we want to support and do we want to allow
> servers to choose from a set of algorithms (specifying the chosen algorithm
> in the response header)?
>

Allowing the server to choose is good future-proofing in case one of the
hash algorithms is broken later.

> * do we want to require base64-encoding of the digest or should it be up
> to the server to choose the encoding? I can't think of any good reason to
> make the encoding scheme configurable so I'd propose always using base64.
>

Agree.

>  * Should the server be allowed to choose how much of the digest to use
> depending on the security requirements of the response? The SDCH protocol,
> for example, uses a partial SHA256 as its identifier: "In communications
> between user agent and server, a dictionary is identified by the first 96
> bits of the SHA-256 digest [SHA256] of a dictionary's metadata and payload"
> (
> http://www.blogs.zeenor.com/wp-content/uploads/2011/01/Shared_Dictionary_Compression_over_HTTP.pdf
> ).
>

No. This gives implementers too much opportunity to completely break their
security, for not enough benefit. The full base64 digest of a sha1 hash
would be 28 characters (27 if we eliminate the automatic "=" at the end).
By contrast the header name "Content-Security-Policy" is 23. The spec is
not exactly parsimonious.
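
The lengths involved are easy to check directly. A minimal Python sketch,
assuming standard base64 with "=" padding; the helper name b64_digest and
the sample input are made up for illustration and are not part of any
proposal in this thread:

```python
import base64
import hashlib


def b64_digest(algorithm: str, content: bytes, strip_padding: bool = False) -> str:
    """Base64-encode the full digest of content (hypothetical helper)."""
    digest = hashlib.new(algorithm, content).digest()
    encoded = base64.b64encode(digest).decode("ascii")
    # Dropping the trailing "=" is the optional saving discussed above.
    return encoded.rstrip("=") if strip_padding else encoded


sample = b"alert('hi');"  # arbitrary sample input
for name in ("sha1", "sha256"):
    padded = len(b64_digest(name, sample))
    unpadded = len(b64_digest(name, sample, strip_padding=True))
    print(name, padded, unpadded)

# Length of the header name, for comparison.
print(len("Content-Security-Policy"))
```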

> My proposal for the format is: "style-hash sha1:<hash>[ sha256:<hash>];
> script-hash sha256:<hash>[ additional hashes]"
>
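
For concreteness, here is a rough Python sketch of how a server might emit
the format proposed above. The function names are hypothetical, and hashing
the UTF-8 encoding of the inline content is an assumption: the thread has
not settled which bytes get hashed.

```python
import base64
import hashlib


def hash_token(algorithm: str, content: str) -> str:
    """Return "<algorithm>:<base64 digest>" for one inline block."""
    # Assumption: the digest is taken over the UTF-8 bytes of the content.
    digest = hashlib.new(algorithm, content.encode("utf-8")).digest()
    return f"{algorithm}:{base64.b64encode(digest).decode('ascii')}"


def build_policy(styles, scripts, algorithm="sha256"):
    """Join style-hash and script-hash directives with "; "."""
    parts = []
    if styles:
        parts.append("style-hash " + " ".join(hash_token(algorithm, s) for s in styles))
    if scripts:
        parts.append("script-hash " + " ".join(hash_token(algorithm, s) for s in scripts))
    return "; ".join(parts)


print(build_policy(["body{margin:0}"], ["alert('hi');"]))
```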

What about having a single inline-hash that is a digest of all allowed
inline content in the document, including both styles and scripts? The
browser would maintain a running digest as it encounters each style or
script tag. Once the digest matches the allowed inline-hash the browser
would execute the content immediately, or would report a violation upon
reaching the end of the document without ever matching the hash.

This makes it harder to deploy pages that dynamically include content from
multiple sources, but keeps things simple and saves bytes.
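
A rough Python sketch of the running-digest idea, to make it concrete. The
function name find_match is made up, and two details are assumptions rather
than part of the proposal: blocks are fed to the digest as UTF-8 with no
separator between them, and the check happens after each complete block.

```python
import base64
import hashlib


def find_match(inline_blocks, allowed_b64, algorithm="sha256"):
    """Return how many blocks were consumed when the running digest
    matched allowed_b64, or None if the document ended without a match."""
    running = hashlib.new(algorithm)
    for count, block in enumerate(inline_blocks, start=1):
        # Assumption: blocks are concatenated with no delimiter.
        running.update(block.encode("utf-8"))
        # digest() reports the digest of everything fed so far and can be
        # called again after further update() calls.
        current = base64.b64encode(running.digest()).decode("ascii")
        if current == allowed_b64:
            return count  # the blocks seen so far may now run
    return None  # end of document: report a violation


blocks = ["body{margin:0}", "alert('hi');"]
allowed = base64.b64encode(
    hashlib.sha256("".join(blocks).encode("utf-8")).digest()
).decode("ascii")
print(find_match(blocks, allowed))
```

One thing the sketch makes visible: nothing can run until the final allowed
block has been seen, because the digest only matches at that point.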

> What is the process for computing the hash on the client when validating
> the inline scripts and styles? Specifically, how do we identify the string
> of characters to compute the hash from in a non-ambiguous way? Is it
> sufficient to describe this as all content from the end of the opening
> <script>/<style> tag to the beginning of the closing tag? Is there
> something in the HTML5 or other spec that we can point at that clearly
> defines the algorithm for determining how to identify the string of
> characters that the hash should be computed from?
>

Ideally we would hash across the raw bytes from the network before they
were decoded into characters, but I think that may turn out to be
challenging at the level of the browser that is parsing out script tags. I
don't have a great suggestion here.
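
To illustrate why the bytes-versus-characters choice matters, here is a
small Python sketch. The sample script and the helper name digest_for are
made up; the point is only that identical decoded characters can arrive as
different bytes depending on the page encoding, and then hash differently.

```python
import hashlib


def digest_for(text: str, encoding: str) -> str:
    """Hex SHA-256 of text as it would appear on the wire in encoding."""
    return hashlib.sha256(text.encode(encoding)).hexdigest()


script_text = "alert('caf\u00e9');"  # contains one non-ASCII character

# Both byte strings decode to the same characters...
assert script_text.encode("utf-8").decode("utf-8") == script_text
assert script_text.encode("latin-1").decode("latin-1") == script_text

# ...but the raw bytes differ, so the digests differ.
print(digest_for(script_text, "utf-8") == digest_for(script_text, "latin-1"))  # prints False
```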

> Presumably, any side effects of the whitelisted scripts should be
> whitelisted as well. For instance, if a script that is whitelisted via a
> hash being included in response headers performs a document.write, the
> contents of that document.write should not have to also match a hash in the
> CSP headers.
>

You're thinking specifically of when a whitelisted script does
document.write("<script>alert('hi')</script>")? This seems like a bad idea.
Out-of-line scripts aren't allowed to write inline scripts into the DOM,
right?

Received on Tuesday, 12 February 2013 09:30:32 UTC