Re: CSP script hashes from Mountie Lee on 2013-02-12 (public-webappsec@w3.org from February 2013)

From: Mountie Lee <mountie.lee@mw2.or.kr>
Date: Tue, 12 Feb 2013 16:52:02 +0900
To: "Hill, Brad" <bhill@paypal-inc.com>
Cc: Bryan McQuade <bmcquade@google.com>, Yoav Weiss <yoav@yoav.ws>, Eric Chen <eric.chen@sv.cmu.edu>, Nicholas Green <ngreen@twitter.com>, "public-webappsec@w3.org" <public-webappsec@w3.org>
Message-ID: <CAE-+aYKoD-D+uQaBS74MUXPLSJYdFgrRaOj5_XjEXJDrNJtTJQ@mail.gmail.com>
Hi.

Comments added inline.

On Tue, Feb 12, 2013 at 4:32 PM, Hill, Brad <bhill@paypal-inc.com> wrote:

> If we proceed down this path, I'm in favor of as little complexity here as
> we can possibly manage.  A few thoughts along those lines:
>
> Canonicalization and digesting is the real hard issue here, which is, by
> virtue of being inline, closely linked to parsing and construction of the
> DOM.
>
> This was a _world of pain_ for XML Signatures.   Before we decide to
> inline this into CSP, we should carefully consider if this is a reusable
> algorithm that ought to be defined in its own spec, or as part of HTML, and
> merely referenced by CSP.
>
> There are perhaps dozens of little things to consider depending on the
> approach taken:
>
> 1) Since HTML5 is the first version to define any kind of rigorous
> parsing, tokenizing and document construction rules,  I'd suggest
> restricting this directive to being only available for HTML5 resources.
>
> http://www.w3.org/TR/2011/WD-html5-20110113/parsing.html
> http://www.w3.org/TR/2011/WD-html5-20110113/tokenization.html
> http://www.w3.org/TR/2011/WD-html5-20110113/the-end.html
>
> Even with this, the error handling and section on scripts that modify the
> page as it is being parsed are non-normative.
>
> 2) I might go so far as to suggest (unless a user-agent implementer wants
> to talk me off the ledge) that it be restricted to UTF-8 encoded documents,
> to avoid the pain and suffering of trying to construct a stable digest
> value while dealing with encoding sniffing and multi-pass parsing and any
> attacks that might result from forcing the browser to adopt a different
> encoding with a content injection in the absence of an explicitly declared
> content-encoding.
>
>
Restricting this to UTF-8 will cause serious problems in CJK
(Chinese-Japanese-Korean) environments.
In these regions, non-Unicode encodings are widely used:
GB2312 and Big5 for China
Shift-JIS and EUC-JP for Japan
EUC-KR for Korea
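
To illustrate why the page encoding matters here: a digest is computed over
bytes, so the same script text gives a different digest under a different
encoding. A minimal Python sketch, assuming SHA-256 and a made-up script
string (both chosen only for illustration, not taken from any proposal in
this thread):

```python
import hashlib

# Hypothetical inline script containing Korean text (illustration only).
script_text = 'alert("안녕하세요");'

def digest_for(text: str, encoding: str) -> str:
    """Return the hex SHA-256 of the text as encoded in the given page encoding."""
    return hashlib.sha256(text.encode(encoding)).hexdigest()

utf8_digest = digest_for(script_text, "utf-8")
euckr_digest = digest_for(script_text, "euc_kr")

# Same characters, different bytes on the wire, therefore different digests.
print("UTF-8 :", utf8_digest)
print("EUC-KR:", euckr_digest)
print("equal :", utf8_digest == euckr_digest)
```

A script made only of ASCII characters would hash identically under both
encodings, so the difference shows up only when non-ASCII text is inlined.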



> 3) Would exempting new script elements or inline event handlers created by
> hash-whitelisted scripts, as suggested by Bryan, require changes to the
> existing unsafe-inline behavior?  Is this question relevant for
> script-nonce as well?  Would doing this safely require additional parser
> state being added to the HTML parsing algorithm?
>
> 4) Should dynamic creation of script elements that match the hash, e.g.
> with document.write(), be allowed or is the policy only evaluated on the
> first pass of the input stream preprocessor and new inline script nodes
> prohibited thereafter?
>
>
For dynamic creation of script elements,
the dynamic script is based on a static one.
I think focusing on static scripts is enough.


> 5) Do we attempt to operate on the bytes "received over the network" as
> they go through the tokenization process, or on a standard serialization
> after construction of the DOM is complete?
>
> 6) In either case, do we have reasonable expectations that the relevant
> parts of the HTML parsing algorithm will remain stable enough to make this
> a useful feature going forward?  If not, is it reasonable to expect that
> user agent maintainers will retain a forked, stable, copy of the parser to
> handle this use case as the main parser evolves?
>
> 7) What about attributes of the <script> tag itself?  (e.g. language,
> type, can change the semantics of the included text rather drastically)
>
>
This is a good idea.
Maybe:
<script hash-algorithm="SHA-256" hash="afalafahfdsfadkl..." ...>

It could also co-exist with nonce.
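
A rough sketch of how such attributes might be checked, in Python. The
attribute names hash-algorithm and hash come from the example above; the
base64 encoding of the hash value, the UTF-8 encoding of the script text,
and the helper name script_matches are assumptions for illustration only.

```python
import base64
import hashlib

# Assumed mapping from the hash-algorithm attribute value to a hash function.
ALGORITHMS = {
    "SHA-256": hashlib.sha256,
    "SHA-384": hashlib.sha384,
    "SHA-512": hashlib.sha512,
}

def script_matches(script_text: str, hash_algorithm: str, declared_b64: str,
                   encoding: str = "utf-8") -> bool:
    """Check inline script content against the hash declared on its tag."""
    make_hash = ALGORITHMS.get(hash_algorithm)
    if make_hash is None:
        return False  # unknown algorithm: do not run the script
    digest = make_hash(script_text.encode(encoding)).digest()
    return base64.b64encode(digest).decode("ascii") == declared_b64

# Usage: compute the value a server would put in the attribute, then verify.
body = "doSomething();"
declared = base64.b64encode(hashlib.sha256(body.encode("utf-8")).digest()).decode("ascii")
print(script_matches(body, "SHA-256", declared))
print(script_matches(body + " ", "SHA-256", declared))
```

Note that an attribute on the tag alone would be under the control of anyone
who can inject markup, so this check only makes sense together with a value
the attacker cannot set, such as a header or a nonce.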



> I'm quite sure there is more...
>
> -Brad
>
> From: mountie@paygate.net [mailto:mountie@paygate.net] On Behalf Of
> Mountie Lee
> Sent: Monday, February 11, 2013 9:51 PM
> To: Bryan McQuade
> Cc: Yoav Weiss; Eric Chen; Nicholas Green; public-webappsec@w3.org
> Subject: Re: CSP script hashes
>
> Hi.
> +1 for refocusing on hashes.
> A nonce only addresses network-level protection.
>
> I have added my comment line by line.
> On Tue, Feb 12, 2013 at 10:51 AM, Bryan McQuade <bmcquade@google.com>
> wrote:
> I'd like to see us move forward with the hash proposal for allowing inline
> scripts and styles.
>
> I do notice that script nonce is in the 1.1 spec as an experimental
> feature:
> https://dvcs.w3.org/hg/content-security-policy/raw-file/tip/csp-specification.dev.html#script-nonce--experimental
> but in looking at this thread, there has been some concern about deploying
> nonces, especially on CDNs (some sites do use CDNs to serve their HTML with
> short-ish TTLs) and hashes are a better way to accommodate that use case.
>
> So, I'd like to propose that we refocus on hashes, and that support be
> added for both inline script and inline styles. The open questions seem to
> be:
>
> How should the hashes be expressed in the CSP header?
>
> I think a combination of hash data and object ID is the best choice.
>
> * what hashing algorithm(s) do we want to support and do we want to allow
> servers to choose from a set of algorithms (specifying the chosen algorithm
> in the response header)?
>
> We need to consider the list of algorithms in the WebCrypto API (
> http://www.w3.org/TR/WebCryptoAPI/#sha).
> SHA-1, SHA-224, SHA-256, SHA-384 and SHA-512 are listed.
> Let the server choose one of them.
>
> * do we want to require base64-encoding of the digest or should it be up
> to the server to choose the encoding? I can't think of any good reason to
> make the encoding scheme configurable so I'd propose always using base64.
>
> The hash value will differ with the page encoding, even for the same
> scripts.
> I prefer letting the server choose the encoding.
>
> * Should the server be allowed to choose how much of the digest to use
> depending on the security requirements of the response? The SDCH protocol,
> for example, uses a partial SHA256 as its identifier: "In communications
> between user agent and server, a dictionary is identified by the first 96
> bits of the SHA-256 digest [SHA256] of a dictionary's metadata and payload"
> (
> http://www.blogs.zeenor.com/wp-content/uploads/2011/01/Shared_Dictionary_Compression_over_HTTP.pdf
> ).
>
> My proposal for the format is: "style-hash sha1:<hash>[ sha256:<hash>];
> script-hash sha256:<hash>[ additional hashes]" where the hash function used
> on the server is followed by a colon, and the hash can be variable length
> up to the full length of the base64-encoded digest. If the hash is not as
> long as the digest, then the transmitted hash should be compared against
> the first N bytes of the computed digest on the client side. This allows
> servers to transmit smaller hashes when they feel that reducing bytes on
> network is more important than transmitting full hashes for maximum
> security. We could provide recommendations for minimum transmitted hash
> length, etc.
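
The proposed format and the truncated comparison can be sketched roughly as
follows in Python. The directive names style-hash and script-hash and the
"algorithm:hash" form come from the proposal above; comparing on the base64
string prefix, the 16-character minimum (96 bits, as in the SDCH example),
and the function names are assumptions for illustration.

```python
import base64
import hashlib

HASHERS = {"sha1": hashlib.sha1, "sha256": hashlib.sha256}

def parse_policy(header_value: str) -> dict:
    """Parse 'style-hash sha1:<h> sha256:<h>; script-hash sha256:<h>' into
    {directive: [(algorithm, transmitted_hash), ...]}."""
    policy = {}
    for directive in header_value.split(";"):
        tokens = directive.split()
        if not tokens:
            continue
        name, sources = tokens[0], tokens[1:]
        policy[name] = [tuple(s.split(":", 1)) for s in sources if ":" in s]
    return policy

def is_allowed(content: bytes, allowed: list, min_length: int = 16) -> bool:
    """True if a transmitted (possibly truncated) hash is a prefix of the
    base64 digest of the content. Hashes shorter than min_length are ignored."""
    for algorithm, transmitted in allowed:
        hasher = HASHERS.get(algorithm)
        if hasher is None or len(transmitted) < min_length:
            continue
        full = base64.b64encode(hasher(content).digest()).decode("ascii")
        if full.startswith(transmitted):
            return True
    return False

# Usage: the server transmits only the first 16 base64 characters.
script = b"a()"
full_hash = base64.b64encode(hashlib.sha256(script).digest()).decode("ascii")
policy = parse_policy("script-hash sha256:" + full_hash[:16])
print(is_allowed(script, policy["script-hash"]))
print(is_allowed(b"b()", policy["script-hash"]))
```

The minimum length matters: without it, an empty or very short transmitted
hash would match nearly any content.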
>
>
> What is the process for computing the hash on the client when validating
> the inline scripts and styles? Specifically, how do we identify the string
> of characters to compute the hash from in a non-ambiguous way? Is it
> sufficient to describe this as all content from the end of the opening
> <script>/<style> tag to the beginning of the closing tag? Is there
> something in the HTML5 or other spec that we can point at that clearly
> defines the algorithm for determining how to identify the string of
> characters that the hash should be computed from?
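
One way to picture the "end of the opening tag to the beginning of the
closing tag" idea, using Python's standard html.parser. This is only a sketch:
Python's parser is not the HTML5 parsing algorithm, and the class name, the
choice of SHA-256 with base64, and the UTF-8 default are assumptions.

```python
import base64
import hashlib
from html.parser import HTMLParser

class InlineBlockCollector(HTMLParser):
    """Collect the raw text of inline <script> and <style> elements."""

    def __init__(self):
        super().__init__()
        self.blocks = []       # list of (tag, text) in document order
        self._open_tag = None  # tag currently being collected, if any
        self._parts = []

    def handle_starttag(self, tag, attrs):
        # Scripts loaded via src are external, not inline, so skip them.
        if tag == "style" or (tag == "script" and "src" not in dict(attrs)):
            self._open_tag, self._parts = tag, []

    def handle_data(self, data):
        if self._open_tag is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == self._open_tag:
            self.blocks.append((tag, "".join(self._parts)))
            self._open_tag = None

def hash_inline_blocks(html: str, encoding: str = "utf-8"):
    """Return (tag, base64 SHA-256) for each inline script/style block."""
    collector = InlineBlockCollector()
    collector.feed(html)
    collector.close()
    return [
        (tag, base64.b64encode(hashlib.sha256(text.encode(encoding)).digest()).decode("ascii"))
        for tag, text in collector.blocks
    ]

page = '<p>hi</p><script>a();</script><style>p { color: red }</style><script src="x.js"></script>'
for tag, digest in hash_inline_blocks(page):
    print(tag, digest)
```

Whitespace inside the element is hashed as-is here; whether to normalise it
is one of the details a specification would have to settle.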
>
>
> Presumably, any side effects of the whitelisted scripts should be
> whitelisted as well. For instance, if a script that is whitelisted via a
> hash being included in response headers performs a document.write, the
> contents of that document.write should not have to also match a hash in the
> CSP headers.
>
>
> I'm interested to get your thoughts on these issues and to move forward
> with adding support for inline scripts/style hashes in CSP.
>
> Thanks,
> Bryan
>
>
> On Fri, Feb 1, 2013 at 3:31 PM, Yoav Weiss <yoav@yoav.ws> wrote:
> If I understand you correctly, the scenario that scares you about hashes
> is that WebApp would white list small & potentially harmful snippets such
> as `<script>WebApp.delete_account();</script>`.
> Assuming that such a snippet has been white listed, an XSS attack can add
> such a script, resulting in an account deletion.
>
> While it is theoretically possible that some Web application may add
> potentially harmful snippets, I'm not sure that adding such snippets makes
> any sense, regardless of the potential security vulnerability with hashes.
>
>
>
> On Fri, Feb 1, 2013 at 9:02 PM, Eric Chen <eric.chen@sv.cmu.edu> wrote:
> Consider that there is an injection vector on the page and the attacker
> can inject some content. For the case of script-nonce, the attacker cannot
> inject any inline scripts, but for the case of script hash the attacker can
> inject <script>a()</script> given this is a legitimately hashed script.
> Usually this is not a problem, but depending on what you are hashing, the
> attacker now has the ability to execute any hashed scripts in the page.
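
To make the concern concrete: a hash check looks only at content, so an
injected copy of a whitelisted script cannot be told apart from the original,
while a nonce check also requires a per-response secret. A minimal Python
sketch with hypothetical function and script names:

```python
import hashlib
import secrets

def sha256_hex(script: str) -> str:
    return hashlib.sha256(script.encode("utf-8")).hexdigest()

# Scripts the site author chose to whitelist (hypothetical).
whitelist = {sha256_hex("a()"), sha256_hex("initPage()")}

def hash_policy_allows(script: str) -> bool:
    """Hash policy: content must match a whitelisted digest."""
    return sha256_hex(script) in whitelist

def nonce_policy_allows(script_nonce, page_nonce: str) -> bool:
    """Nonce policy: the tag must carry this response's secret nonce."""
    return script_nonce is not None and script_nonce == page_nonce

page_nonce = secrets.token_urlsafe(16)  # secret, unknown to the attacker

# The attacker injects <script>a()</script> without knowing the nonce.
print(hash_policy_allows("a()"))              # injected copy matches the hash
print(nonce_policy_allows(None, page_nonce))  # no nonce, so it is blocked
print(hash_policy_allows("evil()"))           # attacker-written code is blocked
```

So under a hash policy the attacker is limited to replaying scripts the
author already whitelisted, which is harmless or harmful depending on what
those scripts do.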
>
> Other than that I actually like the idea of using hashes, this could solve
> a lot of deployment issues.
>
> On Fri, Feb 1, 2013 at 11:45 AM, Nicholas Green <ngreen@twitter.com>
> wrote:
> Can you clarify "call a script," since scripts are tags, not methods
> they are not invokable.  Do you mean inject a script tag into the
> page?  To simply execute the script the script's contents must match a
> whitelist, so the browser will only run the scripts we specify,
> unless there is some very tricky hash-collision or header injection.
> The contents of the script tag are invokable regardless of CSP
> strategy (nonces or hashes), unless the script is blocked entirely.
>
> The attacker has to already have javascript execution to invoke
> anything, at which point we've failed already, regardless of whether
> or not they choose to invoke javascript that we have written, or they
> choose to write it themselves.
>
> The goal of hash-whitelist is: Only execute the static scripts we have
> inlined and whitelisted.  I do not see the attack you are suggesting.
> FWIW I believe a hashed whitelist of scripts accomplishes the same
> thing you suggest accomplishing with <meta> tags here, but with higher
> assurance that there was no injection on the initial page load:
> http://lists.w3.org/Archives/Public/public-webappsec/2012Nov/0117.html
>
> Lastly I'm not suggesting we replace nonces, but rather add hashing as
> well.
>
> On Fri, Feb 1, 2013 at 10:40 AM, Eric Chen <eric.chen@sv.cmu.edu> wrote:
> >
> >> What we would protect against is the invocation of dangerous methods.
> >> So is the vector you are suggesting a DOM XSS calling code provided by
> >> a hashed inline script?  That seems feasible, but that is possible
> >> with nonced scripts as well I think.  Could you elaborate? I think I'm
> >> missing something important here.
> >
> > So in the script-nonce case, each inline script must have a valid nonce
> > attribute in order to execute. But in this case, the inline script
> > doesn't have to have a secret attached, it just has to be on the
> > "whitelist". This means that the attacker can freely call any hashed
> > script and, depending on what you are hashing, it can be quite dangerous.
> >
> >>
> >>
> >> On Fri, Feb 1, 2013 at 10:19 AM, Eric Chen <eric.chen@sv.cmu.edu> wrote:
> >> > One key difference between nonces and hashes is that hashes can't stop
> >> > return-to-libc-like attacks (e.g., attacker calling
> >> > twitter.delete_my_account() which could be hashed).
> >> >
> >> >
> >> > On Thu, Jan 31, 2013 at 5:32 PM, Nicholas Green <ngreen@twitter.com>
> >> > wrote:
> >> >>
> >> >> Hi folks,
> >> >>
> >> >>   There has been some discussion around hashes rather than nonces for
> >> >> <script>/<style>s recently, and I wanted to support that suggestion.
> >> >> My proposal would be we send down a header of script-hashes <hash>
> >> >> <hash> ..., that specifies which scripts can run on a page.  This is,
> >> >> I think, what ISSUE-36 proposes.
> >> >>
> >> >>   The reason this is appealing to us is that the only real blockers
> >> >> that we have encountered while implementing CSP headers that restrict
> >> >> inline scripts and styles are:
> >> >>
> >> >> 1) Scripts that must be run at a certain time during page load.
> >> >> 2) Styles that should be applied from initial page load.
> >> >> 3) Scripts and styles that are inlined for performance reasons (i.e.
> >> >> to avoid an extra round trip on high latency connections).
> >> >>
> >> >>   None of these require any dynamic content to be present in the
> >> >> scripts or styles, thus script hashes, which could either complement
> >> >> or work independently of script nonces, that allowed us to specify
> >> >> the hashes of scripts that we will allow to run inline would be
> >> >> sufficient.  Since the content is static these hashes can be
> >> >> calculated at the deploy time (light on the server), and don't need
> >> >> to be salted with any server side secrets, this should be relatively
> >> >> straightforward.  Of course some details (i.e. ignore whitespace?)
> >> >> would have to be specified to ensure interoperability.  I realize
> >> >> this will be non-trivial to implement for some applications, but think
> >> >> the benefit is worth it.  It certainly would be from our perspective.
> >> >>
> >> >>   One last point: Since assets are often served from CDNs generating
> >> >> random nonces per request may be tricky, but if we just need to
> >> >> change headers each time we change assets, I think we dodge the CDN
> >> >> difficulties as well as potential caching issues.
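
The deploy-time step described above could look roughly like this in Python.
The "script-hashes <hash> <hash> ..." shape follows the proposal in this
message; SHA-256, base64, UTF-8 and the function name are assumptions, since
the thread leaves those details open.

```python
import base64
import hashlib

def build_script_hashes_header(inline_scripts) -> str:
    """Build a 'script-hashes <hash> <hash> ...' value from static inline
    scripts. Runs once at deploy time; no per-request secret is involved."""
    digests = []
    for script in inline_scripts:
        digest = base64.b64encode(
            hashlib.sha256(script.encode("utf-8")).digest()
        ).decode("ascii")
        if digest not in digests:  # identical scripts need only one entry
            digests.append(digest)
    return "script-hashes " + " ".join(digests)

# Hypothetical static inline scripts collected from the page templates.
scripts = ["initPage();", "trackLoadTime();", "initPage();"]
print(build_script_hashes_header(scripts))
```

Because the output depends only on the script contents, the same header can
be cached and served from a CDN until the assets themselves change.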
> >> >>
> >> >>   Thoughts?  Implementation hurdles?  Other places this is already
> >> >> covered that I should've read?
> >> >>
> >> >> Thanks,
> >> >> Nick
> >> >>
> >> >>
> >> >
> >> >
> >> >
> >> > --
> >> > -Eric
> --
> Mountie Lee
>
> PayGate
> CTO, CISSP
> Tel : +82 2 2140 2700
> E-Mail : mountie@paygate.net
> =======================================
> PayGate Inc.
> THE STANDARD FOR ONLINE PAYMENT
> for Korea, Japan, China, and the World


-- 
Mountie Lee

PayGate
CTO, CISSP
Tel : +82 2 2140 2700
E-Mail : mountie@paygate.net

=======================================
PayGate Inc.
THE STANDARD FOR ONLINE PAYMENT
for Korea, Japan, China, and the World
Received on Tuesday, 12 February 2013 07:52:49 UTC