ACTION-147 RFC script-hash proposal v2

Hello all,

tl;dr here are two proposals: one based on RFC 6920 hash-sources, and
one that is not. The same two encoding alternatives are provided in
both proposals.

Here is a second attempt at spec text for the CSP script-hash proposal
based on the original submission
http://lists.w3.org/Archives/Public/public-webappsec/2013Feb/0052.html.
While this is a diff, I'm almost positive it is not ready to merge :)

We have provided two alternatives based on whether or not the
hash-source should use RFC 6920. We asked a few people, and they
recommended against supporting truncation: where an attacker can
control part of the dynamic content, a truncated hash makes
brute-forcing a match easier. We could also take the approach that if
you're using dynamic JavaScript, you're out of luck...

We have also presented two options for hashing the content of the script tags.

1) Bytes on the wire

This was brought up on the call as an option.

2) Base64(sha256(UTF-8("content to hash"))) - this was the algorithm
used to generate the example script-hashes in the document.

Since all valid JavaScript code can be represented in UTF-8, we don't
think there is a significant loss in requiring that authors place
certain data elsewhere in the DOM. In fact, we think encouraging
authors to place all data outside of the inlined JavaScript and into
other DOM elements/external scripts is ideal.
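
For concreteness, here is a minimal Python sketch of option 2. The
function name script_hash is made up for illustration and is not part
of the proposed spec text; it simply applies
Base64(sha256(UTF-8(...))) to the text found between the script tags.

```python
import base64
import hashlib


def script_hash(content: str) -> str:
    """Return Base64(SHA-256(UTF-8(content))) as an ASCII string.

    `content` is assumed to be exactly the text between the opening
    and closing script tags, with no normalization applied.
    """
    digest = hashlib.sha256(content.encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


if __name__ == "__main__":
    # Hypothetical inline script body, used only as sample input.
    print(script_hash("alert('Hello, world.');"))
```

Note that any change to the content, including whitespace, produces a
different hash, which is why the bytes-to-be-hashed need to be pinned
down precisely.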

I submit these patches fully expecting to have to make many changes :)

I will try, likely unsuccessfully, to justify some of our decisions
based on the previous concerns:

> Rather than have a script-hash directive, I would suggest using the RFC
> 6920 syntax (http://tools.ietf.org/html/rfc6920) as a source expression is
> a good fit with the current pattern of using nonces as such.

Moved to a source-expression, with proposals based on RFC 6920 as well
as a "simplified" version.
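
As a rough illustration of the RFC 6920 flavor, the sketch below
builds an ni: URI for a piece of script content. The function name
ni_hash_source is hypothetical, and exactly how the resulting value
would be written inside a CSP source list is defined by the proposal
text, not by this snippet. The only things taken from RFC 6920 are the
"ni:///sha-256;" form with an empty authority and the base64url
encoding without "=" padding.

```python
import base64
import hashlib


def ni_hash_source(content: str) -> str:
    """Build an RFC 6920 style ni: URI for the UTF-8 bytes of `content`."""
    digest = hashlib.sha256(content.encode("utf-8")).digest()
    # RFC 6920 uses base64url and omits the trailing "=" padding.
    value = base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")
    # Empty authority, full-length (untruncated) sha-256 digest.
    return "ni:///sha-256;" + value


if __name__ == "__main__":
    # Hypothetical inline script body, used only as sample input.
    print(ni_hash_source("alert('Hello, world.');"))
```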

> Specify which hash algorithms CSP 1.1 would require support for.

SHA-256 only, to keep it simple.

> Specify whether and to what extent truncation is allowed.

We recommend not allowing this.

> Specify what to do with the content-type attribute of ni: URIs if we allow this to be used for non-inline content... or should this be used to determine the type (css, js, vbs, etc..) of the inline resource?

Drop it like it's hot: the content-type attribute is not used.

> Specify an algorithm to exactly determine the bytes-to-be-hashed in a reliable and cross-browser manner.  I would suggest that this should be defined in terms of the HTML5 parsing algorithm, with some restrictions such as requiring any resource employing hash sources declare an explicit encoding. (but not just utf8)

Done, but just UTF-8 unless "bytes on the wire" ;)

> *shudder* Is canonicalization necessary?  I hope not.

None or UTF-8 only.

> Think about and determine what needs to be covered by the bytes-to-be-hashed:
> - should attributes of the script tag be included?  (e.g. whether it is javascript, vbscript, ruby or json?)

No. Only the content between the open/close tags is considered.
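
To show what "only the content between the open/close tags" means in
practice, here is a small sketch that pulls inline script bodies out
of a page and hashes them. It uses Python's html.parser as a stand-in,
which is an assumption for illustration only: it is not the HTML5
parsing algorithm a user agent would use, and the names
InlineScriptCollector and inline_script_hashes are made up.

```python
import base64
import hashlib
from html.parser import HTMLParser


class InlineScriptCollector(HTMLParser):
    """Collect the text between <script> and </script>, ignoring attributes."""

    def __init__(self):
        super().__init__()
        self._in_script = False
        self._chunks = []
        self.scripts = []

    def handle_starttag(self, tag, attrs):
        # Attributes (type, language, ...) are deliberately not recorded.
        if tag == "script":
            self._in_script = True
            self._chunks = []

    def handle_data(self, data):
        if self._in_script:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_script:
            self.scripts.append("".join(self._chunks))
            self._in_script = False


def inline_script_hashes(html: str) -> list:
    """Return Base64(SHA-256(UTF-8(body))) for each script element found.

    Simplification: script elements with a src attribute are not
    filtered out; they show up here with an empty body.
    """
    collector = InlineScriptCollector()
    collector.feed(html)
    collector.close()
    return [
        base64.b64encode(hashlib.sha256(body.encode("utf-8")).digest()).decode("ascii")
        for body in collector.scripts
    ]
```

With this approach, <script>alert(1)</script> and
<script type="text/javascript">alert(1)</script> yield the same hash,
since only the body is hashed.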

> Specify algorithm agility behavior
> - what to do if a policy specifies only SHA4 hashes and a user agent doesn't understand SHA4?  fail?  fallback to unsafe-inline?

If the user agent does not understand the hash algorithm, then the
hashed content does not match any hash-sources, so the script is
blocked.
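
The fail-closed behavior described above could look roughly like the
following. This is only a sketch: the "<algorithm>-<base64 value>"
string form, the SUPPORTED_ALGORITHMS table, and the function name
allows_inline_script are all assumptions made for the example, not
syntax taken from the proposal.

```python
import base64
import hashlib

# Assumption for this sketch: the user agent only knows SHA-256.
SUPPORTED_ALGORITHMS = {"sha256": hashlib.sha256}


def allows_inline_script(content: str, hash_sources: list) -> bool:
    """Return True only if some hash-source matches the script content.

    Each hash-source is assumed to look like "<algorithm>-<base64 digest>".
    Sources naming an unknown algorithm can never match, so a policy
    made up solely of such sources blocks the script.
    """
    data = content.encode("utf-8")
    for source in hash_sources:
        algorithm, separator, expected = source.partition("-")
        hasher = SUPPORTED_ALGORITHMS.get(algorithm)
        if not separator or hasher is None:
            # Malformed source or unknown algorithm: skip, never allow.
            continue
        actual = base64.b64encode(hasher(data).digest()).decode("ascii")
        if actual == expected:
            return True
    # No source matched: block. There is no fallback to unsafe-inline.
    return False
```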

> - possibly: if a policy specifies SHA1 and SHA3 hashes of the same
> content what should user agent behavior be?  allow all as valid?  only
> trust the strongest hashes it understands how to process in a given
> policy string?  In the composite policy?

Supporting only one hashing algorithm avoids this question.

If you have made it this far, I thank you for your time.

Received on Monday, 12 August 2013 20:38:58 UTC