Re: CSP script hashes from Bryan McQuade on 2013-02-12 (public-webappsec@w3.org from February 2013)

From: Bryan McQuade <bmcquade@google.com>
Date: Tue, 12 Feb 2013 13:48:42 -0500
To: Jacob Hoffman-Andrews <jsha@twitter.com>
Cc: Yoav Weiss <yoav@yoav.ws>, Eric Chen <eric.chen@sv.cmu.edu>, Nicholas Green <ngreen@twitter.com>, "public-webappsec@w3.org" <public-webappsec@w3.org>
Message-ID: <CADLGQyD1DFXAKDfD3qKFxtt_3K0bJ630DeijgbyGbp68Ds9Rnw@mail.gmail.com>

I definitely like the goal of reducing the size on the wire, but I am
worried about how this interacts with the HTML5 parsing spec. Here's an
example that should make the concern clearer:

<html>
<head>
<script>some content here</script>
... something else here ...
</head>
<body>
<script>more script here</script>
</body>
</html>

Would the CSP hash for scripts be the hash of the concatenation of "some
content here" and "more script here", e.g.:

printf '%s' "some content heremore script here" | sha1sum

If so, then for the browser to validate the scripts, it would first parse
through the first script and compute its hash, e.g.

printf '%s' "some content here" | sha1sum

and this wouldn't match, so it'd need to continue parsing without executing
the first block. It would attempt to parse ahead until it hit the second
<script> block, at which point it'd compute the hash of the concatenation
of the two blocks:

printf '%s' "some content heremore script here" | sha1sum

and this would match so the browser could execute both script blocks at
this point.
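
A rough Python sketch of that walk-through, in case it helps. It is only an
illustration: the function name is made up, SHA-1 is used just because the
sha1sum examples above use it, and I'm assuming the digest covers exactly
the text between the script tags. A real browser would of course do this
inside its parser.

```python
import hashlib


def first_matching_block(blocks, allowed_hex):
    """Feed each inline block into one running digest.

    Returns the 1-based index of the block at which the running digest
    first equals allowed_hex, or None if it never matches (a violation).
    """
    running = hashlib.sha1()
    for index, block in enumerate(blocks, start=1):
        running.update(block.encode("utf-8"))
        # hashlib lets us read the digest so far and keep updating afterwards.
        if running.hexdigest() == allowed_hex:
            return index
    return None


blocks = ["some content here", "more script here"]
allowed = hashlib.sha1(b"some content heremore script here").hexdigest()

print(first_matching_block(blocks, allowed))      # 2
print(first_matching_block(blocks[:1], allowed))  # None
```

Nothing matches after the first block, so nothing could run until the second
block has been seen.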

I like this, but I don't think it plays well with the HTML5 spec. The HTML5
spec says that upon encountering a (non-async, non-deferred) script, the
parser itself must block until that script executes. The reason is that the
script can emit HTML through, e.g., document.write, and that emitted HTML
must be processed immediately after the point where the executing script
block closes. This can change the structure of the document, for example by
emitting unbalanced tags. So, as I understand it, it's not really possible
to parse beyond the first script block without executing it if we're
following the HTML5 spec.
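
To make the document.write concern concrete, here is a toy Python sketch. It
is not an HTML parser: the two regexes, the helper names, and the "emit()"
script (standing in for a script that calls document.write('<!--')) are all
made up for illustration. It only shows that the set of script blocks found
by looking ahead can differ from the set that exists once the first script
has run.

```python
import re

SCRIPT_RE = re.compile(r"<script>(.*?)</script>", re.DOTALL)
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)


def inline_scripts(html):
    """Return inline script bodies, ignoring anything inside HTML comments."""
    return SCRIPT_RE.findall(COMMENT_RE.sub("", html))


def after_first_script_runs(html, written):
    """Splice the script's emitted HTML in right after the first </script>."""
    end = html.index("</script>") + len("</script>")
    return html[:end] + written + html[end:]


page = (
    "<html><head><script>emit()</script></head>"
    "<body><script>more script here</script> --></body></html>"
)

# Looking ahead without executing anything: two script blocks are visible.
print(inline_scripts(page))  # ['emit()', 'more script here']

# Once the first script has emitted an unbalanced "<!--", the second block
# sits inside a comment and is no longer a script at all.
print(inline_scripts(after_first_script_runs(page, "<!--")))  # ['emit()']
```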

If there is a way to accommodate streaming hash evaluation using just a
single hash (as opposed to a hash per script block), that could be a good
fit for CSP. But if not, then I think we can assume that most pages will
include only a small number of inline scripts/styles (as you note, inline
scripts are useful for a few things, usually in the head of the document),
so a hash per script/style block is reasonable.
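
For comparison, a minimal Python sketch of the hash-per-block alternative.
The names are hypothetical and SHA-1 is again only borrowed from the sha1sum
examples. The point is that each block can be checked on its own, as soon as
its closing tag is reached, so the parser never has to look ahead.

```python
import hashlib


def sha1_hex(text):
    """Hex SHA-1 of one inline block's text."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def may_execute(block, allowed_hashes):
    """A block may run only if its own hash is in the policy's list."""
    return sha1_hex(block) in allowed_hashes


# The policy lists one hash per allowed inline block.
allowed_hashes = {sha1_hex("some content here"), sha1_hex("more script here")}

print(may_execute("some content here", allowed_hashes))  # True
print(may_execute("injected script", allowed_hashes))    # False
```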

On Tue, Feb 12, 2013 at 1:27 PM, Jacob Hoffman-Andrews <jsha@twitter.com> wrote:

>
>>> What about having a single inline-hash that is a digest of all allowed
>>> inline content in the document, including both styles and scripts? The
>>> browser would maintain a running digest as it encounters each style or
>>> script tag. Once the digest matches the allowed inline-hash the browser
>>> would execute the content immediately, or would report a violation upon
>>> reaching the end of the document without ever matching the hash.
>>>
>>
>> That would mean the browser cannot start executing inline scripts &
>> styles until the entire HTML has been downloaded (or at the very least, the
>> last inlined resource). Even for static HTMLs with multiple inlined
>> resources, this can result in a significant slowdown of the page load
>> without any benefit. (saving 30 bytes on the response headers doesn't seem
>> like a significant benefit)
>>
>
> That's why I said the browser should execute immediately once the digest
> matches the inline-hash. I believe that all uses for which inline script is
> important involve script at the top of the document. Also I believe that a
> browser won't start executing any script in a given tag until it reaches
> the end of the tag. So even if the chunk of JS that you include at the top
> of your page is very large, it won't execute any later than it normally
> would.
>

Received on Tuesday, 12 February 2013 18:49:10 UTC