Re: Header size and policy delivery from Patrick Toomey on 2016-01-07 (public-webappsec@w3.org from January 2016)

From: Patrick Toomey <patrick.toomey@github.com>
Date: Thu, 07 Jan 2016 15:41:29 +0000
To: Martin Thomson <martin.thomson@gmail.com>
Cc: Jonathan Kingston <jonathan@jooped.co.uk>, WebAppSec WG <public-webappsec@w3.org>
Message-ID: <CAN4Q8dC5N1xyhuo4QRFkAVStDtWix8FzQE1ona-06QbSnGhpdA@mail.gmail.com>
I don't necessarily see the "baseline" CSP as being orthogonal to "source
lists that chew up all the bytes". Yes, we have a fair number of directives
(and growing each time directive support is added to browsers) and each of
those directives may contain a fair number of sources. However, the
baseline could encompass both aspects. For example, here is our current
policy:

default-src *; base-uri 'self'; connect-src 'self' live.github.com
wss://live.github.com uploads.github.com status.github.com api.github.com
www.google-analytics.com api.braintreegateway.com
client-analytics.braintreegateway.com github-cloud.s3.amazonaws.com;
font-src assets-cdn.github.com; form-action 'self' github.com
gist.github.com; frame-src 'self' render.githubusercontent.com
gist.github.com checkout.paypal.com; img-src 'self' data:
assets-cdn.github.com identicons.github.com www.google-analytics.com
checkout.paypal.com collector.githubapp.com *.githubusercontent.com
*.gravatar.com *.wp.com; media-src 'none'; object-src assets-cdn.github.com;
script-src assets-cdn.github.com; style-src 'self' 'unsafe-inline'
'unsafe-eval' assets-cdn.github.com

Of the above, here is the part that I'd anticipate we would leave mostly
static across pages:

default-src *; base-uri 'self'; font-src assets-cdn.github.com; script-src
assets-cdn.github.com; style-src 'self' 'unsafe-inline' 'unsafe-eval'
assets-cdn.github.com; img-src 'self' data: assets-cdn.github.com
identicons.github.com www.google-analytics.com checkout.paypal.com
collector.githubapp.com *.githubusercontent.com *.gravatar.com *.wp.com;
media-src 'none'; object-src assets-cdn.github.com; connect-src 'self'
live.github.com wss://live.github.com status.github.com api.github.com
www.google-analytics.com; frame-src 'self' render.githubusercontent.com
gist.github.com;

And here are the directives I foresee us wanting to customize in the future
based on the specific request:

connect-src
form-action

So, it would be nice to have a manifest that allows us to store/reference
the base policy (directives plus source lists) and have some means of
customizing/overriding the policy for specific pages. I was thinking of
something like this:

CSP manifest:

baseline-policy default-src *; base-uri 'self'; font-src
assets-cdn.github.com; script-src assets-cdn.github.com; style-src
csp-manifest-baseline-style-srcs; img-src csp-manifest-baseline-img-srcs;
connect-src csp-manifest-baseline-connect-srcs; frame-src
csp-manifest-baseline-frame-src

baseline-style-srcs 'self' 'unsafe-inline' 'unsafe-eval'
assets-cdn.github.com

baseline-img-srcs 'self' data: assets-cdn.github.com identicons.github.com
www.google-analytics.com checkout.paypal.com collector.githubapp.com
*.githubusercontent.com *.gravatar.com *.wp.com; media-src 'none'

baseline-connect-srcs 'self' live.github.com wss://live.github.com
status.github.com api.github.com www.google-analytics.com

baseline-frame-src 'self' render.githubusercontent.com gist.github.com;

Then, the CSP header in a typical non-customized response would look
something like:

Content-Security-Policy: default-policy csp-manifest-baseline-policy

(Maybe some sort of hash makes sense too.)

And, let's say we want to customize a specific page to allow an additional
connect-src. We could do that by overriding connect-src specifically:

Content-Security-Policy: default-policy csp-manifest-baseline-policy;
connect-src csp-manifest-baseline-connect-srcs uploads.github.com

That is what I had in my head in very broad strokes. For the bulk of our
requests that would take our CSP header size from 759 bytes to 40-50 bytes.
Assuming we customize form-action on each page, we would still only be
looking at something closer to 100 bytes. And, I anticipate that the
majority of CSP additions would be added to our baseline policy. So, the
growth of the actual header should be nominal.
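
To make the override idea concrete, here is a minimal Python sketch of how a
client might resolve such a header against a cached manifest. Nothing here is
specified anywhere: the "csp-manifest-" prefix and the "default-policy"
directive are just the names used in this message, the manifest is abbreviated
from the example above, and the merge rule (a directive in the header replaces
the baseline directive of the same name) is an assumption.

```python
PREFIX = "csp-manifest-"

# Hypothetical manifest, abbreviated from the example in this message.
MANIFEST = {
    "baseline-policy": (
        "default-src *; base-uri 'self'; "
        "connect-src csp-manifest-baseline-connect-srcs"
    ),
    "baseline-connect-srcs": (
        "'self' live.github.com wss://live.github.com "
        "status.github.com api.github.com www.google-analytics.com"
    ),
}


def parse_directives(policy):
    """Split 'name v1 v2; name2 v3' into {name: [values]}."""
    directives = {}
    for part in policy.split(";"):
        tokens = part.split()
        if tokens:
            directives[tokens[0]] = tokens[1:]
    return directives


def expand_sources(tokens, manifest):
    """Replace each csp-manifest-<label> token with that label's sources."""
    out = []
    for token in tokens:
        if token.startswith(PREFIX):
            out.extend(manifest[token[len(PREFIX):]].split())
        else:
            out.append(token)
    return out


def resolve(header_value, manifest):
    """Expand a short header value into a full {directive: [sources]} map."""
    header = parse_directives(header_value)
    base_ref = header.pop("default-policy", None)
    merged = {}
    if base_ref:
        # Assumed form: "default-policy csp-manifest-<label>".
        merged = parse_directives(manifest[base_ref[0][len(PREFIX):]])
    # Assumed merge rule: header directives override the baseline.
    merged.update(header)
    return {name: expand_sources(vals, manifest) for name, vals in merged.items()}


short = "default-policy csp-manifest-baseline-policy"
plain = resolve(short, MANIFEST)
custom = resolve(
    short + "; connect-src csp-manifest-baseline-connect-srcs uploads.github.com",
    MANIFEST,
)
print(len(short.encode("ascii")), "bytes on the wire for the common case")
print(custom["connect-src"])
```

With the full manifest from above in place of the abbreviated one, the
non-customized header value stays the same 43 bytes, and the customized
response only pays for the one directive it overrides.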



On Thu, Jan 7, 2016 at 1:21 AM Martin Thomson <martin.thomson@gmail.com>
wrote:

> That might work.  How much do you think that you would benefit from a
> "baseline" CSP policy in that document?  That is, rules that were
> universal, or is it source lists that chew up all the bytes?
>
> On 7 January 2016 at 16:08, Patrick Toomey <patrick.toomey@github.com>
> wrote:
> > We have started customizing our policy per endpoint and have plans to do
> so
> > even more in the future. It feels like "CSP as a resource" would be a bit
> > trickier if one customized their policy per response (maybe I missed
> someone
> > already addressing this concern). If I look at our CSP policy (or the
> > Twitter one someone showed in the original thread), the bulk of the size
> is
> > taken up with various source lists. What if the cacheable CSP resource
> was
> > mostly used to provide a place to collect/label/cache sets of these
> values.
> > For example, rather than having to send down something like "connect-src
> > 'self' foo.com bar.com foobar.com" on each response, it could be a
> reference
> > to a "source set" from the cacheable resource. So, the policy would be
> more
> > like "connect-src csp-manifest-my-connect-srcs", where "my-connect-srcs"
> > would be a labeled set of sources from the cached CSP resource. I guess
> > there is inevitably a point where a sufficient number of CSP directives
> > overwhelms the header, but maybe there is a way to handle that too. I
> > haven't thought about it much, but maybe one could also use the CSP
> resource
> > to collect/label/cache collections of commonly used directives too. Even
> > though we customize our policy per response, the majority of the policy
> > stays the same. So, there could be a reference in the CSP header response
> > that pulls in a collection of directives that you intend to have on any
> > page...something like "base-policy csp-manifest-my-base-policy", where
> > "my-base-policy" would have the parts of your CSP policy that don't
> really
> > change across the site.
> > On Wed, Jan 6, 2016 at 9:31 PM Martin Thomson <martin.thomson@gmail.com>
> > wrote:
> >>
> >> A CSP resource sounds appealing, but I'm not sure about the latency
> >> situation: are people OK with the notion that this is a separate
> >> fetch?  We could use HTTP/2 server push to address the latency
> >> problem.
> >>
> >> On 7 January 2016 at 13:48, Jonathan Kingston <jonathan@jooped.co.uk>
> >> wrote:
> >> > Creating a new thread for discussion of a solution to header bloat size
> >> > if at
> >> > all possible.
> >> >
> >> > Also relevant is:
> >> >
> https://lists.w3.org/Archives/Public/public-webappsec/2015Mar/0148.html
> >> >
> >> > Taken out from the discussion in: [CSP] "sri" source expression to
> >> > enforce
> >> > SRI
> >> >
> >> > On Tue, Jan 5, 2016 at 1:59 AM Nottingham, Mark <mnotting@akamai.com>
> >> > wrote:
> >> >>
> >> >> Catching up after holidays -- I've been wanting to talk about this.
> >> >>
> >> >> In HTTP/2, the default of SETTINGS_HEADER_TABLE_SIZE is 4k.
> >> >>
> >> >> From what I've seen, Chrome and Firefox both stick with the default.
> >> >>
> >> >> While 4k of header compression context can help performance
> >> >> considerably,
> >> >> it's important to understand that HPACK's compression scheme is
> >> >> coarse-grained, so when the encoder is faced with a large header, it
> >> >> has to
> >> >> choose between putting it into the dynamic table -- thereby denying
> use
> >> >> of
> >> >> that space to other headers -- or repeatedly putting it out onto the
> >> >> wire.
> >> >>
> >> >> For example, Twitter's response headers already get close to this
> >> >> limit,
> >> >> mostly thanks to CSP:
> >> >> https://redbot.org/?id=w5yLyD
> >> >>
> >> >> Their server has to choose between putting that ~3K CSP header into
> the
> >> >> dynamic table, leaving them only about 1k to play with for other
> >> >> headers per
> >> >> connection, or leave it out, and send it verbatim on EVERY response.
> >> >> They'll
> >> >> get small benefit from static Huffman coding (which reduces the
> numbers
> >> >> above a bit), but that's it.
> >> >>
> >> >> If a single header value exceeds SETTINGS_HEADER_TABLE_SIZE, it can't
> >> >> be
> >> >> encoded by reference, and the sender has no choice but to emit it on
> >> >> every
> >> >> message.
> >> >>
> >> >> Things get even nastier if there are several large versions of CSP
> on a
> >> >> single connection.
> >> >>
> >> >> Clients could start advertising a larger SETTINGS_HEADER_TABLE_SIZE,
> >> >> but
> >> >> that means a larger state commitment (both client-side and
> server-side,
> >> >> where it can hurt a lot more, offers more DoS exposure, etc.).
> >> >>
> >> >> Given that we're already seeing popular sites brush up against this,
> >> >> PLEASE don't assume that HTTP/2 == free compression, and that we can
> >> >> continue to merrily add headers.
> >> >>
> >> >> Also - when a header is both large and monolithic like CSP (i.e., it
> >> >> doesn't allow multiple values to be combined into a comma-separated
> >> >> value),
> >> >> it makes it much harder to optimise for compression, because of
> HPACK's
> >> >> granularity (again). I realise that there are security motivations
> >> >> behind
> >> >> this for CSP, but I wonder if the cost is justified (because once
> >> >> somebody
> >> >> can append headers, there's a lot of other damage they can do).
> >> >>
> >> >> Cheers,
> >> >
> >> >
> >> > On Tue, Jan 5, 2016 at 11:29 AM Mike O'Neill
> >> > <michael.oneill@baycloud.com>
> >> > wrote:
> >> >>
> >> >> I don’t know if this has already been talked about, but maybe long
> >> >> headers
> >> >> like CSP could be put in a well-known resource. It would cost
> >> >> another
> >> >> roundtrip but save bandwidth in the end  because the resource would
> be
> >> >> cached. The CSP header would only need to contain a hash of the
> >> >> resource
> >> >> to
> >> >> confirm
> >> >>
> >> >
> >> > On Tue, Jan 5, 2016 at 11:52 AM Jonathan Kingston
> >> > <jonathan@jooped.co.uk>
> >> > wrote:
> >> >>
> >> >> Yup Mike I had suggested the use of SRI in the header and pointing to
> >> >> some
> >> >> form of manifest file.
> >> >>
> >> >> I think this addresses some of Mark's concerns about header size
> however
> >> >> creates other issues such as cache management and extra round trips.
> >> >>
> >> >> The advantage of the manifest also would allow separation of concerns
> >> >> between CSP and SRI within the policy.
> >> >>
> >> >
> >> > Kind regards
> >> > Jonathan
> >>
> >
>
Received on Thursday, 7 January 2016 15:42:07 UTC