RE: Header size and policy delivery from Mike O'Neill on 2016-01-07 (public-webappsec@w3.org from January 2016)

From: Mike O'Neill <michael.oneill@baycloud.com>
Date: Thu, 7 Jan 2016 15:11:18 -0000
To: "'Patrick Toomey'" <patrick.toomey@github.com>, "'Martin Thomson'" <martin.thomson@gmail.com>, "'Jonathan Kingston'" <jonathan@jooped.co.uk>
Cc: "'WebAppSec WG'" <public-webappsec@w3.org>
Message-ID: <04c301d1495d$a97e7c00$fc7b7400$@baycloud.com>
We have an application where we use a “default” CSP that users can further restrict, i.e. to exercise their privacy rights. The site has a set of scripts or images it knows are OK, but if an origin does not respect Do Not Track, the user can opt out of it to form a more restrictive set (through the dynamic creation of a supplementary CSP). It would save a lot of bandwidth if the default set were in a CSP manifest (which can be cached) and the header contained just the supplementary set.
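
As a rough illustration of the supplementary policy described above, the following Python sketch derives a narrower policy from a default set and a list of user opt-outs. The function name, the dictionary layout, and the example origins are all assumptions made for illustration; nothing here is part of any CSP specification.

```python
def supplementary_policy(default_policy: dict[str, list[str]],
                         opted_out: set[str]) -> str:
    """Build a supplementary CSP that drops the origins a user opted out of.

    Only directives that actually lose a source are emitted; the others are
    assumed to be covered by the cached default policy.
    """
    directives = []
    for name, sources in default_policy.items():
        kept = [source for source in sources if source not in opted_out]
        if len(kept) == len(sources):
            continue  # nothing was removed, so the default already suffices
        # An empty source list must be written as 'none' to block everything.
        value = " ".join(kept) if kept else "'none'"
        directives.append(f"{name} {value}")
    return "; ".join(directives)


# Hypothetical default policy and opt-out list.
default_policy = {
    "script-src": ["'self'", "cdn.example", "tracker.example"],
    "img-src": ["'self'", "tracker.example"],
    "font-src": ["'self'"],
}
header_value = supplementary_policy(default_policy, {"tracker.example"})
```

Because a browser enforces every policy it receives, sending this narrower policy next to the default one can only restrict what loads, never widen it.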

 

A somewhat similar situation arises in the Do Not Track Tracking Status Resource, a JSON file that has a same-party property (an array of strings). In many situations the set of domains a company controls is very large, so it would help if it didn’t have to be delivered every time; you would just refer to the collection held elsewhere (with a hash to confirm it).
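
A minimal sketch of the "hash to confirm it" idea, assuming the reference carries an SRI-style `sha256-<base64>` digest of the cached collection. The digest format, the function names, and the sample domain list are assumptions for illustration, not something any specification defines for this purpose.

```python
import base64
import hashlib
import hmac


def integrity_for(body: bytes) -> str:
    """Return an SRI-style digest string for the given resource body."""
    digest = hashlib.sha256(body).digest()
    return "sha256-" + base64.b64encode(digest).decode("ascii")


def matches_integrity(body: bytes, integrity: str) -> bool:
    """Check a cached resource body against the digest sent alongside it."""
    expected = integrity_for(body).encode("utf-8")
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected, integrity.encode("utf-8"))


# Hypothetical cached same-party collection and the digest that refers to it.
cached_collection = b'["example.com", "example.net"]'
advertised = integrity_for(cached_collection)
ok = matches_integrity(cached_collection, advertised)
tampered = matches_integrity(b'["evil.example"]', advertised)
```

A client that gets a mismatch would have to refetch the collection or refuse to use it, which is where the cache-management questions raised later in the thread come in.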

From: Patrick Toomey [mailto:patrick.toomey@github.com] 
Sent: 07 January 2016 05:08
To: Martin Thomson <martin.thomson@gmail.com>; Jonathan Kingston <jonathan@jooped.co.uk>
Cc: WebAppSec WG <public-webappsec@w3.org>
Subject: Re: Header size and policy delivery

 

We have started customizing our policy per endpoint and have plans to do so even more in the future. It feels like "CSP as a resource" would be a bit trickier if one customized their policy per response (maybe I missed someone already addressing this concern). If I look at our CSP policy (or the Twitter one someone showed in the original thread), the bulk of the size is taken up with various source lists. What if the cacheable CSP resource was mostly used to provide a place to collect/label/cache sets of these values? For example, rather than having to send down something like "connect-src 'self' foo.com bar.com foobar.com" on each response, it could be a reference to a "source set" from the cacheable resource. So, the policy would be more like "connect-src csp-manifest-my-connect-srcs", where "my-connect-srcs" would be a labeled set of sources from the cached CSP resource. I guess there is inevitably a point where a sufficient number of CSP directives overwhelms the header, but maybe there is a way to handle that too. I haven't thought about it much, but maybe one could also use the CSP resource to collect/label/cache collections of commonly used directives. Even though we customize our policy per response, the majority of the policy stays the same. So, there could be a reference in the CSP header response that pulls in a collection of directives that you intend to have on any page...something like "base-policy csp-manifest-my-base-policy", where "my-base-policy" would have the parts of your CSP policy that don't really change across the site.
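
To make the labeled-set idea concrete, here is a hedged Python sketch of how a client might expand such references. The manifest layout (`source_sets`, `base_policies`) and the rule that a per-response directive overrides the same directive from the base policy are assumptions made purely for illustration; the proposal above does not define either.

```python
PREFIX = "csp-manifest-"


def expand_policy(header: str, manifest: dict) -> str:
    """Expand csp-manifest-<label> references using a cached manifest.

    An unknown label raises KeyError, so a bad reference fails loudly rather
    than silently producing a weaker policy.
    """
    source_sets = manifest.get("source_sets", {})
    base_policies = manifest.get("base_policies", {})
    directives: dict[str, list[str]] = {}
    explicit: set[str] = set()  # directive names spelled out in the header

    for part in header.split(";"):
        tokens = part.split()
        if not tokens:
            continue
        name, values = tokens[0], tokens[1:]
        if name == "base-policy":
            for ref in values:
                if not ref.startswith(PREFIX):
                    raise ValueError(f"not a manifest reference: {ref}")
                base = base_policies[ref[len(PREFIX):]]
                for base_name, base_values in base.items():
                    # Assumption: per-response directives take precedence.
                    if base_name not in explicit:
                        directives.setdefault(base_name, list(base_values))
            continue
        expanded: list[str] = []
        for value in values:
            if value.startswith(PREFIX):
                expanded.extend(source_sets[value[len(PREFIX):]])
            else:
                expanded.append(value)
        directives[name] = expanded
        explicit.add(name)

    return "; ".join(" ".join([n, *v]) for n, v in directives.items())


# Hypothetical cached manifest and a short per-response header.
manifest = {
    "source_sets": {
        "my-connect-srcs": ["'self'", "foo.com", "bar.com", "foobar.com"],
    },
    "base_policies": {
        "my-base-policy": {"default-src": ["'none'"], "img-src": ["'self'"]},
    },
}
short_header = ("base-policy csp-manifest-my-base-policy; "
                "connect-src csp-manifest-my-connect-srcs")
full_policy = expand_policy(short_header, manifest)
```

The per-response header stays a few dozen bytes while the long source lists live in the cacheable resource, which is the trade the proposal is after.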

On Wed, Jan 6, 2016 at 9:31 PM Martin Thomson <martin.thomson@gmail.com> wrote:

A CSP resource sounds appealing, but I'm not sure about the latency
situation: are people OK with the notion that this is a separate
fetch?  We could use HTTP/2 server push to address the latency
problem.

On 7 January 2016 at 13:48, Jonathan Kingston <jonathan@jooped.co.uk> wrote:
> Creating a new thread to discuss a solution to header bloat, if at
> all possible.
>
> Also relevant is:
> https://lists.w3.org/Archives/Public/public-webappsec/2015Mar/0148.html
>
> Taken out from the discussion in: [CSP] "sri" source expression to enforce
> SRI
>
> On Tue, Jan 5, 2016 at 1:59 AM Nottingham, Mark <mnotting@akamai.com> wrote:
>>
>> Catching up after holidays -- I've been wanting to talk about this.
>>
>> In HTTP/2, the default of SETTINGS_HEADER_TABLE_SIZE is 4k.
>>
>> From what I've seen, Chrome and Firefox both stick with the default.
>>
>> While 4k of header compression context can help performance considerably,
>> it's important to understand that HPACK's compression scheme is
>> coarse-grained, so when the encoder is faced with a large header, it has to
>> choose between putting it into the dynamic table -- thereby denying use of
>> that space to other headers -- or repeatedly putting it out onto the wire.
>>
>> For example, Twitter's response headers already get close to this limit,
>> mostly thanks to CSP:
>> https://redbot.org/?id=w5yLyD
>>
>> Their server has to choose between putting that ~3K CSP header into the
>> dynamic table, leaving them only about 1k to play with for other headers per
>> connection, or leave it out, and send it verbatim on EVERY response. They'll
>> get small benefit from static Huffman coding (which reduces the numbers
>> above a bit), but that's it.
>>
>> If a single header value exceeds SETTINGS_HEADER_TABLE_SIZE, it can't be
>> encoded by reference, and the sender has no choice but to emit it on every
>> message.
>>
>> Things get even nastier if there are several large versions of CSP on a
>> single connection.
>>
>> Clients could start advertising a larger SETTINGS_HEADER_TABLE_SIZE, but
>> that means a larger state commitment (both client-side and server-side,
>> where it can hurt a lot more, offers more DoS exposure, etc.).
>>
>> Given that we're already seeing popular sites brush up against this,
>> PLEASE don't assume that HTTP/2 == free compression, and that we can
>> continue to merrily add headers.
>>
>> Also - when a header is both large and monolithic like CSP (i.e., it
>> doesn't allow multiple values to be combined into a comma-separated value),
>> it makes it much harder to optimise for compression, because of HPACK's
>> granularity (again). I realise that there are security motivations behind
>> this for CSP, but I wonder if the cost is justified (because once somebody
>> can append headers, there's a lot of other damage they can do).
>>
>> Cheers,
>
>
> On Tue, Jan 5, 2016 at 11:29 AM Mike O'Neill <michael.oneill@baycloud.com>
> wrote:
>>
>> I don’t know if this has already been talked about, but maybe long headers
>> like CSP could be put in a well-known resource. It would cost another
>> roundtrip but save bandwidth in the end because the resource would be
>> cached. The CSP header would only need to contain a hash of the resource
>> to confirm
>>
>
> On Tue, Jan 5, 2016 at 11:52 AM Jonathan Kingston <jonathan@jooped.co.uk>
> wrote:
>>
>> Yup Mike, I had suggested the use of SRI in the header, pointing to some
>> form of manifest file.
>>
>> I think this addresses some of Mark's concerns about header size, but it
>> creates other issues such as cache management and extra round trips.
>>
>> The manifest would also have the advantage of allowing separation of
>> concerns between CSP and SRI within the policy.
>>
>
> Kind regards
> Jonathan
Received on Thursday, 7 January 2016 15:12:08 UTC