Re: policy-uri is slow from Aryeh Gregor on 2011-04-17 (public-web-security@w3.org from April 2011)

From: Aryeh Gregor <Simetrical+w3c@gmail.com>
Date: Sun, 17 Apr 2011 18:01:41 -0400
To: Mark Nottingham <mnot@mnot.net>
Cc: Adam Barth <w3c@adambarth.com>, public-web-security@w3.org
Message-ID: <BANLkTimCw2p8O0b7dU6mCNWgJ8P0ZGecng@mail.gmail.com>

On Fri, Apr 15, 2011 at 7:51 PM, Mark Nottingham <mnot@mnot.net> wrote:
> You seem to be saying that a site that's going through the effort of setting CSP headers for all of its resources will shy away from setting caching headers as well. Why?

Because the person setting up CSP isn't thinking about performance, or
maybe even has little to no idea what they're doing and is just
following a dodgy tutorial they found on some website.  It's not a
good idea to make assumptions about authors' competence, if it's
possible to design features that work as well and don't make such
assumptions.

> What does revalidation have to do with cache eviction?

If revalidation is required on every hit, caching is of limited
utility for small resources, because a 304 Not Modified response will
take about as much effort as resending the entire resource.

> Please re-read the thread; the idea is to use a well-known URI, which is indeed known ahead of time.

But you still don't know *if* you need to fetch it, unless you plan to
preemptively try fetching that URL for every site you visit -- which
makes no sense unless you expect the vast majority of sites to use
CSP.

>> But that sort of thing only makes sense if the resource is quite
>> large, several kilobytes at least.  If we expect the large majority of
>> CSP policies to be only a handful of bytes long, I really think
>> linking to a policy will hurt performance in nearly all cases, not
>> help it.
>
> ... and I think will help.

It will *always* hurt on a page view with a cold cache, to the tune of a
full round-trip.  And that's a *lot* of views.  This research by
Yahoo! from 2007 says that 40-60% of their users have an empty cache
at least sometimes, and about 20% of all their page views have an
empty cache: <http://www.yuiblog.com/blog/2007/01/04/performance-research-part-2/>
That's an extra round-trip on all of those page views.

By contrast, sending your CSP policy on every request will only
increase network transfer time appreciably if it happens to cause a
TCP window to fill up, so that the server has to wait for an
acknowledgement from the client before proceeding.  With current
efforts to increase default window sizes by a large margin, plus the
use of pipelining to try keeping large window sizes going, this is
going to be fairly uncommon even if the policy is a kilobyte in
length.  Surely not 25% of the time, which is what it would have to be
to outweigh a 20% empty cache rate.
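
To spell out where the 25% comes from, here is a rough sketch.  It
assumes a simplified model that is only one reading of the argument
above: policy-uri costs one extra round-trip on every cold-cache view,
and an inline policy costs one extra round-trip only on the warm-cache
views where it happens to overflow a TCP window.  The function name is
made up for illustration.

```python
def break_even_overflow_rate(cold_cache_fraction: float) -> float:
    """Fraction of warm-cache views on which an inline policy would have
    to cost an extra round-trip to match the cost of policy-uri.

    Assumed model (a simplification, not a measurement):
      policy-uri cost = cold_cache_fraction * 1 round-trip
      inline cost     = (1 - cold_cache_fraction) * overflow_rate * 1 round-trip
    Setting the two equal and solving for overflow_rate gives the result.
    """
    if not 0.0 <= cold_cache_fraction < 1.0:
        raise ValueError("cold_cache_fraction must be in [0, 1)")
    warm_cache_fraction = 1.0 - cold_cache_fraction
    return cold_cache_fraction / warm_cache_fraction


# With the ~20% empty-cache page views from the Yahoo! figure:
rate = break_even_overflow_rate(0.20)
print(f"{rate:.0%}")
```

Under that model, 0.20 / 0.80 works out to 0.25, which is the 25%
mentioned above.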

So why do you think the tradeoff will typically be worth it even
*with* good caching headers, given how common it is to have a cold
cache?  With a policy-uri over regular HTTP with a cold cache, it will
take four round-trips before the user can start seeing resources
included in the page instead of three.  (One for TCP handshake, one to
start receiving the page, one to receive the policy, one to retrieve
additional content.)
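
The count in the parenthetical can be written down as a small sketch.
The step labels are paraphrased from the list above, and the function
name is hypothetical; this is just bookkeeping, not a network model.

```python
# Round-trips on a cold cache over plain HTTP, per the list above.
BASE_COLD_CACHE_TRIPS = [
    "TCP handshake",
    "start receiving the page",
    "retrieve additional content",
]


def cold_cache_round_trips(uses_policy_uri: bool) -> list[str]:
    """Return the ordered round-trips before included resources show up."""
    trips = list(BASE_COLD_CACHE_TRIPS)
    if uses_policy_uri:
        # The policy has to arrive before additional content is fetched.
        trips.insert(2, "receive the policy")
    return trips


print(len(cold_cache_round_trips(uses_policy_uri=False)))
print(len(cold_cache_round_trips(uses_policy_uri=True)))
```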

> If this is *really * a concern (and I don't think it is), one could specify that the policy file has a default cacheability of the browser session (for example). HTTP allows heuristics to be used to cache responses, and this is just one example of how we could do this for CSP.

Still doesn't help the cold cache case, which is very common.

If long policies are such an issue, you could do something along the
lines of a 304 response for CSP.  So instead of policy-uri, allow a
rule called policy-id, say.  It would take an arbitrary opaque string
as its argument, which might be a number or timestamp or something in
practice.  When the browser receives such a policy, it can cache it
somewhere, remembering the site name and id.  When it next sends an
HTTP request, it can include a header like
Remembered-Content-Security-Policy: (insert id here).  When the server
sees that, it can include a special placeholder like
Content-Security-Policy: reuse-cached to instruct the browser to just
reuse the old policy, or it can include a full policy if the id is
out-of-date.
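
A minimal sketch of how that exchange might look, with both sides
simulated in plain Python.  The header name, the policy-id rule, and
the reuse-cached placeholder are the ones proposed above; none of them
exists in any CSP draft.  The exact syntax for embedding the id in the
policy ("policy-id <id>; <rest of policy>"), the example policy string,
and all function and class names are assumptions made up for
illustration.

```python
REMEMBERED_HEADER = "Remembered-Content-Security-Policy"  # proposed above
REUSE_TOKEN = "reuse-cached"                              # proposed above
ID_PREFIX = "policy-id "                                  # assumed syntax


def server_csp_header(request_headers: dict, current_id: str,
                      current_policy: str) -> str:
    """Pick the Content-Security-Policy value for one response."""
    if request_headers.get(REMEMBERED_HEADER) == current_id:
        # The browser already holds the current policy: send the placeholder.
        return REUSE_TOKEN
    # Cold cache or out-of-date id: send the full policy, tagged with its id.
    return f"{ID_PREFIX}{current_id}; {current_policy}"


class BrowserPolicyCache:
    """Remembers (id, policy) per site, as described above."""

    def __init__(self) -> None:
        self._cache: dict = {}  # site name -> (policy id, policy text)

    def request_headers(self, site: str) -> dict:
        entry = self._cache.get(site)
        return {REMEMBERED_HEADER: entry[0]} if entry else {}

    def effective_policy(self, site: str, header_value: str) -> str:
        if header_value == REUSE_TOKEN:
            # Raises KeyError if nothing is cached; a real browser would
            # need some fallback here.
            return self._cache[site][1]
        head, sep, rest = header_value.partition("; ")
        if sep and head.startswith(ID_PREFIX):
            self._cache[site] = (head[len(ID_PREFIX):], rest)
            return rest
        return header_value  # ordinary policy without an id


browser = BrowserPolicyCache()
policy = "default-src 'self'"  # placeholder policy text
for _ in range(2):
    sent = browser.request_headers("example.org")
    value = server_csp_header(sent, "42", policy)
    print(value, "->", browser.effective_policy("example.org", value))
```

The first pass sends the full policy in the same response as the page,
so there is no extra request; the second pass only carries the
placeholder.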

Advantages of this scheme:

* Better performance than either policy-uri or inline policies: you
send the full policy on a cold cache without an extra request (like
inline policy), and on a cache hit you reuse the cached policy (like
policy-uri).
* Does not ever add round-trips, no matter how badly you misuse it,
unlike policy-uri.

Of course, it's harder to set up than policy-uri, if you're not
serving the header from a script (in which case it's trivial).
Actually, I don't see any way to send response headers conditioned on
request headers in Apache or lighttpd.  So maybe it's not a viable
alternative . . . but I really can't see how adding a round-trip on
every cold-cache hit is ever going to be better than just including
the policy inline, unless the policy is multiple kilobytes long.  Are
we really expecting policies that long?
Received on Sunday, 17 April 2011 22:02:29 UTC