SRI for preemptive cache validation from Jeff Kaufman on 2015-05-15 (public-webappsec@w3.org from May 2015)

From: Jeff Kaufman <jefftk@google.com>
Date: Fri, 15 May 2015 09:59:38 -0400
To: public-webappsec@w3.org
Cc: Joel Weinberger <jww@google.com>
Message-ID: <CAMJ6YUuFFu79fuCsPEjO5xfhbM6TmvfJ5W3iEDQK6ScT+9eZbA@mail.gmail.com>

A common pattern on the web is that people include some external
resource that gets updated by a third party:

    //connect.facebook.net/en_US/sdk.js
    //www.google-analytics.com/analytics.js
    //fonts.googleapis.com/css?family=Open+Sans

These are used for social buttons, analytics, ads, fonts, and content
optimization, among others.  Longcaching (a far-future cache lifetime
on a URL that changes whenever its content changes) isn't possible
here because the people who update the external resource aren't the
same people who serve the webpages.

Additionally, while most websites could switch most of their own
resources to longcaching, sites typically still use simple URLs and
short cache lifetimes because longcaching (a) would require explicit
effort from someone who has lots of other things to worry about and
(b) isn't 100% safe for web servers to apply automatically.

This means that very often a resource is sitting stale in the
browser cache while the server knows it's still valid.  In this
case the browser has to revalidate with the server and wait for a
304 Not Modified before it can use the resource.

If the server could mark up resources with their expected hashes,
however, then the browser would be able to cut out the round trip in
this common case.

Current stale-but-valid:

    C: GET /
    S: ... <script src="example.js"></script> ...
    C: GET /example.js
    S: 304 Not Modified

Proposed stale-but-valid:

    C: GET /
    S: ... <script src="example.js"
                   integrity="nointegrity-sha256-HASH"></script> ...
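
To make the server side of this concrete, here is a minimal sketch of
how the markup above might be generated.  The "nointegrity-sha256-"
prefix comes from the example; encoding the digest as base64 is an
assumption borrowed from ordinary SRI integrity values, and the
function names cache_hint and script_tag are hypothetical.

```python
import base64
import hashlib
import html

# Prefix taken from the example markup in this message; not a standard.
HINT_PREFIX = "nointegrity-sha256-"


def cache_hint(body: bytes) -> str:
    """Return a cache-validation token for the current resource contents."""
    digest = hashlib.sha256(body).digest()
    # Assumption: base64-encode the digest, as regular SRI values do.
    return HINT_PREFIX + base64.b64encode(digest).decode("ascii")


def script_tag(src: str, body: bytes) -> str:
    """Render a script tag annotated with the hash of the resource body."""
    return '<script src="{}" integrity="{}"></script>'.format(
        html.escape(src, quote=True), cache_hint(body)
    )


if __name__ == "__main__":
    print(script_tag("example.js", b"console.log('hi');\n"))
```

A server that already has the resource bytes at hand could call
script_tag when rendering the page; how it would obtain the current
bytes of a cross-origin resource is left open here.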

Where this differs from the design goals of SRI, however, is that this
is entirely about performance, and offers no integrity assurance.
Specifically, if there is a hash mismatch the load should fall back to
an ordinary fetch rather than fail:

Current stale-and-not-valid:

    C: GET /
    S: ... <script src="example.js"></script> ...
    C: GET /example.js
    S: 200 OK ... [contents]

Proposed stale-and-not-valid:

    C: GET /
    S: ... <script src="example.js"
                   integrity="nointegrity-sha256-HASH"></script> ...
    C: GET /example.js
    S: 200 OK ... [contents]
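
The client-side decision in the two proposed flows can be sketched as
follows.  This is only an illustration under stated assumptions: the
cache is modeled as a plain dict of stale entries, fetch is any
callable that performs the ordinary network load, the digest is
assumed to be base64-encoded as in regular SRI, and the names
hint_for, matches_hint and load are hypothetical.

```python
import base64
import hashlib
from typing import Callable, Dict

# Prefix taken from the example markup in this message; not a standard.
HINT_PREFIX = "nointegrity-sha256-"


def hint_for(body: bytes) -> str:
    """Compute the token the server would have emitted for this body."""
    digest = hashlib.sha256(body).digest()
    return HINT_PREFIX + base64.b64encode(digest).decode("ascii")


def matches_hint(body: bytes, integrity: str) -> bool:
    """True if the attribute is a cache hint and matches the cached body."""
    if not integrity.startswith(HINT_PREFIX):
        return False  # ordinary SRI values are not handled by this sketch
    return hint_for(body) == integrity


def load(url: str, integrity: str, stale_cache: Dict[str, bytes],
         fetch: Callable[[str], bytes]) -> bytes:
    """Use a stale cache entry if its hash matches, else fetch normally."""
    cached = stale_cache.get(url)
    if cached is not None and matches_hint(cached, integrity):
        return cached  # stale-but-valid: no round trip needed
    # Stale-and-not-valid (or not cached): fall back to an ordinary fetch.
    # The response is deliberately not checked against the hash, since
    # the hint offers no integrity assurance.
    body = fetch(url)
    stale_cache[url] = body
    return body
```

Note that a mismatch never blocks the load; it only costs the round
trip that would have happened anyway.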

The goal is to get something that web servers can apply automatically
and that cuts this round trip out of the common case of
stale-but-valid cached resources.

Two questions:
* Does this make sense as part of SRI?
* Can we make it safe to apply cross-origin, ideally without checking for CORS?

Received on Monday, 18 May 2015 13:42:05 UTC