HSTS, mixed content, and priming from Richard Barnes on 2015-08-24 (public-webappsec@w3.org from August 2015)

From: Richard Barnes <rbarnes@mozilla.com>
Date: Mon, 24 Aug 2015 11:02:31 -0400
To: WebAppSec WG <public-webappsec@w3.org>
Message-ID: <CAOAcki89bttw7isA4=mdpprb4Gm0ZSUxiLZekLdNUR-cNYVoTg@mail.gmail.com>
Hey all,

This is a circle we've been around a few times, but I wanted to see if we
might be able to break out of it this time, using a new-ish idea.  The goal
of this message is to lay out the idea and an initial discussion of the
trade-offs, to see if this is an idea worth pursuing.

tl;dr: If we add priming requests for HSTS, we can allow HSTS-upgraded
requests from HTTPS pages, and avoid the need for scheme changes.


# Background

In order to maximize the amount of the web that is protected with HTTPS, we
want to fetch resources using HTTPS even if the URL uses the "http:" scheme
-- but only when the "http:" and "https:" versions of the URL are
equivalent.

Currently, we have two ways of determining that HTTP and HTTPS resources
are equivalent:

  1. HSTS - equivalence asserted by the resource owner
  2. upgrade-insecure-requests - equivalence asserted by the linking site

Now we have a funny asymmetry, though: Requests that are upgraded through
u-i-r are not mixed content, but those upgraded through HSTS are.  This
seems silly, given that they both result in content being loaded over
HTTPS.  In fact, HSTS is a better signal than u-i-r of when the upgrade
should be done.  How does the linking site know that the HTTP and HTTPS
URLs reference the same resource?

It seems like if we could get to the point where we could give
HSTS-upgraded resource loads the same mixed-content treatment as other
scheme-upgraded loads, it could reduce the friction of moving to HTTPS.


# HSTS Priming

The proposal here has two major parts:

1. Discover HSTS support with "priming requests":
  * When the browser encounters http://example.com/foo/bar.js on an HTTPS
page...
  * And example.com is not a known HSTS host...
  * Send a HEAD request to https://example.com/ with no cookies, etc.
  * See if the response carries HSTS headers
  * If so, the browser loads https://example.com/foo/bar.js
  * ... and doesn't consider it mixed content
2. Do not treat HSTS-upgraded requests as mixed content
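
The steps above can be sketched roughly as follows.  This is only an
illustration, not a browser implementation: `HstsStore`,
`resolve_subresource`, and the injected `prime` callback are hypothetical
names, and the sketch ignores HSTS details such as max-age expiry,
includeSubDomains, and explicit ports.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional
from urllib.parse import urlsplit


@dataclass
class HstsStore:
    """Hypothetical in-memory record of the HSTS state of hosts seen so far."""
    known: Dict[str, bool] = field(default_factory=dict)


def resolve_subresource(
    url: str,
    store: HstsStore,
    prime: Callable[[str], Optional[str]],
) -> str:
    """Return the URL to fetch for a subresource referenced from an HTTPS page.

    `prime(host)` stands in for the HEAD request to https://<host>/ and
    returns the Strict-Transport-Security header value, or None if the
    header is absent or the request failed.
    """
    parts = urlsplit(url)
    if parts.scheme != "http":
        return url  # not an "http:" URL: nothing to upgrade
    host = parts.hostname or ""
    if host not in store.known:
        # HSTS state unknown: prime once and remember the answer.
        store.known[host] = prime(host) is not None
    if store.known[host]:
        # HSTS host: upgrade, and do not treat the load as mixed content.
        return parts._replace(scheme="https").geturl()
    return url  # no HSTS: normal mixed-content handling applies
```

Passing `prime` in as a callback keeps the network call out of the decision
logic, so the same function can be exercised with a stub.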

This is basically a CORS preflight, but looking for HSTS instead of ACAO.
In either case, you're taking a request that would not be allowed by the
default policy and checking to see whether it can be done in a way that is
allowed.  (Yes, you would have to do both before an upgraded CORS request.)

In past discussions of allowing HSTS-upgraded loads, there have been two
main objections:

  * Indeterminacy: Whether the upgrade happens depends on whether the
browser has encountered the HSTS header in earlier browsing.
  * Silently upgrading requests hides potential breakage

I think that adding priming addresses both of these concerns.  Priming
obviously removes indeterminacy, since if the browser doesn't know the HSTS
state of a site, it goes and checks.  As far as hiding breakage, well,
there's no breakage to hide for browsers that implement priming, and as
always, if you want to support older browsers, you need to test with them.


# Some Issues

## What value does priming add?

As mentioned above, the primary value is to remove the indeterminacy around
HSTS upgrades, so that it's safe to treat HSTS upgrades as not mixed
content.

Allowing HSTS loads also addresses some of the inherent deficiencies of
upgrade-insecure-requests:

* As noted above, it's difficult for the linking site to determine
whether the HTTP and HTTPS URLs actually reference the same content (except
by looking at the linked site's HSTS headers).  So there's a risk of
breakage if the linking site turns on u-i-r and the resource owner does not
maintain the equivalence.

* Priming provides a softer upgrade path than u-i-r.  Mixed content that
would not be blocked can still load, and will smoothly upgrade to HTTPS as
the resource server is able.

So relative to u-i-r, this reduces uncertainty for site operators, and gets
more HTTPS faster (since it's a partial upgrade).  It seems like these two
are complementary in much the same way that HTTPS and HSTS are -- you can
turn on HTTPS for some parts of your site, then turn on HSTS to lock it
in.  Likewise, you can rely on priming to upgrade what can be upgraded on
your site on day 0, then once you're sure that all your sub-resources can
upgrade properly, turn on u-i-r.


## Is this something developers will understand?

Developers have dealt with these sorts of dynamic changes to resource
loads before.  CORS obviously comes to mind, as does IPv4 / IPv6 selection.


## Is HSTS priming an expensive hack to paper over a temporary problem?

In terms of "expense": It's worth noting that HSTS priming would only be
done for potentially mixed-content requests, in cases where the HSTS state
of the remote host is unknown.  Current Firefox telemetry indicates that
around 2/25% of page loads have mixed content, which places an upper bound
on the number of additional queries.  If you load 10 pages, each of which
has 100 links to the same insecure host, you still only get one priming
query.
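
To make the arithmetic concrete, here is a small sketch that counts priming
queries, assuming the browser remembers the result for each host (whether
positive or negative) after the first probe.  The function name and the
host name are made up for illustration.

```python
from typing import Iterable, List


def count_priming_queries(pages: Iterable[List[str]]) -> int:
    """Count priming queries when each unknown host is probed only once."""
    probed = set()
    queries = 0
    for insecure_link_hosts in pages:
        for host in insecure_link_hosts:
            if host not in probed:
                probed.add(host)  # remember the host after the first probe
                queries += 1
    return queries


# 10 pages, each with 100 "http:" links to the same host.
pages = [["insecure.example"] * 100 for _ in range(10)]
print(count_priming_queries(pages))  # prints 1
```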

In terms of "hack": Any solution for upgrading things opportunistically is
going to look messy, since you need a way to probe for when you can
opportunistically upgrade.  This version seems minimally messy, since (1)
it only probes when needed, and (2) the probe relies on existing technology
(HSTS) for indicating when the upgrade is possible.

In terms of "temporary": The cost scales with the need.  As more of the web
is labeled with "https:" URIs, and as more hosts have preloaded HSTS, the
need for priming queries goes away, and they will not be sent.  So the "http:
and unknown HSTS" problem might be temporary, but we can see in telemetry
when it starts to disappear, and remove the feature when it's not needed.
I expect the "http: links" problem to stay around longer, possibly
indefinitely, but that doesn't require priming, just allowing HSTS upgrades.


## Is there privacy risk from the priming request?

The priming request must be HTTPS -- it's looking for HSTS, and HSTS can
only be sent over HTTPS.  So to the network, it only leaks the hostname
that the browser is considering connecting to.  To the website, we can
strip context (cookies, referer, etc.), so all the website learns is that a
given browser/IP is attempting an upgrade.
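
As a rough illustration of what a context-free priming request could look
like, the sketch below builds (but does not send) a HEAD request with
Python's standard library.  `build_priming_request` is a hypothetical
helper; a real browser would strip context in its own network stack.

```python
from urllib.request import Request


def build_priming_request(host: str) -> Request:
    """Build a HEAD request to https://<host>/ with no page context attached."""
    # No Cookie, Referer, or Authorization headers are added, so the server
    # learns only that some client at this IP address is probing for HSTS.
    return Request(f"https://{host}/", method="HEAD")


req = build_priming_request("example.com")
print(req.get_method(), req.full_url)  # HEAD https://example.com/
```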
Received on Monday, 24 August 2015 15:03:06 UTC