- From: Martin Thomson <martin.thomson@gmail.com>
- Date: Wed, 5 Mar 2014 10:23:52 +0000
- To: HTTP Working Group <ietf-http-wg@w3.org>
See https://github.com/http2/http2-spec/issues/373

See also http://lists.w3.org/Archives/Public/ietf-http-wg/2014JanMar/0233.html

So, we've been discussing ways forward with this, and I think that there is a general desire to proceed with HPACK, something we'll probably confirm today. But that means that we'll have to (in Adam's words) "require downstream technologies to maintain more invariants in order to avoid leaking sensitive information."

I think that we can limit exposure to this problem to a subset of HTTP implementations, rather than the full community of HTTP users. This would allow us to keep compression, which I think is an important part of the HTTP/2 feature set. In particular, this would allow us to keep compression without significantly weakening it, as some people have suggested.

Firstly, we will need to be super careful about properly identifying the subset of implementations that might be affected by this. I do not want a situation where some implementers think that they can avoid implementing mitigations for this attack, when in fact they should be implementing them.

An implementation is potentially affected by this attack if it allows multiple actors to influence the creation of HTTP header fields on the same connection. The attack also requires that header fields provided by any one actor be kept secret from any other actor. In the canonical example of a browser, the invariant we want to maintain is that any origin (the primary class of actor in that context) is unable to access header fields that are created by other origins, or by the browser itself. I'll note that this is also potentially an issue for non-browsers that use proxies.

In terms of mitigation, I have so far heard two options; there may be others. I don't think that we should pick one. Instead, we should describe the principles underpinning both and what it would take to properly implement them, then allow implementations to decide what works best for them.

Option 1 - Origin isolation

This solution relies on identifying the actor that is producing each header field. Each actor can only access entries in the table that they themselves have caused to be added. Fields added by other actors cannot be seen or referenced. Browsers can use the web origin model [RFC6454] to identify actors. Many of the mechanisms needed are already used by browsers in other areas (image rendering, cross-origin requests, script execution, etc.).

The HTTP stack might mark certain header fields as "safe" and make these available to all origins. For instance, in a browser, the entries for Accept or Accept-Encoding are usually fixed and not really secret, so exporting those for reuse by other actors is a good idea.

I believe that this solves the problem perfectly. However...

The first drawback is that this makes worse use of the memory we allocate for header compression. When I mentioned this to Nick and Patrick, both basically recoiled in horror at the notion of having to implement it. This option widens the interface to the HTTP stack, and all the tracking adds complexity to what is already the trickiest piece of HTTP/2 to get right. I can only imagine that this could be even harder for other implementations to get right, especially those that don't integrate their HTTP stack as closely as Firefox does.
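To make Option 1 a little more concrete, here is a rough sketch of the kind of lookup an encoder could do. This is only an illustration of my reading of the option: the names (SAFE_NAMES, OriginIsolatedTable, find) and the particular "safe" list are invented for the example, and nothing here describes actual HPACK wire behaviour.

    # Rough sketch of Option 1 (origin isolation) for an HPACK-style encoder.
    # Illustrative only; names and structure are not taken from any spec.

    SAFE_NAMES = {"accept", "accept-encoding", "accept-language"}

    class Entry:
        def __init__(self, name, value, origin):
            self.name = name        # header field name (lower-case)
            self.value = value      # header field value
            self.origin = origin    # actor that caused the entry to be added

    class OriginIsolatedTable:
        """Dynamic table in which an actor can only reference entries it
        added itself, plus entries for 'safe' header field names."""

        def __init__(self):
            self.entries = []       # newest first

        def add(self, name, value, origin):
            self.entries.insert(0, Entry(name.lower(), value, origin))

        def find(self, name, value, origin):
            """Return the index of an entry this origin may reference,
            or None if no visible entry matches."""
            name = name.lower()
            for index, entry in enumerate(self.entries):
                if entry.name != name or entry.value != value:
                    continue
                # Cross-origin entries are invisible unless the name is safe.
                if entry.origin == origin or entry.name in SAFE_NAMES:
                    return index
            return None

An encoder would fall back to a literal (uncompressed) representation whenever find() returns nothing, which is also where the memory cost shows up: the same header field added on behalf of two different actors ends up stored in the table twice.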
Option 2 - Penalize Guessing

(This is really an idea from Adam Barth, relayed to me by Will Chan, so I will probably miss some nuance.)

This is riskier than the other option, but it also goes some way toward addressing the concerns with the "perfect" option. It's easier to implement, and likely better at compression.

The idea here is to attach a "penalty" to attempts to probe the header table. The penalty in this case is a loss of compression.

Variant 1 - Self-Destruct Counter

Each time a header field is encoded, a search is made for a matching header field in the table. If the name matches, but the value does not, a penalty counter attached to that header field name is increased. A full match decreases the counter, potentially by a larger delta. If the counter for any given header field name reaches a predefined threshold, all entries in the header table for that name are removed (and the counter is reset). Hence, compression for that header field name ceases.

Variant 2 - Stochastically Disable Referencing

Basically the above, but higher values on the counter don't trigger a self-destruct; instead, they introduce an increasing probability that entries in the header table are ignored even if there is a match.

Other Variants - ?

There are probably other ways to approach this that have approximately the same properties.
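To show how the penalty might work in code, here is a rough sketch of the Variant 1 counter, with a comment marking where Variant 2's probabilistic check would go instead. The deltas, the threshold, and the names (PenalizingTable, find, etc.) are placeholders I made up for the example, not recommendations.

    # Rough sketch of Option 2, Variant 1 (self-destruct counter) for an
    # HPACK-style encoder.  All names, deltas and thresholds are placeholders.

    PENALTY_ON_VALUE_MISMATCH = 1   # name matched, value did not: looks like probing
    REWARD_ON_FULL_MATCH = 2        # legitimate reuse decreases the counter faster
    SELF_DESTRUCT_THRESHOLD = 8     # counter value at which entries are evicted

    class PenalizingTable:
        def __init__(self):
            self.entries = []       # (name, value) pairs, newest first
            self.penalty = {}       # per-name penalty counters

        def add(self, name, value):
            self.entries.insert(0, (name, value))

        def find(self, name, value):
            """Look up (name, value), adjusting the per-name penalty counter;
            once the threshold is reached, evict every entry for that name."""
            counter = self.penalty.get(name, 0)
            match = None
            name_seen = False
            for index, (n, v) in enumerate(self.entries):
                if n != name:
                    continue
                name_seen = True
                if v == value:
                    match = index
                    break
            if match is not None:
                counter = max(0, counter - REWARD_ON_FULL_MATCH)
            elif name_seen:
                counter += PENALTY_ON_VALUE_MISMATCH
            self.penalty[name] = counter

            if counter >= SELF_DESTRUCT_THRESHOLD:
                # Variant 1: stop compressing this name entirely.
                self.entries = [e for e in self.entries if e[0] != name]
                self.penalty[name] = 0
                return None

            # Variant 2 would instead make returning a match probabilistic,
            # with the chance of ignoring it growing as the counter rises.
            return match

The intended effect is that an attacker who keeps guessing values for, say, a cookie drives the entries for that name out of the table (or makes them unusable) well before enough guesses accumulate to learn anything, at the cost of compression for that one header field.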
Side note: Limiting SETTINGS_MAX_CONCURRENT_STREAMS

My sense is that this is going to be basically ineffectual. Since we're talking about a network attacker, I'm fairly certain that this attacker can cause a browser to instantiate new connections simply by breaking an existing one.

Received on Wednesday, 5 March 2014 10:24:25 UTC