Re: h2#373 HPACK attack mitigation options

On Wed, Mar 5, 2014 at 2:23 AM, Martin Thomson <martin.thomson@gmail.com> wrote:

> See https://github.com/http2/http2-spec/issues/373
>
> See also
> http://lists.w3.org/Archives/Public/ietf-http-wg/2014JanMar/0233.html
>
> So, we've been discussing ways forward with this, and I think that
> there is a general desire to proceed with HPACK, something we'll
> probably confirm today.  But that means that (in Adam's words) we'll
> have to "require[s] downstream technologies to maintain more
> invariants in order to avoid leaking sensitive information."
>
> I think that we can limit the exposure to this problem to a subset of
> HTTP implementations, rather than the full community of HTTP users.
> This would allow us to keep compression, which I think is an important
> part of the HTTP/2 feature set.  In particular, this would allow us to
> keep compression without significantly weakening it, as some people
> have suggested.
>
> Firstly, we will need to be super careful about properly identifying
> the subset of implementations that might be affected by this.  I do
> not want a situation where some implementers think that they can avoid
> implementing mitigations for this attack, when in fact they cannot.
>
> An implementation is potentially affected by this attack if it allows
> multiple actors to influence the creation of HTTP header fields on the
> same connection.  It also requires that header fields provided by any
> one actor be kept secret from any other actor.  In the canonical
> example of a browser, the invariant we want to maintain is that any
> origin (the primary class of actor in that context) is unable to
> access header fields that are created by other origins, or the browser
> itself.
>
> I'll note that this is also potentially an issue for non-browsers that
> use proxies.
>
> In terms of mitigation, I have so far heard two options.  There may be
> others.  I don't think that we should pick one, but instead describe
> the principles underpinning both and what it would take to properly
> implement them.  Then allow implementations to decide what works best
> for them.
>
> Option 1 - Origin isolation
>
> This solution relies on identifying the actor that is producing each
> header field.  Each actor can only access entries in the table that
> they themselves have caused to be added.  Fields added by other actors
> cannot be seen or referenced.
>
> Browsers can use the web origin model [RFC6454] to identify actors.
> Many of the mechanisms needed are already used by browsers in other
> areas (image rendering, cross-origin requests, script execution,
> etc.).
>
> The HTTP stack might mark certain header fields as "safe" and make
> these available to all origins.  For instance, the entries for Accept
> or Accept-Encoding are usually fixed in browsers and not really
> secret, so exporting those for reuse by other actors is a good idea.
>
> I believe that this solves the problem perfectly.  However...
>
> The first drawback is that this has worse utilization of the memory we
> allocate for header compression.
>
> When I mentioned this to Nick and Patrick, both basically recoiled in
> horror at the notion of having to implement this.  This option widens
> the interface to the HTTP stack, and all the tracking adds complexity
> to what is already the trickiest piece of HTTP/2 to get right.  I can
> only imagine that this could be even harder for other implementations
> to get right, especially those that don't integrate their HTTP stack
> as closely as Firefox does.
>

This adds some complexity, but doesn't increase memory requirements, at
least not in non-attack cases.
Given the mechanism HPACK uses for expiring state, the partitioning
costs compressor efficiency mostly in the attack case, which seems
reasonable.  (A sketch of the idea follows.)
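
For concreteness, here's a rough Python sketch of the partitioning idea.
The class name, the "safe" set, and the decision to omit HPACK's size
accounting and eviction are all mine, so treat this as illustrative
rather than anything from the draft:

    # Each dynamic-table entry is tagged with the origin that caused it
    # to be added; lookups only see entries owned by the requesting
    # origin, plus entries whose names the stack has marked "safe".
    SAFE_NAMES = {"accept", "accept-encoding"}  # stack-wide, not secret

    class OriginPartitionedTable:
        def __init__(self):
            self.entries = []  # (owner_origin, name, value), newest first

        def add(self, origin, name, value):
            self.entries.insert(0, (origin, name, value))

        def lookup(self, origin, name, value):
            """Return an index only for entries this origin may see."""
            for i, (owner, n, v) in enumerate(self.entries):
                visible = owner == origin or n in SAFE_NAMES
                if visible and (n, v) == (name, value):
                    return i
            return None

    table = OriginPartitionedTable()
    table.add("https://a.example", "cookie", "secret=1")
    assert table.lookup("https://b.example", "cookie", "secret=1") is None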


>
> Option 2 - Penalize Guessing
>
> (This is really an idea from Adam Barth, relayed to me by Will Chan,
> so I will probably miss some nuance.)
>
> This is riskier than the other option, but also goes some way to
> address the concerns with the "perfect" option.  It's easier to
> implement, and likely better at compression.
>
> The idea here is to attach a "penalty" to attempts to probe the header
> table.  The penalty in this case is a loss of compression.
>
> Variant 1 - Self Destruct Counter
>
> Basically, each time a header field is encoded, a search is made for a
> matching entry in the table.  If the name matches but the value does
> not, a penalty counter attached to that header field name is
> increased.  A match decreases the counter, potentially by a larger
> delta.
>
> If the counter for any given header field name reaches a predefined
> threshold, all entries in the header table for that name are removed
> (and the counter is reset).  Hence, compression for that header field
> name ceases.
>
> Variant 2 - Stochastically Disable Referencing
>
> Basically the above, but higher values on the counter don't trigger a
> self-destruct; instead, they introduce an increasing probability that
> entries in the header table are ignored even if there is a match.
>
> Other Variants - ?
>
> There are probably other ways to approach this that have approximately
> the same properties.


If no state is carried over from connection to connection, then causing the
connection to be torn down refreshes the counter's state, and the scheme
becomes equivalent to limiting MAX_CONCURRENT_STREAMS.
A timer on this state that outlives the connection lifetime would work to
concretely define the number of bits/second which can be probed.
I'm confident that variant 1 with this would work; a sketch follows.
#2 is subtle and depends on more variables (how does the probability
compound, decrease, etc.?  Bleh.).
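
To make that concrete, here's a rough Python sketch of variant 1 with the
connection-outliving timer.  The threshold, credit, and cooldown values
are made up; variant 2 would replace the threshold trip with a
probabilistic skip of matches:

    import time

    THRESHOLD = 8            # name-match/value-mismatch probes tolerated
    MATCH_CREDIT = 2         # a full match pays the counter down faster
    COOLDOWN_SECONDS = 30.0  # outlives the connection on purpose

    class PenaltyTracker:
        def __init__(self):
            self.counters = {}       # header name -> penalty count
            self.blocked_until = {}  # header name -> monotonic deadline

        def observe(self, name, value_matched):
            count = self.counters.get(name, 0)
            if value_matched:
                self.counters[name] = max(0, count - MATCH_CREDIT)
            elif count + 1 >= THRESHOLD:
                # Self-destruct: the caller evicts every entry for this
                # name, and we refuse to index it until the cooldown ends.
                self.counters[name] = 0
                self.blocked_until[name] = (time.monotonic()
                                            + COOLDOWN_SECONDS)
            else:
                self.counters[name] = count + 1

        def may_index(self, name):
            return time.monotonic() >= self.blocked_until.get(name, 0.0)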


>
>
> Side note: Limiting SETTINGS_MAX_CONCURRENT_STREAMS
>
> My sense is that this is going to be basically ineffectual.  Since
> we're talking about a network attacker, I'm fairly certain that this
> attacker can cause a browser to instantiate new connections, simply by
> breaking an existing one.
>
>
Playing with this limits the number of attempts that can occur without the
server seeing evidence of the attack.
If there is a timer on reconnects, this ends up limiting the number of
bits/second which can be probed, similarly to option 2 above.
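
As a toy calculation of that bound (every constant here is hypothetical;
none of them come from the spec or this thread):

    # Each probe leaks at most one bit (match / no match), so a penalty
    # threshold plus a reconnect timer caps the extraction rate.
    threshold = 8           # probes tolerated before the penalty trips
    reconnect_timer = 30.0  # seconds before a fresh connection is allowed
    secret_bits = 128       # entropy of the targeted header value

    bits_per_second = threshold / reconnect_timer       # ~0.27 bits/s
    seconds_to_recover = secret_bits / bits_per_second  # 480 s, at best
    print(bits_per_second, seconds_to_recover)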

There is a further option, which is "perfect" against this style of attack,
and is arguably simple. One can think of it as a variant of #1.
Option #4
Requests originating from a domain which is not the origin to which the
request is ultimately directed are disallowed from using *any* dynamic
compressor state.
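
A rough sketch, reusing the hypothetical partitioned table from the
option 1 sketch above (the representation names are informal, not HPACK
wire terminology):

    def choose_representation(request_origin, target_origin,
                              name, value, table):
        if request_origin != target_origin:
            # Cross-origin: may neither read nor write dynamic state.
            return ("literal-never-indexed", name, value)
        index = table.lookup(target_origin, name, value)
        if index is not None:
            return ("indexed", index)
        table.add(target_origin, name, value)
        return ("literal-with-indexing", name, value)
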
-=R

Received on Wednesday, 5 March 2014 11:10:03 UTC