FW: HPACK analysis from Rob Trace on 2014-01-27 (ietf-http-wg@w3.org from January to March 2014)

From: Rob Trace <Rob.Trace@microsoft.com>
Date: Mon, 27 Jan 2014 16:20:51 +0000
To: HTTP Working Group <ietf-http-wg@w3.org>, Magnus Nystrom <mnystrom@microsoft.com>
Message-ID: <cf3c954d25214484a6e7b672ec1986f8@BL2PR03MB372.namprd03.prod.outlook.com>
This is the start of a threat analysis on HPACK from the TLS WG chair. Please take a look at this and let me or Brian know if we need to have a threat model meeting on this to provide a response when this is posted to the mailing list.

Thanks!!

-Robt

-----Original Message-----
From: Eric Rescorla [mailto:ekr@rtfm.com] 
Sent: Sunday, January 26, 2014 5:02 PM
To: Roberto Peon; Rob Trace; mt; Patrick McManus; jpinner@twitter.com; Adam Barth; William Chan; Mark Nottingham; Sean Turner
Subject: HPACK analysis

Martin and I took a first look at the security of HPACK and produced the writeup below. We wanted to give you guys a first look before posting to the mailing list, especially since abarth pointed out that maybe there is a problem from Flash...

Any comments? Objections to us posting?

-Ekr

HPACK THREAT BACKGROUND

HPACK [0] is a compression scheme for HTTP headers that is intended to resist being used as an oracle by attackers.

The general idea behind HPACK is that each side maintains a table of known header name-value pairs. In order to transmit a set of headers, the sender encodes each member of the set in one of three ways:

- As an integer index to an existing header pair already in the
  table.
- As an integer reference to an existing header name already in
  the table plus a literal value.
- As a literal name/value pair.

An arbitary number of each header name can exist. E.g., there can be two "cookie" headers, one for each cookie.

Literal values are sent directly or encoded using Huffman coding with a fixed code table. The values are then padded out to the next byte boundary. Each header is individually encoded.

Note that the table may be (and probably is) larger than the set of headers to be encoded at any given time. I.e., the table is a (size-limited) list of every header that has been encoded but a given message may only contain some smaller set of headers.

We should analyze HPACK under two threat models, a generalized threat model and a Web-specific threat model.


GENERALIZED THREAT MODEL
We can start by analyzing the most general form of the threat model.
We assume that the attacker has an oracle O that he can query. The oracle is primed with a known set of headers where at least one of the header values V is unknown (though the attacker may know the unencoded size). The attacker's job is to extract V.

The attacker can access the oracle as follows:

- Ask for the length of the encoding of the given set of headers.
- Add a new header with an arbitrary name and value (so that if
  that header exists already, there are now multiples).
- Replace any header with a new header name and value.

The table size is infinite, so that every header name/value pair ever used is remembered.


WEB THREAT MODEL
The Web Threat model is somewhat more limited (and more complicated).

In particular:

- It is only possible to add certain headers (See the XHR and CORS
  specifications at [2] and [3] and the browser security handbook at [4]).
- The attacker does not have an unlimited number of queries.
- The table has a bounded size so the attacker needs to worry about
  pushing entries out of the table.

To get the full picture, you will need to read the HPACK, HTTP/2.0, XHR, and CORS specifications.


KNOWN ATTACKS
We have already done some preliminary analysis and know about the following attacks:

- It is possible to verify the exact value of a header if the
  attacker can inject that header. The idea here is that you
  add the header and look to see if the size of the encoded set
  increases by the literal value or by the indexed value.
  This is an advantage over simple guessing attacks against
  the server because the attacker can divert the requests so
  that the server never sees a failed guess.

- Because not all the symbols for characters are a given
  length, it is possible to learn something about a given
  value by observing its length. Unfortunately, since each
  field is separately padded, you only get to learn the
  sum of the symbol lengths rounded up to the next byte,
  which doesn't tell you much.

We do not currently know how to guess anything other than a full value, which obviously limits the utility of attacks to low-entropy values. Our best attack currently is against cookies, which can be set by the attacker. We are still looking at how to attack Basic authentication passwords. We haven't yet figured out how to get the attacker to inject them using standardized Web technologies. (The obvious avenues don't seem to work, but we are still checking.) It appears that it may, however, be possible to do so with Flash (thanks to Adam Barth for this suggestion).



[0] http://tools.ietf.org/html/draft-ietf-httpbis-header-compression-05
[1] http://tools.ietf.org/html/draft-ietf-httpbis-http2-09
[2] http://www.w3.org/TR/XMLHttpRequest/
[3] http://www.w3.org/TR/cors/
[4] https://code.google.com/p/browsersec/wiki/Main



ACKNOWLEDGEMENTS
Thanks to Martin Thomson, Patrick McManus, and Adam Barth for discussions about this.
Received on Monday, 27 January 2014 16:21:26 UTC