RE: Choosing a header compression algorithm from RUELLAN Herve on 2013-03-28 (ietf-http-wg@w3.org from January to March 2013)

From: RUELLAN Herve <Herve.Ruellan@crf.canon.fr>
Date: Thu, 28 Mar 2013 16:56:29 +0000
To: Roberto Peon <grmocg@gmail.com>
CC: "agl@google.com" <agl@google.com>, Mark Nottingham <mnot@mnot.net>, "ietf-http-wg@w3.org Group" <ietf-http-wg@w3.org>
Message-ID: <6C71876BDCCD01488E70A2399529D5E5163F68F0@ADELE.crf.canon.fr>
> -----Original Message-----
> From: Roberto Peon [mailto:grmocg@gmail.com]
> Sent: mardi 26 mars 2013 19:11
> To: RUELLAN Herve
> Cc: agl@google.com; Mark Nottingham; ietf-http-wg@w3.org Group
> Subject: Re: Choosing a header compression algorithm
> 
> Awesome! I missed the =none part when looking at the code!
> 
> I think we have a difference of opinion about whether or not the two
> variations on prefix matching are safe.
> I don't think they are. :)

From my point of view, the initial version of prefix matching is a kind of lightweight Deflate, and as such is not safe (even though it is probably safer than Deflate).
The limited prefix matching is a kind of indexing where you allow matching a whole value or some well-defined subparts of it. The limitations imposed by the algorithm prevent an attacker from guessing a subpart character by character. Therefore I think this is safe.

> I can clear the compression context (in a variety of ways, one obvious way
> being to force a connection shutdown, another being to cause the
> compression context to be cleared by sending tons of new data) and then I can
> cause prefix matches to occur in the "limit" strategy case, allowing the
> attacker to verify hypotheses about which characters are in whatever fields.

The only information an attacker can get is whether or not a whole subpart has been guessed. The definition of subparts is fixed, so the attacker cannot take advantage of changing subpart boundaries to guess a value easily.

For example, for the targeted URL:
 	http://www.example.com/path/first/myfile

The URL:
 	http://www.example.com/path/final/otherfile
will only leak the information that the match is "http://www.example.com/path/".

The CRIME attack tactic is, once "http://www.example.com/path/" is known, to guess the next character. However, with limited prefix matching, any value with one more character will still use the same shared prefix, so the encoded size does not depend on whether the guessed character is right.
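
A minimal sketch of the two behaviours, assuming the predefined character set "/?=, " from the earlier message. This is illustrative Python, not the HeaderDiff code; the function names are made up for the example.

```python
import os

# Assumed set of allowed prefix-ending characters (example from this thread).
PREFIX_END_CHARS = "/?=, "

def shared_prefix(reference, value):
    """Longest common prefix: the default, unrestricted strategy."""
    return os.path.commonprefix([reference, value])

def limited_shared_prefix(reference, value, end_chars=PREFIX_END_CHARS):
    """Longest common prefix whose last character belongs to end_chars."""
    prefix = shared_prefix(reference, value)
    # Trim back until the prefix ends on an allowed character (or is empty).
    while prefix and prefix[-1] not in end_chars:
        prefix = prefix[:-1]
    return prefix

target = "http://www.example.com/path/first/myfile"
other = "http://www.example.com/path/final/otherfile"

print(shared_prefix(target, other))          # http://www.example.com/path/fi
print(limited_shared_prefix(target, other))  # http://www.example.com/path/

# A CRIME-style probe: one right guess ("f") and one wrong guess ("x").
base = "http://www.example.com/path/"
for guess in (base + "f", base + "x"):
    prefix = limited_shared_prefix(target, guess)
    # The number of characters left to encode literally is 1 in both cases.
    print(guess[len(base):], len(guess) - len(prefix))
```

In this sketch both probes leave exactly one literal character to encode, so the size of the output does not tell the attacker which guess was right.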

> In the pre-defined-terminal-character case, we can still attack various fields
> successfully, so long as they terminate in the characters you've specified.
> There is no guarantee that important data is delimited by these fields
> (though I agree it often is), and so I don't think we can state that it is
> providing a long-term security benefit without restricting the use of these
> characters in all header fields.

The pre-defined set of characters for the prefix end effectively splits header values into atomic parts. An atomic part can only be attacked as a whole and so is robust to CRIME-like attacks.

If your important data, "token", is included in a value as follows, with "/" being the only character in the pre-defined set of characters:
.../secret=token/...
then the atomic part is "secret=token" and no partial match is allowed against it.
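
The same idea seen as atoms, as a rough Python sketch. The value "/app/secret=token/page" and the helper names are hypothetical, chosen only to illustrate the case above with "/" as the single delimiter.

```python
def atoms(value, end_chars="/"):
    """Split a value into atomic parts, each ending on a delimiter
    (the last part may be unterminated)."""
    parts, current = [], ""
    for ch in value:
        current += ch
        if ch in end_chars:
            parts.append(current)
            current = ""
    if current:
        parts.append(current)
    return parts

def matched_atoms(reference, value, end_chars="/"):
    """Count the leading atoms of value that match reference completely."""
    count = 0
    pairs = zip(atoms(reference, end_chars), atoms(value, end_chars))
    for ref_atom, val_atom in pairs:
        # An atom only counts if it is identical and ends on a delimiter.
        if ref_atom != val_atom or ref_atom[-1] not in end_chars:
            break
        count += 1
    return count

reference = "/app/secret=token/page"  # hypothetical header value

print(atoms(reference))                               # ['/', 'app/', 'secret=token/', 'page']
print(matched_atoms(reference, "/app/secret=t"))      # 2
print(matched_atoms(reference, "/app/secret=tok"))    # 2
print(matched_atoms(reference, "/app/secret=token/")) # 3
```

Partial guesses of the secret all match the same two atoms; only a guess of the whole atom "secret=token/" changes the result.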

> 
> Splitting cookies into the crumbs is safe, since we know that is a feature of
> that specific field, and doing similar atom-based matching on other fields
> where we have a guarantee on the delimiter is likely safe, but prefix
> matching as you have it (with either strategy) can still be attacked.
> 
> -=R

Limited delta is a specific kind of atom-based matching. It is therefore no more vulnerable to attacks than other forms of atom-based matching.

Hervé.

> 
> On Tue, Mar 26, 2013 at 2:11 AM, RUELLAN Herve
> <Herve.Ruellan@crf.canon.fr> wrote:
> 
> 
> 	Roberto,
> 
> 	Prefix matching can be disabled by using the "delta=false" option.
> 
> 	There are two new strategies for prefix matching. The first one is
> limiting which shared prefixes are used. The last character of a shared prefix
> must belong to a predefined set of characters. An example of a predefined set
> of characters is "/?=, " (with a space as the last character). For example, for
> the following URLs:
> 	http://www.example.com/path/first/myfile
> 	http://www.example.com/path/final/otherfile
> 	With the default strategy for prefix matching, the shared prefix
> would be:
> 	http://www.example.com/path/fi
> 	With the constraints upon the end of the shared prefix, the shared
> prefix is:
> 	http://www.example.com/path/
> 	This prevents CRIME-like attacks from guessing a value character by
> character. Therefore we think that using this new strategy, prefix matching is
> not vulnerable to CRIME-like attacks. In fact, using this strategy, we are
> enabling some kind of fine-grained indexing of values, where well defined
> parts of a value can be referred to.
> 
> 	The second strategy is to limit the number of times a given header
> value can be used as a reference for prefix matching. This limit can be
> very low: allowing a value to be used as a reference only 2 times is already
> sufficient to get most of the performance of prefix matching. We think that
> with this strategy, prefix matching is mostly protected from CRIME-like attacks.
> 
> 	The first strategy can be selected using the "delta_type='/?= \coma'"
> option (possibly changing the prefix ending characters).
> 	The second strategy can be selected using the "delta_type=2" option
> (or with another value).
> 
> 	I'm planning to update the HeaderDiff codec to use the first strategy
> by default.
> 
> 
> 	Hervé.
> 
> 	From: Roberto Peon [mailto:grmocg@gmail.com]
> 
> 	Sent: mardi 26 mars 2013 01:23
> 	To: RUELLAN Herve; agl@google.com
> 	Cc: Mark Nottingham; ietf-http-wg@w3.org Group
> 
> 	Subject: Re: Choosing a header compression algorithm
> 
> 	Herve--
> 
> 	We need an option which disables prefix matching on the HeaderDiff
> compressor. The strategies I see in the code still allow many headers to be
> attacked (if they include commas).
> 	I believe that it is still possible to probe interesting data out of various
> fields of the URL, for example, or even cookies, assuming they aren't B64
> encoded.
> 
> 	-=R
> 
> 	On Mon, Mar 25, 2013 at 11:38 AM, Roberto Peon
> <grmocg@gmail.com> wrote:
> 	There are two obvious strategies here: What we do now, and using
> what SPDY does today (share connections if the certs match and DNS
> resolution of the new hostname overlaps with those of the current
> connection).
> 
> 	-=R
> 
> 	On Mon, Mar 25, 2013 at 10:21 AM, RUELLAN Herve
> <Herve.Ruellan@crf.canon.fr> wrote:
> 	> -----Original Message-----
> 	> From: Mark Nottingham [mailto:mnot@mnot.net]
> 	> Sent: lundi 25 mars 2013 06:56
> 	> To: RUELLAN Herve
> 	> Cc: Roberto Peon; ietf-http-wg@w3.org Group
> 	> Subject: Re: Choosing a header compression algorithm
> 	>
> 	>
> 	> On 23/03/2013, at 5:04 AM, RUELLAN Herve
> <Herve.Ruellan@crf.canon.fr>
> 	> wrote:
> 	>
> 	> > I think it would be good to move this from the compressors to the
> 	> > streamifier. In addition, it would be interesting to look at a more
> 	> > realistic streamifier that could for example unshard hosts (expecting
> 	> > that HTTP/2.0 will remove the sharding currently done by server
> 	> > developers).
> 	>
> 	> Right now, it combines all requests to the same TLD (according to the
> 	> Public Suffix List) into a single "connection." Do you have a suggestion
> 	> for how to do it better?
> 	I think this should provide some "realistic" results as a starting point.
> Depending on what we want to measure, we may want to refine this a bit.
> 
> 	Hervé.
> 
> 	> I've just pushed a quick and dirty fix to use a new instance of each
> 	> compressor for each connection; the results are pretty even between
> 	> headerdiff and delta2, with a small increase in each:
> 	>
> 	> * TOTAL: 5948 req messages
> 	>                                        size  time | ratio min   max   std
> 	>                         http1     3,460,925  0.18 | 1.00  1.00  1.00  0.00
> 	>   delta2 (max_byte_size=4096)       707,901 11.87 | 0.20  0.03  0.83  0.15
> 	>      headerdiff (buffer=4096)       960,106  1.65 | 0.28  0.01  0.96  0.23
> 	>
> 	> * TOTAL: 5948 res messages
> 	>                                        size  time | ratio min   max   std
> 	>                         http1     2,186,162  0.28 | 1.00  1.00  1.00  0.00
> 	>   delta2 (max_byte_size=4096)       622,837 12.86 | 0.28  0.02  1.22  0.13
> 	>      headerdiff (buffer=4096)       596,290  3.65 | 0.27  0.02  0.92  0.18
> 	>
> 	> Cheers,
> 	>
> 	>
> 	> --
> 	> Mark Nottingham   http://www.mnot.net/
> 	>
> 	>
> 
> 
> 
>
Received on Thursday, 28 March 2013 16:57:03 UTC