
Re: comments about draft-ietf-httpbis-header-compression

From: Jyrki Alakuijala <jyrki@google.com>
Date: Mon, 5 Jan 2015 02:20:08 +0100
Message-ID: <CAPapA7TG--dhb2qPz37H5BHxfFdu6hH015FPYAFgPtTQgk2Oew@mail.gmail.com>
To: Roberto Peon <grmocg@gmail.com>
Cc: Mark Nottingham <mnot@mnot.net>, Dave Garrett <davemgarrett@gmail.com>, HTTP Working Group <ietf-http-wg@w3.org>
On Sat, Jan 3, 2015 at 6:46 PM, Roberto Peon <grmocg@gmail.com> wrote:

> The intent was to make a compressor that was difficult to get wrong from a
> security perspective, whose implementation was reasonably easy for good
> programmers, and which did good-enough compression.
>

A safe (in the sense of CRIME) implementation of deflate can be written in
less than 1000 lines of code, possibly 500. It is much easier to write than
a full zlib implementation, and most likely also easier than an HPACK
encoder plus decoder.
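For illustration, one way such an encoder could avoid CRIME-style leaks is to reset the LZ77 window between header fields, so a backreference can never span an attacker-influenced field and a secret one. A minimal sketch using stock zlib (the field split and function name are my own, not anything proposed in this thread):

```python
import zlib

def compress_fields(fields):
    """Compress header fields with the LZ77 window reset between them,
    so a match can never span two fields and leak one into the other."""
    co = zlib.compressobj()
    out = b""
    for f in fields:
        out += co.compress(f)
        # Z_FULL_FLUSH resets the compression state: later fields cannot
        # back-reference earlier (possibly secret) bytes.
        out += co.flush(zlib.Z_FULL_FLUSH)
    return out + co.flush()
```

The output is an ordinary zlib stream, so any existing deflate decoder can read it unchanged.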


>
> Your statement about zlib being 'as safe' misses the mark. zlib has more
> capabilities, which include things which are known to be unsafe.
>

Yes. This is why I am promoting a new implementation of the encoder.


> More capabilities most usually means less safe.
>

This is also my concern. The more code you add into the browser, the more
attack surface there is. The deflate decoder is likely relatively safe, and
the same deflate decoder could be used in the future together with a safe
encoder. If you add a new decoder for HPACK into the browser, you will be
increasing the attack surface.


> Adding bits in the manner you suggested doesn't work-- it requires the
> attacker to do more requests to determine if what it did was right and this
> is linear, not exponential.
>

Why not? My intuition is the opposite. Has anyone done the statistics on
this? If the payload size is randomized, the attacker needs statistically
sufficient sampling of all the options. In practice this means guessing
about half of the random payload size before making any progress. If we add
0-128 bits of random payload derived by hashing the other content, the
attacker has to make more than 2**64 attempts to extract one byte. The cost
is definitely exponential in the number of bits. The attacker can only make
progress by guessing the correct bytes, but if the length is modulated by a
complex hash of the data, the length reveals nothing unless the change is
larger than the variation introduced by the hashing.
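The idea can be sketched as follows; the keyed-hash derivation, the 0-16-byte range, and all names here are illustrative assumptions rather than a concrete proposal from the thread:

```python
import hashlib
import hmac
import os
import zlib

SESSION_KEY = os.urandom(32)  # hypothetical per-session secret

def compress_with_length_noise(payload: bytes) -> bytes:
    """Compress, then append padding whose length is a keyed hash of the
    payload: deterministic per input (so repeating the same request cannot
    average it away) but unpredictable to the attacker."""
    body = zlib.compress(payload)
    pad_len = hmac.new(SESSION_KEY, payload, hashlib.sha256).digest()[0] % 17
    return body + b"\x00" * pad_len
```

A real deployment would also need framing to tell the decoder where the compressed body ends and the padding begins; that part is assumed here.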


> Even if that wasn't true, you're adding bits (and a fair number of them),
> which defeats the purpose of compression.
>

Yes, but that would be adding bits to a good compression result, instead of
giving up on compression in the first place. If adding bits works (as it
very likely does), then we could use algorithms like brotli for this.

> Non-static entropy coding also leads to non-exponential searches of the
> input space if the attacker is allowed to influence the entropy coding.
> That is why HPACK doesn't do non-static entropy coding. It uses a canonical
> huffman format so that it would be possible to do "non static", though I
> envisioned that this would only happen before request bits were sent, e.g.
> in the ALPN token.
>

A specialized deflate encoder for this purpose can use a static code that
best fits the website. The code being static can be a property of the
encoder -- it does not have to be static in the format itself.
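As an illustration, stock zlib already exposes this encoder-side choice: the Z_FIXED strategy forces the fixed Huffman tables from RFC 1951 even though the deflate format also permits per-block dynamic tables (the header bytes below are made up):

```python
import zlib

headers = b":method: GET\r\n:path: /index.html\r\n"

# Z_FIXED makes the encoder emit only the fixed (static) Huffman codes
# defined in RFC 1951; the "static" choice lives entirely in the encoder,
# and the output is still an ordinary deflate/zlib stream.
co = zlib.compressobj(strategy=zlib.Z_FIXED)
data = co.compress(headers) + co.flush()
```

Any standard deflate decoder reads this stream without knowing or caring that the encoder restricted itself.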

> HPACK offers a means of not doing entropy coding, so if it gets out of
> date, either the dictionary gets rev'd (e.g. at startup as described
> above), or one chooses to not use it. This is described in section 5.2.
>

Deflate supports these, too: you can use either a static entropy code or
even uncompressed (stored) blocks.
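For example, the uncompressed escape hatch is reachable through stock zlib: compression level 0 emits only stored blocks, which frame the payload without compressing it (the header value below is made up):

```python
import zlib

raw = b"authorization: Bearer 0123456789"

# Level 0 makes deflate emit only "stored" (uncompressed) blocks: the
# payload is framed but not compressed, so its length leaks nothing
# beyond the payload length itself.
stored = zlib.compress(raw, 0)

assert zlib.decompress(stored) == raw
assert len(stored) > len(raw)  # stored blocks add framing overhead, never shrink
```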
Received on Monday, 5 January 2015 01:20:35 UTC
