Re: ezflate: proposal to reinstitute deflate header compression from Roberto Peon on 2014-06-03 (ietf-http-wg@w3.org from April to June 2014)

From: Roberto Peon <grmocg@gmail.com>
Date: Tue, 3 Jun 2014 01:19:40 -0700
To: K.Morgan@iaea.org
Cc: HTTP Working Group <ietf-http-wg@w3.org>, C.Brunhuber@iaea.org
Message-ID: <CAP+FsNf_5Pi-cftWXyw+=fMCoZ-Rnu9k4yFUaswsDcx7DiiTbg@mail.gmail.com>
On Tue, Jun 3, 2014 at 1:08 AM, <K.Morgan@iaea.org> wrote:

> On 02 June 2014 21:58, grmocg@gmail.com wrote:
>
> > There are a number of issues with this (it is something I considered
> > too :) ):
>
> Perhaps you didn't consider it long enough :)
>
> If you reconsider, we'd love to have your help!


> > Any encoder can use gzip, and the decoder will happily decompress it.
> > This means that the receiver does not enforce security, which would be a
> > protocol flaw.
>
> We thought about this and didn't have enough time to implement it, but a
> small mod to the zlib inflate to pass metadata, to the consumer, about the
> back references would be enough to verify that the sender is conforming to
> the tokenization rules.
>

Doesn't this inflate the output? Doesn't it make it non-conformant with
flate?



>
> > One can do letter-frequency based attacks since flate (often) uses
> > dynamic huffman tables. This is unsafe when unconstrained, and difficult
> > to analyze.
>
> I haven't come across this attack before.  Will you please send me a link
> to some information?  I would like to include it in the "Security
> Considerations" section.
>

The idea is simple.
You send enough patterns built from chosen symbols to get flate to emit a
new dynamic huffman table.
You get the secret compressed, and check the length.
This allows much more efficient probing of the compression context.
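
A rough sketch of that probing step, under loudly stated assumptions: it
uses Python's zlib with the Z_HUFFMAN_ONLY strategy to isolate the effect of
the dynamic huffman table, the helper names (huffman_only_len, probe) are
made up for illustration, and the secret is deliberately skewed so the
effect is easy to see. It is not a description of any real exploit.

```python
import zlib


def huffman_only_len(data: bytes) -> int:
    """Size of a raw DEFLATE stream built with Huffman coding only (no back references)."""
    comp = zlib.compressobj(level=9, method=zlib.DEFLATED, wbits=-15,
                            memLevel=8, strategy=zlib.Z_HUFFMAN_ONLY)
    return len(comp.compress(data) + comp.flush())


def probe(secret: bytes, symbol: bytes, filler_len: int = 2000) -> int:
    """Compressed size when attacker-chosen filler of one symbol precedes the secret.

    The filler biases the Huffman table toward `symbol`; if the secret is
    rich in that symbol, its bytes get short codes and the output shrinks.
    """
    return huffman_only_len(symbol * filler_len + secret)


if __name__ == "__main__":
    # Hypothetical secret, dominated by the letter "a" on purpose.
    secret = b"a" * 120 + b"bcdefgh" * 5
    for sym in (b"a", b"z"):
        print(sym, probe(secret, sym))
```

The attacker never sees the secret, only the two lengths; the shorter one
hints at which symbol dominates it.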



>
> > One is now compressing within atoms, unless that feature is also
> > removed, in which case efficiency drops.
>
> I'm not sure what you mean by "atoms".  I assume you mean the individual
> tokens of a header value?
> If I read between the lines, you're implying that this is unsafe?
>
> Consider the following header (with name: value)...
> Cookie: c1=ABCDEFGHIJKL;c2=MNOPQRSTUVWX
>
>
An atom is any individual key or value. Cookies are crumbled, so each crumb
is an atom.


> Tokenized, the value would be split into the following tokens: {
> "c1=ABCDEFGHIJKL", ";", "c2=MNOPQRSTUVWX" }
> An attacker would have to brute-force guess the entire value of each
> cookie individually.  How is that not safe?  Assuming Base64 encoding, the
> search space is 64^12=4,722,366,482,869,645,213,696
> Of course in hpack the attacker would have to guess both cookies at the
> same time, which has a search space with a number too big to include here,
> but doesn't seem to add a lot of benefit.
>
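
For reference, the quoted figure can be checked directly. This is only a
sanity check of the arithmetic, assuming (as the quoted text does) 12
characters per cookie value over a 64-symbol alphabet; the variable names
are illustrative.

```python
ALPHABET_SIZE = 64   # Base64 symbols, per the quoted assumption
VALUE_LEN = 12       # characters in each example cookie value

one_cookie = ALPHABET_SIZE ** VALUE_LEN   # guessing one cookie value
both_cookies = one_cookie ** 2            # guessing both values at once

print(f"{one_cookie:,}")        # 4,722,366,482,869,645,213,696
print(one_cookie == 2 ** 72)    # True: 64^12 = (2^6)^12 = 2^72
print(both_cookies == 2 ** 144) # True
```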

The issue is when Cookie: ABCABCABCABC is compressed against itself, if the
attacker has any ability to influence the contents of the cookie (and one
generally assumes they do).
To be safe from CRIME-style attacks, there can be no self-referential
compression within any particular atom.
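
A minimal sketch of why that matters, assuming plain raw deflate from
Python's zlib stands in for the compressor and that one cookie value carries
both an attacker-influenced part and a secret part. The field names
(pref, sid), the helper names and the values are made up for illustration.

```python
import zlib


def deflate_len(data: bytes) -> int:
    """Size of a raw DEFLATE stream for `data` (back references allowed)."""
    comp = zlib.compressobj(level=9, wbits=-15)
    return len(comp.compress(data) + comp.flush())


def cookie_len(attacker_part: bytes, secret: bytes) -> int:
    """Compressed size of one atom mixing attacker-influenced and secret bytes."""
    return deflate_len(b"Cookie: pref=" + attacker_part + b"; sid=" + secret)


if __name__ == "__main__":
    secret = b"q7Zp2LmX9vRt4KbN"        # hypothetical session id
    wrong_guess = b"A1b2C3d4E5f6G7h8"   # same length, shares no substrings
    # When the guess equals the secret, the secret is emitted as a single
    # back reference into the attacker's bytes, so the output is shorter.
    print("wrong guess:", cookie_len(wrong_guess, secret))
    print("right guess:", cookie_len(secret, secret))
```

A real attack would refine the guess a few bytes at a time rather than all
at once; forbidding back references inside an atom removes this signal.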


>
> > Flate has no 'never compress' flag, though one could maybe be added
> > with some effort.
>
> Easily added with little effort.
>

Would it then be zlib compatible?


>
> > Future uses of the protocol will likely adapt to the compression
> > mechanism used, improving the efficiency of something using delta-coding
> > further beyond what it looks like today.
>
> Same is true for any other compression mechanism used, including ezflate.
>

Only partially/not to the same extent.


>
> > Hpack's decoding is extremely cheap.
> > Hpack was designed to make it difficult for a conforming implementation
> > to leak information, to make encoding and decoding very fast/cheap, to
> > provide for receiver control over compression context size, to allow for
> > proxy re-indexing (i.e. shared state between frontend and backend within
> > a proxy), and for quick comparisons of huffman-encoded strings. Several
> > of these goals are difficult to achieve with something flate-based.
>
> Several of these goals could (should IMO) be achieved with the approach
> that Poul-Henning suggested [1], which is to divide the headers into
> routing headers and end-to-end headers.  The routing headers would use a
> (very simplified) hpack with a static table and static huffman coding for
> values - making it very easy/fast for proxies to achieve the goals you
> mentioned.  The end-to-end headers could use something more sophisticated
> (e.g. ezflate).
>

This would not be ezflate.

-=R
Received on Tuesday, 3 June 2014 08:20:07 UTC