Re: ezflate: proposal to reinstitute deflate header compression from Roberto Peon on 2014-06-02 (ietf-http-wg@w3.org from April to June 2014)

From: Roberto Peon <grmocg@gmail.com>
Date: Mon, 2 Jun 2014 12:58:42 -0700
To: K.Morgan@iaea.org
Cc: HTTP Working Group <ietf-http-wg@w3.org>, C.Brunhuber@iaea.org
Message-ID: <CAP+FsNcbr2iyc8EPOBr9eZ_C4vqnFmNe25TUARZY9OfC7ittFg@mail.gmail.com>
There are a number of issues with this (it is something I considered too :) ):

Any encoder can use ordinary gzip, and the decoder will happily decompress
it. This means that the receiver cannot enforce security, which would be a
protocol flaw.
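
A minimal sketch of that point, using Python's standard zlib bindings. The
header layout, the helper names and the 8-character guess are assumptions
made for illustration (a real attack would extend the guess a character at
a time); the secret value is the one from the quoted proposal. A stock
inflater accepts output from an ordinary character-matching deflater, and
the length of that output depends on whether the attacker's guess matches
the secret:

```python
import zlib

WBITS = 12  # 4096-byte window, as in the quoted ezflate test setup


def raw_deflate(data: bytes) -> bytes:
    """Ordinary character-level DEFLATE with no zlib/gzip wrapper."""
    comp = zlib.compressobj(level=6, method=zlib.DEFLATED, wbits=-WBITS)
    return comp.compress(data) + comp.flush()


def raw_inflate(blob: bytes) -> bytes:
    """What any stock inflate-based receiver would do with the stream."""
    decomp = zlib.decompressobj(wbits=-WBITS)
    return decomp.decompress(blob) + decomp.flush()


def header_block(guess: str) -> bytes:
    """Hypothetical header block mixing attacker input with a secret."""
    return (
        ":method: GET\n"
        f":path: /?guess={guess}\n"
        "secret: ABCDEFGH\n"
    ).encode("ascii")


hit = raw_deflate(header_block("ABCDEFGH"))   # guess equals the secret
miss = raw_deflate(header_block("QRSTUVWX"))  # guess shares nothing with it

# The receiver decodes both streams without complaint...
assert raw_inflate(hit) == header_block("ABCDEFGH")
assert raw_inflate(miss) == header_block("QRSTUVWX")

# ...and the correct guess produces the shorter stream.
print(len(hit), len(miss))
```

Nothing on the decoding side can tell that the sender matched inside a
token, so the safety property rests entirely on the encoder.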

One can do letter-frequency based attacks since flate (often) uses dynamic
huffman tables. This is unsafe when unconstrained, and difficult to analyze.

One is now compressing within atoms, unless that feature is also removed,
in which case efficiency drops.
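
To make the trade-off concrete, here is a toy sketch of whole-atom
matching, run on the request pairs from the quoted proposal. It is not the
algorithm from the draft: the tokenization rule (split each line on ": ",
then split values on ", "), the function names and the "ref"/"lit" output
form are all assumptions chosen for illustration.

```python
def tokenize(headers: str) -> list:
    """Split a header block into atoms (hypothetical rule, see lead-in)."""
    tokens = []
    for line in headers.strip().splitlines():
        name, value = line.split(": ", 1)
        tokens.append(name)
        tokens.extend(value.split(", "))
    return tokens


def encode(headers: str, seen: set) -> list:
    """Emit a back-reference only when an entire atom was seen before."""
    out = []
    for tok in tokenize(headers):
        if tok in seen:
            out.append(("ref", tok))
        else:
            out.append(("lit", tok))
            seen.add(tok)
    return out


def literals(ops: list) -> list:
    """Atoms that had to be sent uncompressed."""
    return [tok for kind, tok in ops if kind == "lit"]


seen = set()
encode(":method: GET\n:path: /\naccept-encoding: gzip\n", seen)
second = encode(
    ":method: GET\n:path: /cool.html\naccept-encoding: gzip, sdch\n", seen
)
print(literals(second))

seen = set()
encode(":method: GET\n:path: /awesome.html\nsecret: ABCDEFGH\n", seen)
probe = encode(":method: GET\n:path: /?guess=ABC\nsecret: ABCDEFGH\n", seen)
print(literals(probe))
```

The partial guess earns no back-reference, which is the security gain. The
cost is that a new atom sharing a long prefix with an old one is sent as a
full literal, which is where the efficiency goes.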

Flate has no 'never compress' flag, though one could maybe be added with
some effort.

Future uses of the protocol will likely adapt to the compression mechanism
used, which should push the efficiency of something using delta-coding
beyond what it looks like today.

Hpack's decoding is extremely cheap.


Hpack was designed to make it difficult for a conforming implementation to
leak information, to make encoding and decoding very fast/cheap, to provide
for receiver control over compression context size, to allow for proxy
re-indexing (i.e. shared state between frontend and backend within a
proxy), and for quick comparisons of Huffman-encoded strings. Several of
these goals are difficult to achieve with something flate-based.
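
As a rough illustration of the "receiver control over compression context
size" goal, here is a sketch of a size-bounded header table. The class and
method names are hypothetical. The accounting (name length plus value
length plus 32 octets per entry, oldest entries evicted first, an oversized
entry emptying the table) follows the rule as later published in RFC 7541;
treat everything else as an assumption rather than as any implementation's
actual code.

```python
from collections import deque

ENTRY_OVERHEAD = 32  # per-entry overhead in HPACK size accounting (RFC 7541)


class DynamicTable:
    """Header table whose total size never exceeds the receiver's limit."""

    def __init__(self, max_size: int = 4096):
        self.max_size = max_size
        self.size = 0
        self.entries = deque()  # newest entry at the left

    @staticmethod
    def entry_size(name: bytes, value: bytes) -> int:
        return len(name) + len(value) + ENTRY_OVERHEAD

    def _evict_until(self, limit: int) -> None:
        # Drop the oldest entries until the table fits within `limit`.
        while self.entries and self.size > limit:
            name, value = self.entries.pop()
            self.size -= self.entry_size(name, value)

    def add(self, name: bytes, value: bytes) -> None:
        needed = self.entry_size(name, value)
        # Make room first; an entry larger than the limit empties the table
        # and is not stored.
        self._evict_until(self.max_size - needed)
        if needed <= self.max_size:
            self.entries.appendleft((name, value))
            self.size += needed

    def resize(self, new_max: int) -> None:
        # The receiver lowers (or raises) the bound; shrink to fit.
        self.max_size = new_max
        self._evict_until(new_max)


table = DynamicTable(max_size=100)
table.add(b"a", b"b")
table.add(b"c", b"d")
table.add(b"e", b"f")  # forces eviction of the oldest entry
print(list(table.entries), table.size)
```

Because the bound is a simple byte count that the receiver sets, memory use
is predictable regardless of what the encoder chooses to index.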

-=R


On Mon, Jun 2, 2014 at 12:26 PM, <K.Morgan@iaea.org> wrote:

>  Sorry, right after I clicked send, I got the links to the “htmlized”
> versions which are probably easier to read…
>
>
>
> A new version of I-D, draft-morgan-ezflate-00.txt has been successfully
> submitted by Keith Shearl Morgan and posted to the IETF repository.
>
>
>
> Name:                  draft-morgan-ezflate
>
> Revision:              00
>
> Title:                     EZFLATE: Token-based DEFLATE Compression
>
> Document date: 2014-06-02
>
> Group:                  Individual Submission
>
> Pages:                  10
>
> URL:
> http://www.ietf.org/internet-drafts/draft-morgan-ezflate-00.txt
>
> Status:         https://datatracker.ietf.org/doc/draft-morgan-ezflate/
>
> Htmlized:       http://tools.ietf.org/html/draft-morgan-ezflate-00
>
>
>
> A new version of I-D, draft-morgan-http2-header-compression-00.txt
>
> has been successfully submitted by Keith Shearl Morgan and posted to the
> IETF repository.
>
>
>
> Name:                  draft-morgan-http2-header-compression
>
> Revision:              00
>
> Title:                     H2EZ: HTTP/2 Header Compression
>
> Document date: 2014-06-02
>
> Group:                  Individual Submission
>
> Pages:                  11
>
> URL:
> http://www.ietf.org/internet-drafts/draft-morgan-http2-header-compression-00.txt
>
> Status:
> https://datatracker.ietf.org/doc/draft-morgan-http2-header-compression/
>
> Htmlized:
> http://tools.ietf.org/html/draft-morgan-http2-header-compression-00
>
>
>
>
>
> On Monday, 02 June 2014 21:23, MORGAN, Keith Shearl wrote:
>
>
>
> No, we haven't been living under a rock.  Yes, we've heard of CRIME.
>
>
>
> CRIME exploited the deflate *algorithm* as described in RFC 1951, *not*
> the deflate *format*. What makes the RFC 1951 *algorithm* vulnerable to
> the CRIME attack is that it does character by character matching (normally
> this is a good thing because it maximizes the number and length of
> matches!).
>
>
>
> The first paragraph of Section 4 of RFC 1951 says that “[implementations]
> need not follow [the general algorithm presented here] in order to be
> compliant.”
>
> We propose an alternate deflate algorithm called ezflate (“easy flate”); a
> token-based deflate algorithm.  The key feature of the ezflate algorithm
> (and primary difference to the RFC 1951 algorithm), is that it does
> token-by-token matching.
>
>
>
> Here is a simple example, to illustrate how it works.  Consider the
> following HTTP/2 requests (abbreviated):
>
>
>
> :method: GET
>
> :path: /
>
> accept-encoding: gzip
>
>
>
> :method: GET
>
> :path: /cool.html
>
> accept-encoding: gzip, sdch
>
>
>
> The tokens “:method”, “GET”, “:path”, “accept-encoding” and “gzip” are
> compressed in the second request. The tokens “/cool.html” and “sdch” are
> not compressed.
>
>
>
> Now consider another pair of requests (abbreviated):
>
>
>
> :method: GET
>
> :path: /awesome.html
>
> secret: ABCDEFGH
>
>
>
> :method: GET
>
> :path: /?guess=ABC
>
> secret: ABCDEFGH
>
>
>
> The tokens “:method”, “GET”, “:path”, “secret” and “ABCDEFGH” are
> compressed in the second request.  But more importantly, the attacker’s
> guess of the secret, “ABC”, does not trigger a match, so the exchange is
> impervious to a CRIME-like attack.  (Of course, just like hpack, ezflate
> is vulnerable to a brute-force attack of all 8-character combinations.)
>
>
>
> The ezflate algorithm is non-http/2 specific so we divided our Internet
> Drafts into two parts (the ezflate algorithm itself and the http/2 header
> compression based on ezflate):
>
> http://www.ietf.org/id/draft-morgan-ezflate-00.txt
>
> http://www.ietf.org/id/draft-morgan-http2-header-compression-00.txt
>
>
>
> We have implemented ezflate as an additional “compression strategy” called
> Z_EZFLATE in the zlib library.
>
> The code is very straightforward. We basically only had to implement one
> new compression function (in addition to the existing deflate_fast &
> deflate_slow, we added ezflate).  We relied on the rest of the proven zlib
> infrastructure for windowing, hash chains, Huffman coding, etc.
>
>
>
> We used the http2/http_samples [1] har files as test vectors for analysing
> the compression factor of our ezflate implementation. Here are the data:
>
>
>
> pt=plain-text; df=deflate; hp=hpack*; ez=ezflate**
>
>
>
> Method  avg-bytes/req (% of pt)  avg-bytes/rsp (% of pt)
> pt      574 (n/a)                407 (n/a)
> df       70 (12%)                 62 (15%)
> hp       91 (16%)                 83 (20%)
> ez       99 (17%)                 77 (19%)
>
>
>
> For the data set, hpack is slightly better on requests and ezflate is
> slightly better on responses, but the difference is negligible and likely
> depends on the use cases. (Note that requests to multiple hosts were
> interleaved as if they were all sent to a single proxy.) So changing to
> ezflate just based on the compression factor is not compelling…
>
>
>
> …BUT, there are other benefits of ezflate beyond the raw compression
> factor...
>
> + Interoperability is easy; *any inflate library (e.g. zlib) will
> decompress ezflate streams*
>
> + Built on the proven zlib library (our implementation)
>
> + Easy to implement a custom compressor i.e. less complexity
>
> + Custom resource-constrained implementations are possible by trading off
> compression for resources
>
> + Furthermore, resource-constrained implementations can use non-compressed
> header blocks or Huffman-only header blocks
>
> + Allows interleaving of HEADERS frames with other frame types
>
> + Streaming
>
> + Useful for any tokenizable data stream (e.g. html) which contains secret
> and attacker-controlled data (think BREACH)
>
>
>
> *hpack was configured with table-size=4096; the hpack results come from
> nghttp2 deflatehd
>
> **ezflate was configured with window-bits=12 (4096-byte window),
> mem-level=6 (default), level=6 (default; level is a non-factor for ezflate)
>
>
>
> Regards,
>
> Keith & Chris
>
>
>
> [1] https://github.com/http2/http_samples
Received on Monday, 2 June 2014 19:59:10 UTC