ezflate: proposal to reinstitute deflate header compression

No, we haven't been living under a rock.  Yes, we've heard of CRIME.



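CRIME exploited the deflate algorithm as described in RFC 1951, not the deflate format. What makes the RFC 1951 algorithm vulnerable to the CRIME attack is that it matches character by character (normally a good thing, because it maximizes the number and length of matches). An attacker-controlled string that shares even a few characters with a secret therefore shortens the compressed output, and the change in length leaks the match.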
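The length leak can be reproduced with any stock deflate implementation. The following is only a rough sketch using Python's zlib module; the header block, the `secret` header and the helper name `compressed_len` are made up for illustration. A real CRIME attack guesses one character at a time; a full-length guess is used here only to make the size difference obvious.

```python
import zlib

SECRET = b"ABCDEFGH"  # illustrative stand-in for a cookie or token


def compressed_len(guess: bytes) -> int:
    """Size of a deflate-compressed header block holding both the
    attacker's guess (in the path) and the secret (in a header)."""
    block = (
        b":method: GET\n"
        b":path: /?guess=" + guess + b"\n"
        b"secret: " + SECRET + b"\n"
    )
    return len(zlib.compress(block))


if __name__ == "__main__":
    # A correct guess lets the secret be coded as a back-reference,
    # so the output is shorter than for a wrong guess of equal length.
    print("wrong guess:", compressed_len(b"STUVWXYZ"))
    print("right guess:", compressed_len(SECRET))
```

The attacker never sees the plaintext; observing which guess produces the shorter output is enough.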



The first paragraph of Section 4 of RFC 1951 says that "[implementations] need not follow [the general algorithm presented here] in order to be compliant."

We propose an alternate deflate algorithm called ezflate ("easy flate"), a token-based deflate algorithm.  The key feature of the ezflate algorithm (and its primary difference from the RFC 1951 algorithm) is that it does token-by-token matching.



Here is a simple example to illustrate how it works.  Consider the following pair of HTTP/2 requests (abbreviated):



:method: GET

:path: /

accept-encoding: gzip



:method: GET

:path: /cool.html

accept-encoding: gzip, sdch



The tokens ":method", "GET", ":path", "accept-encoding" and "gzip" are compressed in the second request. The tokens "/cool.html" and "sdch" are not compressed.



Now consider another pair of requests (abbreviated):



:method: GET

:path: /awesome.html

secret: ABCDEFGH



:method: GET

:path: /?guess=ABC

secret: ABCDEFGH



The tokens ":method", "GET", ":path", "secret" and "ABCDEFGH" are compressed in the second request.  More importantly, the attacker's guess "ABC" does not trigger a match, because matching happens only on whole tokens. The compressed size therefore reveals nothing about the secret, and a CRIME-like attack fails.  (Of course, just like hpack, ezflate is still vulnerable to a brute-force attack over all 8-character combinations.)
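The matching behaviour in the two examples above can be sketched in a few lines of Python. This is only a toy model: the delimiter set (": ", ", " and newline) and the names `tokenize` and `matched_tokens` are assumptions made for illustration, and the real tokenization rules are the ones in the drafts. It also reports only which tokens would match; it does not produce a deflate stream.

```python
import re

# Assumed delimiters, for illustration only.
_DELIMS = re.compile(r"(: |, |\n)")


def tokenize(block: str) -> list:
    """Split a header block into tokens, dropping delimiters and empties."""
    return [t for t in _DELIMS.split(block) if t and not _DELIMS.fullmatch(t)]


def matched_tokens(previous: str, current: str) -> list:
    """Pair each token of `current` with True if the whole token
    already appeared in `previous` (and so could be back-referenced)."""
    seen = set(tokenize(previous))
    return [(tok, tok in seen) for tok in tokenize(current)]


if __name__ == "__main__":
    first = ":method: GET\n:path: /awesome.html\nsecret: ABCDEFGH\n"
    second = ":method: GET\n:path: /?guess=ABC\nsecret: ABCDEFGH\n"
    for tok, hit in matched_tokens(first, second):
        print(("match   " if hit else "literal ") + tok)
```

Because "/?guess=ABC" is compared as a whole token, the partial overlap with "ABCDEFGH" never produces a back-reference.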



The ezflate algorithm is not specific to HTTP/2, so we divided our Internet-Drafts into two parts (the ezflate algorithm itself and the HTTP/2 header compression based on ezflate):

http://www.ietf.org/id/draft-morgan-ezflate-00.txt

http://www.ietf.org/id/draft-morgan-http2-header-compression-00.txt



We have implemented ezflate as an additional "compression strategy" called Z_EZFLATE in the zlib library.

The code is very straightforward. We basically only had to implement one new compression function (in addition to the existing deflate_fast & deflate_slow, we added ezflate).  We relied on the rest of the proven zlib infrastructure for windowing, hash chains, Huffman coding, etc.



We used the http2/http_samples [1] HAR files as test vectors for analysing the compression factor of our ezflate implementation. Here are the data:



pt=plain-text; df=deflate; hp=hpack*; ez=ezflate**



Method  avg-bytes/req (% of pt)  avg-bytes/rsp (% of pt)
pt      574 (n/a)                407 (n/a)
df       70 (12%)                 62 (15%)
hp       91 (16%)                 83 (20%)
ez       99 (17%)                 77 (19%)



For the data set, hpack is slightly better on requests and ezflate is slightly better on responses, but the difference is negligible and likely depends on the use cases. (Note that requests to multiple hosts were interleaved as if they were all sent to a single proxy.) So changing to ezflate just based on the compression factor is not compelling...



...BUT, there are other benefits of ezflate beyond the raw compression factor...

+ Interoperability is easy; any inflate library (e.g. zlib) will decompress ezflate streams

+ Built on the proven zlib library (our implementation)

+ Easy to implement a custom compressor i.e. less complexity

+ Custom resource-constrained implementations are possible by trading off compression for resources

+ Furthermore, resource-constrained implementations can use non-compressed header blocks or Huffman-only header blocks

+ Allows interleaving of HEADERS frames with other frame types

+ Streaming

+ Useful for any tokenizable data stream (e.g. html) which contains secret and attacker-controlled data (think BREACH)
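To illustrate the interoperability and streaming points, here is a rough sketch in Python of a connection-long deflate stream that is flushed after every header block and decoded by a stock inflater. Stock zlib has no Z_EZFLATE strategy, so Z_DEFAULT_STRATEGY stands in on the compressor side; the decompressor side is the part that would stay unchanged. The window-bits, mem-level and level values follow the footnote below, and the helper names `make_streams` and `send_block` are made up.

```python
import zlib


def make_streams():
    """One compressor/decompressor pair kept alive for the whole connection."""
    # Positional args: level=6, method, wbits=12 (4096-byte window), memLevel=6.
    # Z_DEFAULT_STRATEGY is a stand-in; Z_EZFLATE is not part of stock zlib.
    comp = zlib.compressobj(6, zlib.DEFLATED, 12, 6, zlib.Z_DEFAULT_STRATEGY)
    decomp = zlib.decompressobj(12)
    return comp, decomp


def send_block(comp, block: bytes) -> bytes:
    """Compress one header block and flush so the receiver can decode
    it immediately, without ending the stream."""
    return comp.compress(block) + comp.flush(zlib.Z_SYNC_FLUSH)


if __name__ == "__main__":
    comp, decomp = make_streams()
    blocks = [
        b":method: GET\n:path: /\naccept-encoding: gzip\n",
        b":method: GET\n:path: /cool.html\naccept-encoding: gzip, sdch\n",
    ]
    for block in blocks:
        wire = send_block(comp, block)
        assert decomp.decompress(wire) == block
        print(len(block), "bytes ->", len(wire), "bytes on the wire")
```

Each flushed chunk decodes on its own, which is what allows other frame types to be sent between header blocks.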



*hpack was configured with a table-size=4096, the hpack results come from nghttp2 deflatehd

**ezflate was configured with window-bits=12 (4096-byte window), mem-level=6 (default), level=6 (default; level has no effect on ezflate)



Regards,

Keith & Chris



[1] https://github.com/http2/http_samples






Received on Monday, 2 June 2014 19:23:42 UTC