W3C home > Mailing lists > Public > ietf-http-wg@w3.org > October to December 2019

Re: New I-D: Security Considerations Regarding Compression Dictionaries

From: Soni L. <fakedme+http@gmail.com>
Date: Wed, 30 Oct 2019 06:43:20 -0300
To: ietf-http-wg@w3.org
Message-ID: <c5e37168-b958-8b13-ab97-f9a7f5352b24@gmail.com>

On 2019-10-29 11:17 p.m., W. Felix Handte wrote:
> On 10/29/19 7:54 PM, Watson Ladd wrote:
> > I'm not sure I appreciate the distinction of "dictionary-based"
> > compression vs. other compression algorithms you draw in the draft.
> > The BREACH attack didn't look at changes to the Huffman table, which
> > was dominated by good old ETOAIN SHRDLU. Instead it changed the length
> > of matches back into the datastream, and thus the length of the
> > observed output. There isn't a separate dictionary to match substrings
> > in in DEFLATE.
> Yeah, "dictionary" is an extremely overloaded term in compression, to
> say nothing of computer science generally. This has produced a great
> deal of confusion. But I haven't come up with a better term for the concept.
> What I mean by "dictionary" in this context is mostly a user-supplied
> buffer that the compressor makes LZ77-style matches into. (Though, as
> the document notes, various algorithms expect various kinds of data in
> the dictionaries they accept.) This makes it very much a potential
> vector for exactly that kind of attack. And DEFLATE, at least as
> implemented by zlib, does support dictionaries of this form [0].
> > A perfect compression algorithm reveals the Kolmogorov complexity of
> > the input. This is enough (if you can compute Kolmogorov complexity)
> > to reveal the differences between "hunter2 h" and "hunter2 z", and
> > then "hunter2 hu" and "hunter2 ha", etc.
> Right, compression as it exists today has real outstanding security
> issues. My goal with the document is to assess whether the use of
> dictionaries introduces additional problems on top of the existing ones.
> [0] https://github.com/madler/zlib/blob/master/zlib.h#L611
So, what you're saying, is that this wouldn't be an issue if we were 
using public-key-based authentication and session tokens?

Like this? https://soniex2.autistic.space/posts/2019/06/uweb.xhtml (or, 
perhaps, this? https://awoo.space/@SoniEx2/102972533369915352 )

(This also touches on another post I've sent this list, but it's not 
relevant here.)
Received on Wednesday, 30 October 2019 09:43:27 UTC

This archive was generated by hypermail 2.4.0 : Friday, 17 January 2020 17:15:43 UTC