Re: Choosing a header compression algorithm from Roberto Peon on 2013-03-28 (ietf-http-wg@w3.org from January to March 2013)

From: Roberto Peon <grmocg@gmail.com>
Date: Thu, 28 Mar 2013 15:16:53 -0400
To: RUELLAN Herve <Herve.Ruellan@crf.canon.fr>
Cc: "agl@google.com" <agl@google.com>, Mark Nottingham <mnot@mnot.net>, "ietf-http-wg@w3.org Group" <ietf-http-wg@w3.org>
Message-ID: <CAP+FsNee45EUbzRyF+58gkytX1WvD_cPJu_GW7N7YKE5NhszVw@mail.gmail.com>
The net effect of the character-delimited prefix match you propose is that
an attacker must only brute force each subsequence which matches [^/&=,
]*([^/&=, ]|$) ,

Running the numbers, that is:

k^s * m

where:

k = possible characters
s = subsequence length or characters not ending in [^/&= ,] or end-of-line
m = number of subsequences



instead of full atom matching, which is (is a good special case of above
where m is always 1 and s is always n):

k^n

where:

k = possible characters
n = length of field



In other words, full atom matching is *exponentially* more difficult (a
desirable trait)!

In the example above, assuming 27 commonly used characters per entry, I'm
very likely to completely discover that data in:

http://www.example.com/path/first/myfile

in:

27^4 + 27^5 + 27^6 tries


Note that it is likely that an attacker would be able to do substantially
better than above if they know the letter frequency (very likely) or have
similar data:
In the case of a whole atom match, I'd discover the data in:

27^(17) tries


So, full atom matching is ~ 5 quadrillion (5,353,441,417,462,320) times
more difficult... that is a lot.


When I originally considered prefix matching in the past, this was the math
that I did, and I decided that I was not confident that compressor
implementors would ensure that their encoder prevents prefix matching when
the subsequence lengths are too small (and thus easily attacked). The same
consideration applies to dynamic huffman coding. It *CAN* be safe, but it
requires complexity and computation, and I think there is significant risk
that implementors may knowing or unknowingly sacrifice security. It is
actually more safe to ignore delimiters and allow for a match of a prefix
so long as it is greater than a certain length. At least in that case there
is guarantee of the number of times that it must be brute forced...

Full-atom matching is simply much more difficult to get wrong from a
security perspective, and it results in pretty decent compression...

-=R



On Thu, Mar 28, 2013 at 1:00 PM, RUELLAN Herve
<Herve.Ruellan@crf.canon.fr>wrote:

> > -----Original Message-----
> > From: Roberto Peon [mailto:grmocg@gmail.com]
> > Sent: jeudi 28 mars 2013 01:15
> > To: RUELLAN Herve
> > Cc: agl@google.com; Mark Nottingham; ietf-http-wg@w3.org Group
> > Subject: Re: Choosing a header compression algorithm
> >
> > I've checked in some changes to delta2 which expands and documents
> > various options for delta2 in the README.md.
> >
> > After running a number of variations of delta2, The following defaults
> look
> > good for small buffer sizes:
> >
> >
> > delta2=max_entries=256, small_index=1
> >
> > small_index basically says use a uint8 instead of a uint16 for
> representing
> > indices, and is the kind of thing that could be messaged somewhere
> (opcode,
> > flag, whatever).
> >
> > The best headerdiff option which I believe is safe against CRIME in the
> future
> > is:
> >   headerdiff=delta_type=false,Huffman
>
> I think that for headerdiff, the best option which is safe against CRIME
> is:
> headerdiff=delta_type='/&= \coma',Huffman
>
> >
> > I removed prefix matching from delta some months ago (~6 I think?) after
> > cogitating on it for a while and then speaking with security folks.. I
> just
> > couldn't come up with a way I could prove was safe, unlike the atom-
> > matching, which one can prove is no worse than a brute-force attack.
>
> The limited prefix matching defined above also need a brute-force attack
> to be broken.
>
> Hervé.
>
> >
> > I've appended runs with these values@4k buffer size for delta2 and
> > headerdiff below.
> > -=R
> >
> >
> >
> >
> > * TOTAL: 5949 req messages
> >
>                                               size  time | ratio
> > min   max   std
> >
>                                http1     3,460,925  0.13 |
> > 1.00  1.00  1.00  0.00
> >   delta2 (max_byte_size=4096, max_entries=256, small_index=1,
> > hg_adjust=0, implicit_hg_add=0, refcnt_vals=0)       664,683  4.16 |
> 0.19  0.02
> > 0.83  0.15
> >                                                          headerdiff
> (buffer=4096, delta_type=false,
> > huffman)       759,783  2.03 | 0.22  0.01  0.78  0.18
> >
> >
> > * TOTAL: 5948 res messages
> >
>                                               size  time | ratio
> > min   max   std
> >
>                                http1     2,186,162  0.12 |
> > 1.00  1.00  1.00  0.00
> >   delta2 (max_byte_size=4096, max_entries=256, small_index=1,
> > hg_adjust=0, implicit_hg_add=0, refcnt_vals=0)       585,475  5.32 |
> 0.27  0.02
> > 1.28  0.13
> >                                                          headerdiff
> (buffer=4096, delta_type=false,
> > huffman)       543,047  3.29 | 0.25  0.02  0.73  0.14
> >
> >
> >
> >
>
>
Received on Thursday, 28 March 2013 19:17:21 UTC