W3C home > Mailing lists > Public > ietf-http-wg@w3.org > January to March 2013

Re: bohe implementation for compression tests

From: Roberto Peon <grmocg@gmail.com>
Date: Mon, 14 Jan 2013 21:34:20 -0800
Message-ID: <CAP+FsNfUHf6bUBoUF-qUbuadZzvG6BLiTuJUVtFdL3ERiVDOyg@mail.gmail.com>
To: James M Snell <jasnell@gmail.com>
Cc: HTTP Working Group <ietf-http-wg@w3.org>
I'm trying to express, I guess, that as the number of buckets goes up, the
compression becomes correspondingly less effective, and it has a limited
(eventually constant) security benefit after enough requests.

I believe this ends up being a binomial distribution thus, for k buckets
and k requests I believe that we end with a probability of hitting a bucket
that does not have the sensitive data of only 1/e (increasing security by
roughly 36%) while decreasing compression effectiveness by strictly worse
than 54% (i.e. 1-1/e at the limit).

The attackers are better math geeks than I am. This may explain my wariness
to play with any kind of stream compressor and make assertions about safety
:)
-=R


On Mon, Jan 14, 2013 at 7:07 PM, James M Snell <jasnell@gmail.com> wrote:

> The randomization is not intended to make reverse engineering impossible,
> just a large degree more difficult (and potentially more impractical). Used
> in combination with other strategies -- such as moving away from the use of
> unauthenticated session ids in cookies -- it could prove effective enough
> to get the job done.
>
>
> On Mon, Jan 14, 2013 at 7:01 PM, Roberto Peon <grmocg@gmail.com> wrote:
>
>> An attacker would run its experiment some increased number of times and
>> still get information about what is in the context, assuming i am
>> understanding it correctly. :/
>>
>> -=R
>> On Jan 14, 2013 5:33 PM, "James M Snell" <jasnell@gmail.com> wrote:
>>
>> Just continuing the investigation on various compression strategies. I
>> spent part of the day going through delta to make sure I understand it and
>> how it compares with bohe... I'll have some additional thoughts (and
>> concerns) with regards to that later on... The other half of the day has
>> been spent with various other bohe variations. Late in the after I hit upon
>> a particularly interesting variation... I've checked it in here:
>> https://github.com/jasnell/compression-test/tree/master/compressor/bohe4
>>
>> This variation encodes headers and randomly assigns them to one of two
>> separate buckets. Those are then randomly ordered and compressed using two
>> separate compressor instances within the header block...
>>
>> # +-------------+--------------------------+
>> # | num_headers |   block 1 len (4 bytes)  |
>> # +-------------+--------------------------+
>> # |        compressed header block 1       |
>> # +----------------------------+-----------+
>> # |  block 2 len (4 bytes)     |           |
>>  # +----------------------------+           |
>> # |        compressed header block 2       |
>> # +----------------------------+-----------+
>>
>> Because of the randomization, there is no way of determining in advance
>> which block any individual piece of data will land... making it much harder
>> for an attacker to use the compression ratio to reverse engineer any
>> particular value... every time the information is sent, it can be in a
>> different location. You can take the exact same request and encode it
>> multiple times and end up with a different message size every time (up to a
>> given limit, of course).
>>
>> Some numbers from various test runs... note how bohe4 produces variable
>> compression ratios given identical input.
>>
>> ./compare_compressors.py -c bohe -c bohe4 -c delta -t
>> /Users/james/git/http_samples/mnot/wikipedia.org.har
>> 408 req messages processed
>>              compressed | ratio min   max   std
>> req  bohe        10,784 | 0.13  0.05  0.65  0.07
>> req bohe4        13,496 | 0.16  0.05  0.69  0.08
>>  req delta        16,725 | 0.20  0.04  0.72  0.09
>> req http1        84,388 | 1.00  1.00  1.00  0.00
>>
>> 408 res messages processed
>>              compressed | ratio min   max   std
>> res  bohe        19,882 | 0.25  0.06  0.58  0.10
>> res bohe4        20,610 | 0.26  0.09  0.63  0.09
>> res delta        24,523 | 0.30  0.04  0.60  0.12
>> res http1        80,613 | 1.00  1.00  1.00  0.00
>>
>> ./compare_compressors.py -c bohe -c bohe4 -c delta -t
>> /Users/james/git/http_samples/mnot/wikipedia.org.har
>> 408 req messages processed
>>              compressed | ratio min   max   std
>> req  bohe        10,784 | 0.13  0.05  0.65  0.07
>> req bohe4        13,820 | 0.16  0.07  0.67  0.08
>> req delta        16,725 | 0.20  0.04  0.72  0.09
>> req http1        84,388 | 1.00  1.00  1.00  0.00
>>
>> 408 res messages processed
>>              compressed | ratio min   max   std
>> res  bohe        19,882 | 0.25  0.06  0.58  0.10
>> res bohe4        21,644 | 0.27  0.09  0.61  0.09
>> res delta        24,523 | 0.30  0.04  0.60  0.12
>> res http1        80,613 | 1.00  1.00  1.00  0.00
>>
>> Again, this is just intended as fodder for discussion right now. I'll
>> have some comments specifically on delta encoding tomorrow sometime.
>>
>> - James
>>
>>
>> On Thu, Jan 10, 2013 at 11:08 AM, James M Snell <jasnell@gmail.com>wrote:
>>
>>> I have an initial bohe implementation for the compression tests... it's
>>> very preliminary and uses the same gzip compression as the current spdy3.
>>> I'm going to be playing around with the delta compression mechanism as well
>>> and see how much of an impact that has. Initial results are very promising
>>> but I haven't done much debugging yet. Just wanted folks to know that this
>>> work was underway...
>>>
>>> https://github.com/jasnell/compression-test/tree/master/compressor/bohe
>>>
>>> Some test runs....
>>>
>>> ./compare_compressors.py -c bohe -c spdy3 -c delta
>>> ../http_samples/mnot/amazon.com.har
>>> 732 req messages processed
>>>              compressed | ratio min   max   std
>>> req  bohe        26,122 | 0.13  0.04  0.70  0.08
>>> req delta        33,955 | 0.17  0.02  0.71  0.09
>>> req http1       195,386 | 1.00  1.00  1.00  0.00
>>> req spdy3        27,238 | 0.14  0.04  0.71  0.08
>>>
>>> 732 res messages processed
>>>              compressed | ratio min   max   std
>>> res  bohe        39,628 | 0.25  0.04  0.66  0.07
>>> res delta        44,499 | 0.28  0.02  0.65  0.09
>>> res http1       159,968 | 1.00  1.00  1.00  0.00
>>> res spdy3        41,325 | 0.26  0.04  0.67  0.08
>>>
>>>
>>> ./compare_compressors.py -c bohe -c spdy3 -c delta
>>> ../http_samples/mnot/craigslist.org.har
>>> 66 req messages processed
>>>              compressed | ratio min   max   std
>>> req  bohe         1,948 | 0.15  0.06  0.73  0.11
>>> req delta         2,036 | 0.16  0.07  0.71  0.11
>>> req http1        12,894 | 1.00  1.00  1.00  0.00
>>> req spdy3         2,016 | 0.16  0.07  0.75  0.11
>>>
>>> 66 res messages processed
>>>              compressed | ratio min   max   std
>>> res  bohe         1,786 | 0.18  0.07  0.77  0.13
>>> res delta         2,858 | 0.28  0.08  0.69  0.12
>>> res http1        10,147 | 1.00  1.00  1.00  0.00
>>> res spdy3         1,869 | 0.18  0.09  0.78  0.13
>>>
>>>
>>> ./compare_compressors.py -c bohe -c spdy3 -c delta
>>> ../http_samples/mnot/flickr.com.har
>>> 438 req messages processed
>>>              compressed | ratio min   max   std
>>> req  bohe        11,988 | 0.10  0.02  0.69  0.07
>>> req delta        26,372 | 0.22  0.01  0.71  0.14
>>> req http1       121,854 | 1.00  1.00  1.00  0.00
>>> req spdy3        12,550 | 0.10  0.02  0.71  0.07
>>>
>>> 438 res messages processed
>>>              compressed | ratio min   max   std
>>> res  bohe        13,073 | 0.09  0.05  0.66  0.06
>>> res delta        25,236 | 0.18  0.02  0.70  0.11
>>> res http1       140,457 | 1.00  1.00  1.00  0.00
>>> res spdy3        14,142 | 0.10  0.05  0.66  0.06
>>>
>>>
>>> ./compare_compressors.py -c bohe -c spdy3 -c delta
>>> ../http_samples/mnot/facebook.com.har
>>> 234 req messages processed
>>>              compressed | ratio min   max   std
>>> req  bohe         6,091 | 0.15  0.06  0.78  0.07
>>> req delta         7,800 | 0.19  0.02  0.70  0.07
>>> req http1        41,980 | 1.00  1.00  1.00  0.00
>>> req spdy3         6,301 | 0.15  0.06  0.77  0.07
>>>
>>> 234 res messages processed
>>>              compressed | ratio min   max   std
>>> res  bohe         9,458 | 0.23  0.07  0.68  0.07
>>> res delta        12,045 | 0.30  0.13  0.60  0.08
>>> res http1        40,252 | 1.00  1.00  1.00  0.00
>>> res spdy3         9,788 | 0.24  0.07  0.69  0.07
>>>
>>>
>>>
>>>
>>>
>>
>
Received on Tuesday, 15 January 2013 05:35:16 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50 : Tuesday, 15 January 2013 05:35:20 GMT