Re: Choosing a header compression algorithm

+1 ... I have explored this a bit on this end also (and discussed it
with some security folks) and definitely align with Roberto's point of
view. At this point, any prefix matching proposal needs to be backed
with clear evidence as to it's safety.

On Fri, Mar 29, 2013 at 11:39 AM, Roberto Peon <grmocg@gmail.com> wrote:
> Herve--
>
> With the way you've implemented prefix matching, there are no guarantees on
> security whatsoever.
> In order to provide a guarantee of at least a minimum amount of effort on
> the part of the attacker, you MUST provide guarantees on the minimum
> subsequence sizes, as this relates directly to the amount of tries...
> As an example,
>   foo.com/a/b/c
> will be exceptionally easy to figure out, as each subsequence will be
> guessed in a very small amount of time.
>
> There are many caveats and gotchas in this space, and I feel uncomfortable
> that you're making assertions about safety unless you've got some security
> folks looking at it critically.
> I've done that for the delta compressor, and I suggest you do the same. If
> it turns out that we do get consensus from security folks that some kind of
> prefix matching is safe, I'm happy to add it back into delta. Until then, I
> don't want to go down the rabbit hole!
> -=R
>
>
> On Fri, Mar 29, 2013 at 8:26 AM, RUELLAN Herve <Herve.Ruellan@crf.canon.fr>
> wrote:
>>
>>
>>
>> > -----Original Message-----
>> > From: Roberto Peon [mailto:grmocg@gmail.com]
>> > Sent: jeudi 28 mars 2013 20:17
>> > To: RUELLAN Herve
>> > Cc: agl@google.com; Mark Nottingham; ietf-http-wg@w3.org Group
>> > Subject: Re: Choosing a header compression algorithm
>> >
>> > The net effect of the character-delimited prefix match you propose is
>> > that an
>> > attacker must only brute force each subsequence which matches [^/&=,
>> > ]*([^/&=, ]|$) ,
>> >
>> > Running the numbers, that is:
>> >
>> >       k^s * m
>> >
>> > where:
>> >
>> >       k = possible characters
>> >       s = subsequence length or characters not ending in [^/&= ,] or
>> > end-
>> > of-line
>> >       m = number of subsequences
>> >
>> >
>> >
>> > instead of full atom matching, which is (is a good special case of above
>> > where
>> > m is always 1 and s is always n):
>> >
>> >       k^n
>> >
>> >
>> > where:
>> >
>> >
>> >       k = possible characters
>> >       n = length of field
>> >
>> >
>> >
>> > In other words, full atom matching is *exponentially* more difficult (a
>> > desirable trait)!
>> >
>> > In the example above, assuming 27 commonly used characters per entry,
>> > I'm
>> > very likely to completely discover that data in:
>> >
>> >       http://www.example.com/path/first/myfile
>> >
>> > in:
>> >
>> >       27^4 + 27^5 + 27^6 tries
>>
>> This is roughly 400 million tries. And this figure is obtained using
>> several assumptions that may not hold:
>> - The number of possible characters is only 27.
>> - The length of each chunk is known.
>>
>> > Note that it is likely that an attacker would be able to do
>> > substantially better
>> > than above if they know the letter frequency (very likely) or have
>> > similar
>> > data:
>> > In the case of a whole atom match, I'd discover the data in:
>> >
>> >
>> >       27^(17) tries
>> >
>> >
>> > So, full atom matching is ~ 5 quadrillion (5,353,441,417,462,320) times
>> > more
>> > difficult... that is a lot.
>>
>> True full atom matching is much harder, but limited prefix matching is
>> already very hard.
>>
>> We must also keep in mind that each try correspond to a request that the
>> attacker for the client to do. Such a huge number of requests should be
>> detected somewhere, just in order to prevent DoS attacks.
>>
>> > When I originally considered prefix matching in the past, this was the
>> > math
>> > that I did, and I decided that I was not confident that compressor
>> > implementors would ensure that their encoder prevents prefix matching
>> > when the subsequence lengths are too small (and thus easily attacked).
>> > The
>> > same consideration applies to dynamic huffman coding. It *CAN* be safe,
>> > but it requires complexity and computation, and I think there is
>> > significant
>> > risk that implementors may knowing or unknowingly sacrifice security. It
>> > is
>> > actually more safe to ignore delimiters and allow for a match of a
>> > prefix so
>> > long as it is greater than a certain length. At least in that case there
>> > is
>> > guarantee of the number of times that it must be brute forced...
>> >
>> > Full-atom matching is simply much more difficult to get wrong from a
>> > security
>> > perspective, and it results in pretty decent compression...
>>
>> I think that prefix matching with constrained ending is fairly secure: it
>> compels an attacker to used brute force to break it. In addition, it
>> provides very interesting results regarding compression.
>
>
>>
>>
>> Hervé.
>>
>> > -=R
>>
>

Received on Friday, 29 March 2013 21:10:47 UTC