Re: Security concern about open range integers (was: Question about: 4.1.1 Integer representation) from James M Snell on 2013-10-21 (ietf-http-wg@w3.org from October to December 2013)

From: James M Snell <jasnell@gmail.com>
Date: Sun, 20 Oct 2013 17:58:41 -0700
To: Roberto Peon <grmocg@gmail.com>
Cc: Frédéric Kayser <f.kayser@free.fr>, HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <CABP7Rbfq1aNBHgf119SUAwQ07p3d52vi9T=ryArM3S9f63oiSg@mail.gmail.com>

While this is true, it would likely be worthwhile to include a short
word of caution in the spec. For example, any >64bit value is not
likely to be very sane.

On Sun, Oct 20, 2013 at 5:23 PM, Roberto Peon <grmocg@gmail.com> wrote:
> If any value is too large, the connection should be torn down.
> The definition of 'too large' depends utterly on details that we cannot
> predict.
> For many SoC or embedded implementations, this may be 4k.
> For better provisioned clients, this could be gigabytes or terabytes.
>
> This problem already exists in SPDY, and is dealt with in the venerable
> tradition of deciding if the request is sane, and if not, dealing with it :)
>
> As for speed-- for the data frames we don't use this kind of encoding-- the
> overhead of a fixed encoding is small there.
> For a compressor, this isn't as true-- the overhead of a fixed encoding is
> much higher.
> In any case, the vast majority of the time, there will be zero speed
> difference since the first byte is all that is necessary to encode the
> index/length.
> -=R
>
>
> On Sun, Oct 20, 2013 at 5:18 PM, Frédéric Kayser <f.kayser@free.fr> wrote:
>>
>> Hello,
>> let me express it a bit differently "the well known and well supported
>> format" is an universal integer encoding (closely related to a 7, 7, ∞
>> start-step-stop code) this means it can encode any integer of unknown size
>> including ridiculously large ones.
>> Know how is a server supposed to react if it receives a malicious header
>> that states that the next field is FF FF FF FF FF FF FF FF 7F bytes long
>> (thats 2^56 + 255 if I'm not wrong)? Naive implementations will probably
>> have a shock trying to decode this in a fixed size 32-bits int (add a couple
>> of FF and it overruns a 64-bits int) or allocating memory for the field in
>> question, others may discard it, but at which point take the cut between
>> legitimate large values and malicious ones? My concern is that every 8+ or
>> whatever+ is a potential memory bomb if its not capped at some point.
>>
>> My initial concern was rather a speed concern (read micro-optimization to
>> speed-up decoding) concerning "the well known and well supported format"
>> since a capped version of such a code could be processed faster, encoding is
>> always relatively fast decoding is a bit more tricky since the bits recorded
>> first are groups of 7 least significant bits whereas a real start-step-stop
>> code would keep the natural bit order and would not require bit shifts (just
>> byte swaps eventually).
>>
>> Regards
>> --
>> Frédéric Kayser
>>
>> Roberto Peon wrote:
>>
>> The rest of the int is in a well known and well supported format.
>>
>> -=R
>>
>> On Oct 18, 2013 1:03 AM, "Frédéric Kayser" <f.kayser@free.fr> wrote:
>>>
>>> And wouldn't it be way easier and faster to record/reconstruct such
>>> integers by moving all the flags (most significant bits) of each byte in
>>> front of the code?
>>>
>>> I < 128
>>> 0xxxxxxx
>>> I < 16.384
>>> 10xxxxxx xxxxxxxx  (compared to 1yyyyyyy 0yyyyyyy groups of 7 least
>>> significant bits first)
>>> I < 2.097.152
>>> 110xxxxx xxxxxxxx xxxxxxxx  (compared to 1yyyyyyy 1yyyyyyy 0yyyyyyy
>>> groups of 7 least significant bits first)
>>> I < 268.435.456
>>> 1110xxxx xxxxxxxx xxxxxxxx xxxxxxxx  (compared to 1yyyyyyy 1yyyyyyy
>>> 1yyyyyyy 0yyyyyyy groups of 7 least significant bits first)
>>> …
>>>
>>> Where xxx…xxx is simply the binary representation of I
>>>
>>> The first part looks like unary coding and the overall form of this
>>> encoding is a 7, 7, ∞ start-step-code or a base 128 Elias γ′ code, do you
>>> really want to keep the universal nature of such a code or cap it?
>>>
>>> Regards
>>> --
>>> Frédéric Kayser
>>
>>
>

Received on Monday, 21 October 2013 00:59:28 UTC