Re: Security concern about open range integers (was: Question about: 4.1.1 Integer representation)

Hello,
Let me express it a bit differently: "the well known and well supported format" is a universal integer encoding (closely related to a 7, 7, ∞ start-step-stop code). This means it can encode an integer of any size, including ridiculously large ones.
Now, how is a server supposed to react if it receives a malicious header stating that the next field is FF FF FF FF FF FF FF FF 7F bytes long (that's 2^56 + 254 with an 8-bit prefix, if I'm not wrong)? Naive implementations will probably choke trying to decode this into a fixed-size 32-bit int (add a couple of FF bytes and it overruns a 64-bit int), or trying to allocate memory for the field in question. Others may discard it, but where do you draw the line between legitimate large values and malicious ones? My concern is that every 8+ (or whatever+) integer is a potential memory bomb if it is not capped at some point.
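
To make the concern concrete, here is a minimal sketch of a decoder that refuses to go past a cap. It assumes the layout discussed in this thread (an N-bit prefix followed by groups of 7 least significant bits first); the names decode_prefixed_int, IntegerTooLarge and max_value are made up for illustration, and the 2^32 - 1 default is only an example limit, not something the draft specifies.

```python
class IntegerTooLarge(ValueError):
    """Raised when a decoded integer would exceed the configured cap."""


def decode_prefixed_int(data, prefix_bits=8, max_value=2**32 - 1):
    """Decode an N-bit prefix followed by LSB-first 7-bit groups.

    Returns (value, bytes_consumed). max_value is a hypothetical
    implementation-chosen cap, not a limit taken from the draft.
    """
    if not data:
        raise ValueError("empty input")
    prefix_max = (1 << prefix_bits) - 1
    value = data[0] & prefix_max
    if value < prefix_max:
        return value, 1  # the value fits entirely in the prefix

    # Upper bound on continuation bytes a legitimate value can need;
    # this also stops endless runs of 0x80 (zero-valued groups).
    max_groups = (max_value.bit_length() + 6) // 7
    shift = 0
    for pos in range(1, len(data)):
        if pos > max_groups:
            raise IntegerTooLarge("too many continuation bytes")
        byte = data[pos]
        value += (byte & 0x7F) << shift
        if value > max_value:
            raise IntegerTooLarge("value exceeds cap")
        if byte & 0x80 == 0:  # high bit clear: last group
            return value, pos + 1
        shift += 7
    raise ValueError("truncated integer")


# The malicious length from above, decoded with a deliberately generous cap:
bomb = bytes.fromhex("FFFFFFFFFFFFFFFF7F")
print(decode_prefixed_int(bomb, max_value=2**64))  # (72057594037928190, 9)
```

With the default cap the same input is rejected after five continuation bytes, before any memory would be allocated for the field.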

My initial concern was rather about speed (read: a micro-optimization to speed up decoding) of "the well known and well supported format", since a capped version of such a code could be processed faster. Encoding is always relatively fast; decoding is a bit trickier, since the bits recorded first are groups of 7 least significant bits, whereas a real start-step-stop code would keep the natural bit order and would not require bit shifts (just byte swaps, possibly).
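
As a rough illustration of the flags-in-front layout from my earlier message (quoted below), here is a sketch capped at 4 bytes, i.e. 28 payload bits. The function names and the max_bytes parameter are hypothetical, and the sketch assumes max_bytes is at most 8 so that the length marker fits in the first byte.

```python
def encode_flags_first(value, max_bytes=4):
    """Encode value as (n-1) one bits, a zero bit, then 7*n payload bits
    in natural (big-endian) order. Hypothetical capped variant."""
    if value < 0:
        raise ValueError("negative values are not representable")
    for n in range(1, max_bytes + 1):
        payload_bits = 7 * n
        if value < (1 << payload_bits):
            # n-1 leading ones; the zero terminator sits just above the payload
            marker = ((1 << (n - 1)) - 1) << (payload_bits + 1)
            return (marker | value).to_bytes(n, "big")
    raise ValueError("value exceeds the cap")


def decode_flags_first(data, max_bytes=4):
    """Return (value, bytes_consumed); reject markers longer than the cap."""
    if not data:
        raise ValueError("empty input")
    n, mask = 1, 0x80
    while data[0] & mask:  # count leading one bits of the first byte
        n += 1
        mask >>= 1
        if n > max_bytes:
            raise ValueError("length marker exceeds the cap")
    if len(data) < n:
        raise ValueError("truncated integer")
    raw = int.from_bytes(data[:n], "big")
    return raw & ((1 << (7 * n)) - 1), n  # mask off the marker bits


print(encode_flags_first(300).hex())  # 812c
```

The total length is known after reading the first byte, so the decoder can reject an oversized value immediately and read the payload in one step, with no per-byte shifting.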

Regards
-- 
Frédéric Kayser

Roberto Peon wrote:

> The rest of the int is in a well known and well supported format.
> 
> -=R
> 
> On Oct 18, 2013 1:03 AM, "Frédéric Kayser" <f.kayser@free.fr> wrote:
> And wouldn't it be way easier and faster to record/reconstruct such integers by moving all the flags (most significant bits) of each byte in front of the code?
> 
> I < 128
> 0xxxxxxx
> I < 16.384
> 10xxxxxx xxxxxxxx  (compared to 1yyyyyyy 0yyyyyyy groups of 7 least significant bits first)
> I < 2.097.152
> 110xxxxx xxxxxxxx xxxxxxxx  (compared to 1yyyyyyy 1yyyyyyy 0yyyyyyy groups of 7 least significant bits first)
> I < 268.435.456
> 1110xxxx xxxxxxxx xxxxxxxx xxxxxxxx  (compared to 1yyyyyyy 1yyyyyyy 1yyyyyyy 0yyyyyyy groups of 7 least significant bits first)
> …
> 
> Where xxx…xxx is simply the binary representation of I
> 
> The first part looks like unary coding, and the overall form of this encoding is a 7, 7, ∞ start-step-stop code or a base 128 Elias γ′ code. Do you really want to keep the universal nature of such a code, or cap it?
> 
> Regards
> -- 
> Frédéric Kayser

Received on Monday, 21 October 2013 00:19:00 UTC