- From: Roberto Peon <grmocg@gmail.com>
- Date: Mon, 1 Dec 2014 18:21:32 -0800
- To: Frédéric Kayser <f.kayser@free.fr>
- Cc: HTTP Working Group <ietf-http-wg@w3.org>
- Message-ID: <CAP+FsNemovm8Huy8-0x+5Zim2S3pQ7ihk0kb2C8u+=CAF+Bofg@mail.gmail.com>
The best way to influence is to show the difference :) I remember bemoaning that little-endian won, which requires such things for speed in bitwise operations.

-=R

On Mon, Dec 1, 2014 at 6:08 PM, Frédéric Kayser <f.kayser@free.fr> wrote:

> Hi,
> I did not follow the list for nearly a year, but this aspect has apparently
> not been discussed.
> There's something that puzzles me when I read the HPACK spec: it seems to
> record the Huffman-compressed data in a natural and naive way, which is not
> what state-of-the-art compressors do.
> The problem here is not space saving but rather speed, especially decoding
> speed. It may not be apparent when using high-level languages; this is
> rather a fine tuning for C and other bare-metal (assembly-like)
> implementations.
>
> Bits-at-top is a different approach: Deflate has been doing this for more
> than two decades, and Charles Bloom recently stated that it is still the
> fastest way to do bit-level I/O (Charles Bloom is well respected in the
> data compression community).
>
> http://cbloomrants.blogspot.fr/2014/02/02-18-14-ansfast-implementation-notes.html
>
> If I take the sample given in HPACK 10:
>
> http://tools.ietf.org/html/draft-ietf-httpbis-header-compression-10#appendix-C.4
>
> "example.com" is Huffman encoded this way: f1e3 c2e5 f23a 6ba0 ab90 f4ff
>
> F1E3 C2E5 is 11110001 11100011 11000010 11100101 in binary and comes from:
>
> 'w' (119) 1111000
> 'w' (119) 1111000
> 'w' (119) 1111000
> '.'  (46) 010111
> 'e' (101) 00101
>
> Notice that bits have been put into the bytes as they come:
> 1111000 <- 1111000 <- 1111000 <- 010111 <- 00101
> with a dot used to show the code boundaries:
> 1111000.1 111000.11 11000.010 111.00101
>
> Bits-at-top works the other way around:
>
> xxxxxxxx xxxxxxxx xxxxxxxx x1111000
> xxxxxxxx xxxxxxxx xx111100 01111000
> -> emit 01111000
> xxxxxxxx xxxxxxxx xxxxxxxx xx111100
> xxxxxxxx xxxxxxxx xxx11110 00111100
> -> emit 00111100
> xxxxxxxx xxxxxxxx xxxxxxxx xxx11110
> xxxxxxxx xxxxxxxx xxxxx010 11111110
> -> emit 11111110
> xxxxxxxx xxxxxxxx xxxxxxxx xxxxx010
> xxxxxxxx xxxxxxxx xxxxxxxx 00101010
> -> emit 00101010
>
> which gives 01111000 00111100 11111110 00101010 — impossible to decode…
> silly me, I forgot to reverse the bit order at the code level, which gives:
>
> 'w' (119) 1111000 -> 0001111
> 'w' (119) 1111000 -> 0001111
> 'w' (119) 1111000 -> 0001111
> '.'  (46) 010111 -> 111010
> 'e' (101) 00101 -> 10100
>
> xxxxxxxx xxxxxxxx xxxxxxxx x0001111
> xxxxxxxx xxxxxxxx xx000111 10001111
> -> emit 10001111
> xxxxxxxx xxxxxxxx xxxxxxxx xx000111
> xxxxxxxx xxxxxxxx xxx00011 11000111
> -> emit 11000111
> xxxxxxxx xxxxxxxx xxxxxxxx xxx00011
> xxxxxxxx xxxxxxxx xxxxx111 01000011
> -> emit 01000011
> xxxxxxxx xxxxxxxx xxxxxxxx xxxxx111
> xxxxxxxx xxxxxxxx xxxxxxxx 10100111
> -> emit 10100111
>
> which gives 10001111 11000111 01000011 10100111, or 8FC7 43A7.
>
> Bits-at-top looks a bit more complicated but is much closer to what CPUs
> are able to handle quickly. Deflate and now Brotli compression work this
> way, and as I wrote before, it is still the way recommended by Charles
> Bloom for super-fast implementations of bit-level I/O.
>
> I hope it's not too late to stop the presses and evaluate the real impact
> this could have on HPACK.
>
> Regards
> Frédéric Kayser
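The two packing orders walked through in the message above can be sketched in a few lines. This is a minimal Python illustration (not part of the original thread; the function names `pack_msb_first`, `reverse_bits`, and `pack_lsb_first` are mine): the first function appends code bits MSB-first into bytes, as HPACK's Huffman coding does, while the second bit-reverses each code and packs LSB-first, as Deflate does.

```python
def pack_msb_first(codes):
    """Pack (value, bit_length) codes MSB-first, as HPACK's Huffman coding does."""
    acc, nbits, out = 0, 0, bytearray()
    for value, length in codes:
        acc = (acc << length) | value          # append new bits at the bottom
        nbits += length
        while nbits >= 8:                      # full bytes leave from the top
            nbits -= 8
            out.append((acc >> nbits) & 0xFF)
    if nbits:                                  # pad the last partial byte with 1s
        out.append(((acc << (8 - nbits)) | ((1 << (8 - nbits)) - 1)) & 0xFF)
    return bytes(out)

def reverse_bits(value, length):
    """Reverse the bit order of a length-bit code."""
    r = 0
    for _ in range(length):
        r = (r << 1) | (value & 1)
        value >>= 1
    return r

def pack_lsb_first(codes):
    """Bit-reverse each code and pack LSB-first ("bits-at-top"), as Deflate does."""
    acc, nbits, out = 0, 0, bytearray()
    for value, length in codes:
        acc |= reverse_bits(value, length) << nbits  # new bits go in above the old
        nbits += length
        while nbits >= 8:                            # full bytes leave from the bottom
            out.append(acc & 0xFF)
            acc >>= 8
            nbits -= 8
    if nbits:
        out.append(acc & 0xFF)
    return bytes(out)

# The "www.e" example from the message:
codes = [(0b1111000, 7)] * 3 + [(0b010111, 6), (0b00101, 5)]
print(pack_msb_first(codes).hex())   # f1e3c2e5
print(pack_lsb_first(codes).hex())   # 8fc743a7
```

The speed argument favors the second form: an LSB-first decoder only ever shifts and masks the low bits of its accumulator (`acc & ((1 << n) - 1)`), with no per-code realignment against the top of a register, which maps directly onto little-endian CPU operations.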
Received on Tuesday, 2 December 2014 02:21:59 UTC