- From: David Krauss <potswa@gmail.com>
- Date: Sun, 13 Apr 2014 13:49:03 +0800
- To: HTTP Working Group <ietf-http-wg@w3.org>
On 2014–04–13, at 3:01 AM, Martin Thomson <martin.thomson@gmail.com> wrote: > This shows that HPACK is pretty good at encoding cookies, but there > are small gains to be had for URI components. If character > restrictions are wildly different to these common values, this could > be even better. > > This hasn't been completely optimized, so there might be a few bits > left to save. Please see my earlier message on this topic, "Improving on Huffman” sent 3/17/14. In all the use cases, we tend to see sequences of similar characters: numeric, uppercase, lowercase. This corresponds to two bits of reliable prediction. State was eliminated from the compression for security reasons, but as I argue in the previous message, I think that giving drastically different compression characteristics to uppercase, lowercase, and binary opens more security problems than carrying over two bits from the previous character does. > I'm using Huffman encoding as the basis of this - in > reverse, of course - which isn't quite as good as arithmetic or range > coding. I could probably manage the trailing bits more elegantly. > Pull requests happily accepted. Arithmetic and range codings save only up to one bit per symbol, as I understand. The current symbols tend to be larger because the coding isn’t very efficient, but a better coding with smaller symbols would see a bigger incremental gain.
Received on Sunday, 13 April 2014 05:49:36 UTC