Re: h2 header field names

On Fri, Sep 05, 2014 at 07:52:56AM +0000, Poul-Henning Kamp wrote:
> --------
> In message <op.xloh5ewriw9drz@beryllium.bredbandsbolaget.se>, "Martin Nilsson" 
> writes:
> >On Thu, 04 Sep 2014 21:34:50 +0200, Poul-Henning Kamp <phk@phk.freebsd.dk>  
> >wrote:
> >
> >>
> >> So I think we should make it a goal, duly prioritized, that HPACK
> >> work well with base64 (it already does, but mostly by accident)
> >> and advertise this decision.
> >>
> >
> 
> Right now the Huffman codes for the base64 set have an average length
> of 6.92 bits (min=4, max=10).
> 
> This means that if we posit a hypothetical encoding of 2 bytes overhead
> and 8-bit binary encoding, it would beat our current huffman table
> once you have more than 104 bits in your binary blob.
> 
> Improving the average to 6.75 would move the cut-over to 128 bits
> 
> Improving the average to 6.38 would move the cut-over to 256 bits
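
For anyone wanting to reproduce the cut-over figures quoted above, here is a
minimal sketch. It assumes each base64 character carries 6 payload bits and
that the hypothetical binary encoding costs exactly 16 bits of overhead plus
the raw payload; the function name is made up for illustration.

```python
def cutover_bits(avg_code_len, overhead_bits=16, payload_bits_per_char=6):
    """Payload size (in bits) at which raw binary plus a fixed overhead
    costs the same as Huffman-coded base64.

    Huffman-coded base64 costs n * avg_code_len / 6 bits for n payload bits;
    the hypothetical binary encoding costs n + overhead_bits.
    """
    excess_per_payload_bit = avg_code_len / payload_bits_per_char - 1.0
    return overhead_bits / excess_per_payload_bit


for avg in (6.92, 6.75, 6.38):
    print(avg, round(cutover_bits(avg), 1))
```

With the rounded 6.38 this lands a little under 256 bits; an average of 6.375
would give 256 exactly, so the quoted figure is presumably rounded.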

One can do a bit better by optimizing the character set.


Over token characters, the optimal 64-character set is:
0123456789ABCDEFGHIJKLMNOPQRSTUVWYabcdefghijklmnopqrstuvwxyz-_.%

~6.33 bits/char => ~1 byte of expansion per 146.3 bits of payload
(~6.36 and ~133.6 if % -> Z).
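
The conversion from average code length to expansion rate can be sketched as
follows, assuming 6 payload bits per character. The function name is
hypothetical, and the rounded averages only approximate the figures above,
which presumably come from unrounded values.

```python
def payload_bits_per_expansion_byte(avg_code_len, payload_bits_per_char=6):
    """How many payload bits can be sent before the Huffman-coded form
    has grown by one byte (8 bits) relative to the raw payload."""
    excess_bits_per_char = avg_code_len - payload_bits_per_char
    # One character carries payload_bits_per_char payload bits and wastes
    # excess_bits_per_char bits; scale up until 8 bits have been wasted.
    return 8 * payload_bits_per_char / excess_bits_per_char


for avg in (6.33, 6.36):
    print(avg, round(payload_bits_per_expansion_byte(avg), 1))
```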


The theoretical limit for expansion is ~1B/457.9b.



Over quoted characters, the optimal set is:
0123456789ABCDEFGHIJKLMNOPQRSTUVWYabcdefghijklmnopqrstuvw /=-_.%

~6.28 bits/char => ~1 byte of expansion per 170.7 bits of payload
(~6.30 and ~161.7 if % -> z).

But because of the fixed +5 bytes compared with the token encoding, this
needs a large amount of data to break even.
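
As a rough illustration of how much data that is, here is a sketch assuming
the 5 bytes are a fixed cost on top of the token set, and using the rounded
averages 6.33 and 6.28 (so the result is only indicative). The function name
is made up.

```python
def break_even_bits(avg_a, avg_b, fixed_extra_bits, payload_bits_per_char=6):
    """Payload bits at which encoding B (shorter codes, but a fixed extra
    cost) catches up with encoding A."""
    saving_per_payload_bit = (avg_a - avg_b) / payload_bits_per_char
    return fixed_extra_bits / saving_per_payload_bit


# Token set (~6.33 bits/char) vs quoted set (~6.28 bits/char, +5 bytes).
bits = break_even_bits(6.33, 6.28, 5 * 8)
print(round(bits), "bits, about", round(bits / 8), "bytes")
```

Under those assumptions the break-even is on the order of several thousand
payload bits, i.e. hundreds of bytes.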


The theoretical limit for expansion is ~1B/29530b.



-Ilari

Received on Friday, 5 September 2014 18:46:29 UTC