Re: h2 header field names

From: Poul-Henning Kamp <phk@phk.freebsd.dk>
Date: Fri, 05 Sep 2014 07:52:56 +0000
To: "Martin Nilsson" <nilsson@opera.com>
cc: ietf-http-wg@w3.org
Message-ID: <25075.1409903576@critter.freebsd.dk>
--------
In message <op.xloh5ewriw9drz@beryllium.bredbandsbolaget.se>, "Martin Nilsson" 
writes:
>On Thu, 04 Sep 2014 21:34:50 +0200, Poul-Henning Kamp <phk@phk.freebsd.dk>  
>wrote:
>
>>
>> So I think we should make it a goal, duely prioritized, that HPACK
>> work well with base64 (it already does, but mostly by accident)
>> and advertise this decision.
>>
>
>It's not an accident. It was trained with real headers, and a substantial  
>part of headers are base64-encoded binary data.

Correct, but precisely what ratio of base64 that training data
contains relative to other traffic is unknown; the training set was
an arbitrary rather than statistically random sample of all HTTP
traffic.

Also, I don't recall ever seeing the actual process used to derive
the huffman table documented, and I suspect it should be documented
and reviewed before we finalize the huffman table.

For instance I'd like to know if "Transfer-Encoding: chunked",
"Connection: {close|keep-alive}" and similar HTTP/1-only
headers were eliminated from the set?

If, as it looks to me, more and more traffic gets infected by cookies
with crypto-blobs, artificially boosting the base64 charset in the
huffman table could make sense:  It would amount to a gradual
improvement of HPACK as usage changes over time.

Right now the huffman codes for the base64 set have an average length
of 6.92 bits (min=4, max=10).

This means that if we posit a hypothetical encoding with 2 bytes of
overhead and 8-bit binary payload, it would beat our current huffman
table once you have more than 104 bits in your binary blob.

Improving the average to 6.75 would move the cut-over to 128 bits.

Improving the average to 6.38 would move the cut-over to 256 bits.

But in either case it would come at the expense of compression
efficiency for other headers.
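The cut-over arithmetic above can be sketched as follows. This is
just a back-of-the-envelope check, assuming (as the text posits) a
hypothetical binary encoding with 2 bytes of overhead; the function
name is mine, not anything from the HPACK spec:

```python
def cutover_bits(avg_huffman_bits, overhead_bytes=2):
    """Binary blob size (in bits) at which raw 8-bit binary with the
    given fixed overhead beats base64 + Huffman coding.

    Each 6 bits of binary data become one base64 character, which the
    Huffman table encodes in avg_huffman_bits on average, so:
        base64+Huffman cost: n/6 * avg_huffman_bits
        binary cost:         n + 8 * overhead_bytes
    Setting them equal and solving for n:
        n * (avg_huffman_bits/6 - 1) = 8 * overhead_bytes
    """
    return 8 * overhead_bytes / (avg_huffman_bits / 6 - 1)

for avg in (6.92, 6.75, 6.38):
    print(f"avg {avg} bits/char -> cut-over at ~{cutover_bits(avg):.0f} bits")
```

At the current average of 6.92 this gives ~104 bits, at 6.75 exactly
128 bits, and at 6.38 about 253 bits (roughly the 256 quoted above).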

If on the other hand we want to discourage huge cookies, going
binary would be the worst thing we could do.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.