- From: Willy Tarreau <w@1wt.eu>
- Date: Sat, 9 Dec 2023 08:33:44 +0100
- To: falsandtru@gmail.com
- Cc: ietf-http-wg@w3.org
Hello,

On Sat, Dec 09, 2023 at 07:02:20AM +0900, ?? wrote:
> To all. Please read the proposal and the source code before replying
> with assumptions.

Well, your "proposal" is just a single paragraph with a raw dump of who-knows-what, and there is not a single comment in your source code to explain the intent or what it is trying to do; it is essentially hard-coded values and code sequences. I'm sorry, but you cannot expect participants to spend a lot of time trying to guess what you're trying to achieve this way.

> > HPACK/QPACK uses only one static table of Huffman coding, but the
> > compression ratio can be improved by combining multiple tables. There
> > is *no overhead* in data size due to the combination. Theoretically,
> > two code tables can reduce the bit length of one code table by an
> > average of one bit, since the two code tables itself has one bit of
> > information. *This proposal especially reduces the token size
> > increasing in recent years.* Let me know the link to the conclusions
> > if this approach has been considered.
> >
> > The following is a comparison based on my proof of concept code. The
> > leftmost number is the reduced bit size.

The relevance of the Huffman table was discussed during the design phase, and so was the data set that served to build the HPACK headers table. I vaguely remember some discussions where we wondered whether we could squeeze out more by having a distinct static table per direction and doing the same for Huffman (since it's possible that the byte distribution differs slightly between sides). I seem to remember that the overall feeling was that the gains to be expected there were not significant enough to warrant the extra complexity. There was another discussion about the possibility of using FSE instead of Huffman, which could bring better compression ratios at the expense of a bit more complexity; again, the extra complexity was considered an obstacle, and in general it seems there is not much interest in squeezing slightly more bytes there. Ah, I just found it; here it is, along with another thread on the same topic:

https://lists.w3.org/Archives/Public/ietf-http-wg/2014JanMar/0029.html
https://lists.w3.org/Archives/Public/ietf-http-wg/2014AprJun/1275.html

Overall, the main purpose of the Huffman coding here is to get closer to 6 bits per character for the most common characters found in header field names (and values) that are not found as-is in the HPACK static table. A good example is the first emission of a large base64-encoded cookie. The purpose is not to save a lot there; most of the savings come from the dynamic table, which allows a sender to send a field name and value only once in a while.

> > 'XPACK comp. ratio request', 0.2533415841584158, 1.3393005138405436
> > 'HPACK comp. ratio request', 0.2471534653465347, 1.3282919612033537

If I read this right, it seems to me that this corresponds to a delta of just 2 bytes for a total of 500 bytes of data. That's really small. Some changes to the HPACK encoding that could save significantly more were already proposed during the design phase, and were rejected as being too small to warrant a change. Just think that being able to advertise the use and support of the new table would likely require more bytes on the wire (e.g. a SETTINGS frame) and could not be used before the first round trip, which is where it matters the most.
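For what it's worth, the quoted "one selector bit" claim is easy to demonstrate on toy data. Here is a minimal sketch in Python, assuming two made-up code-length tables (these are NOT the RFC 7541 Huffman table, and TABLE_TEXT, TABLE_B64, encoded_bits and best_table_bits are all invented names for illustration only):

    import string

    # Made-up code lengths: one table favours common lowercase text,
    # the other favours the base64 alphabet. Not the RFC 7541 table.
    TABLE_TEXT = {c: 5 for c in "etaoinshrdlu"}
    TABLE_B64 = {c: 6 for c in string.ascii_letters + string.digits + "+/="}

    def encoded_bits(s: str, table: dict) -> int:
        # Symbols absent from a table fall back to an assumed 8-bit escape.
        return sum(table.get(c, 8) for c in s)

    def best_table_bits(s: str) -> int:
        # One extra bit per string tells the decoder which table was used;
        # that selector bit is the "one bit of information" of the claim.
        return 1 + min(encoded_bits(s, TABLE_TEXT), encoded_bits(s, TABLE_B64))

    for sample in ("attention", "dGVzdA=="):
        print(sample, encoded_bits(sample, TABLE_TEXT), best_table_bits(sample))
    # "attention" stays on the text table (45 -> 46 bits with the selector),
    # while the base64 token drops from 58 to 49 bits.

The gain only materialises on inputs whose byte distribution genuinely differs from what the single table was tuned for, which is exactly the question the threads above already examined.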
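And to put the advertisement cost in numbers, here is a minimal sketch of what signalling such a capability could cost on the wire. The 0xF000 identifier and the settings_frame helper are invented for illustration; only the HTTP/2 frame layout (9-byte frame header, 6 bytes per SETTINGS entry, per RFC 7540) is real:

    import struct

    SETTINGS_TYPE = 0x4
    HYPOTHETICAL_XPACK_TABLES = 0xF000  # invented identifier, not IANA-assigned

    def settings_frame(entries: dict) -> bytes:
        # Each SETTINGS entry is a 16-bit identifier plus a 32-bit value.
        payload = b"".join(struct.pack("!HI", ident, value)
                           for ident, value in entries.items())
        # 9-byte HTTP/2 frame header: 24-bit length, type, flags, stream id 0.
        header = struct.pack("!BHBBI", len(payload) >> 16,
                             len(payload) & 0xFFFF, SETTINGS_TYPE, 0, 0)
        return header + payload

    frame = settings_frame({HYPOTHETICAL_XPACK_TABLES: 1})
    print(len(frame))  # 15 bytes on the wire

Fifteen bytes (or 6 bytes when piggy-backed on an existing SETTINGS frame) to enable a change that saves about 2 bytes per 500, and only after the first round trip.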
Regards,
Willy

PS: please avoid responding to yourself multiple times and top-posting; that makes it difficult to respond to your messages, and likely further reduces the willingness to respond.
Received on Saturday, 9 December 2023 07:33:52 UTC