- From: Willy Tarreau <w@1wt.eu>
- Date: Sat, 9 Dec 2023 10:53:46 +0100
- To: falsandtru@gmail.com
- Cc: ietf-http-wg@w3.org
On Sat, Dec 09, 2023 at 05:58:43PM +0900, ?? wrote:
> > I seem to remember that the overall feeling was that gains to
> > be expected there were not significant enough to warrant more
> > complexity.
>
> This proposal addresses the current situation where tokens have greatly
> increased header size for security reasons. The situation is different
> from the past.
>
> > again the extra complexity was considered as an
> > obstacle and in general it seems that there's not that much interest in
> > squeezing slightly more bytes there.
>
> There is nothing particularly complex about this algorithm. The return is
> commensurate with less complexity, as discussed below.

I'm not saying that the *implementation* is complex. However, for a
low-level protocol change to be effective, it must be widely adopted and
the modifications applied to most major stacks. And for an implementation,
having to support two variants instead of one necessarily adds a little
bit of complexity (even in interoperability testing), so there really
needs to be a good argument for this.

> > If I read this right, it seems to me that this corresponds just to a
> > delta of 2 bytes for a total of 500 bytes of data. That's really small.
>
> That is an example of low compression. The compression ratio improves by
> more than 1% for the response of Google's home page in the non-logged-in
> state.
>
> 'XPACK comp. ratio response', 0.25389886578449905, 1.340300870942201
> 'HPACK comp. ratio response', 0.24155245746691867, 1.3184827478775605
>
> The compression ratio improves by 2.5% when logged in.
>
> 'XPACK comp. ratio request', 0.24189189189189186, 1.3190730837789661
> 'HPACK comp. ratio request', 0.21498410174880767, 1.2738595514151183
>
> Compression is also improved by 2.5% on the Amazon home page. Is a 2.5%
> improvement small?
>
> 'XPACK comp. ratio request', 0.24909539473684206, 1.3317270835614938
> 'HPACK comp. ratio request', 0.22467105263157894, 1.2897751378871447

It depends on what that 2.5% is of. Here we're speaking about 2.5% of
something already tiny. If I read it correctly, the suggestion is that the
*first* occurrence of a 133-byte header is reduced to 129 bytes. When that
happens in a series of 100 requests, that's only 0.04 bytes saved per
request on average. Don't get me wrong, I'm not saying it's nothing, I'm
saying that all factors must be considered. As I explained, more savings
could be gained by revisiting the HPACK opcode encoding, which would save
bytes for each and every header field of each and every request, not just
the first one. And keep in mind that some implementations do not even
compress outgoing headers because the savings are not considered worth the
cost (particularly in the response direction). HPACK compression is
extremely effective on the uplink from the client to the server, where it
achieves most of its savings through the dynamic table, compressing the
most repetitive header fields, including large cookies, down to a single
byte. Huffman here is just a nice extra bonus, not a major difference.

> > Just think that being able to advertise the
> > use and support of the new table would likely require more bytes
>
> The Huffman code for tokens is so regular that no table or tree is
> needed. It is replaceable with conditional expressions.

That's basically what most of us are already doing, I think, e.g.:
https://github.com/haproxy/haproxy/blob/master/src/hpack-huff.c#L784
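
For what it's worth, a minimal sketch of what "conditional expressions
instead of a table" can look like is shown below. The 4-symbol code it
uses is made up purely for illustration (it is neither the RFC 7541 table
nor the proposed one); only the shape of the encoder matters here.

#include <stddef.h>

/* Illustration only: a made-up 4-symbol prefix code ('a'=0, 'b'=10,
 * 'c'=110, 'd'=111), NOT the RFC 7541 table nor the proposed one. The
 * point is simply that a small, regular code can be emitted with
 * conditional expressions instead of a lookup table or a tree walk. */
static int toy_code(unsigned char sym, unsigned int *bits, int *len)
{
    if (sym == 'a') { *bits = 0x0; *len = 1; return 0; } /* 0   */
    if (sym == 'b') { *bits = 0x2; *len = 2; return 0; } /* 10  */
    if (sym == 'c') { *bits = 0x6; *len = 3; return 0; } /* 110 */
    if (sym == 'd') { *bits = 0x7; *len = 3; return 0; } /* 111 */
    return -1; /* symbol not covered by the toy code */
}

/* Pack the codes MSB-first into <out>, padding the final byte with
 * 1-bits the way HPACK does. Returns the output length in bytes, or -1
 * on error or output overflow. */
static int toy_huff_encode(const unsigned char *in, size_t ilen,
                           unsigned char *out, size_t olen)
{
    unsigned long long acc = 0; /* bit accumulator */
    int nbits = 0;              /* number of pending bits in <acc> */
    size_t opos = 0;

    for (size_t i = 0; i < ilen; i++) {
        unsigned int bits;
        int len;

        if (toy_code(in[i], &bits, &len) < 0)
            return -1;
        acc = (acc << len) | bits;
        nbits += len;
        while (nbits >= 8) {
            if (opos >= olen)
                return -1;
            out[opos++] = (unsigned char)(acc >> (nbits - 8));
            nbits -= 8;
        }
    }
    if (nbits) {
        if (opos >= olen)
            return -1;
        out[opos++] = (unsigned char)((acc << (8 - nbits)) |
                                      ((1u << (8 - nbits)) - 1));
    }
    return (int)opos;
}

A real implementation of course has to cover the full 256-symbol alphabet
plus EOS, but the structure stays the same.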
> > couldn't be used before the first
> > round trip, which is where it matters the most.
>
> Perhaps you misunderstand. The initial state of the Huffman code is fixed
> and invariant. It changes state only during encoding/decoding.

That's not what I'm saying. I'm saying that for a client to use your
implementation, it must first know that the server will support it, and it
cannot know this before receiving its SETTINGS frame, hence it's not
usable before the first round trip, which is where most of the Huffman
savings matter.

Now, feel free to prove me wrong with real-world examples where you
observe significant changes in the volume of bytes sent by a client before
and after your changes, with an emphasis on the first 10 MSS (~14 kB),
which is where a first round trip will be needed, but at first glance I'm
pretty sure this will be fairly marginal.
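
To make the sequencing concrete, here is a rough sketch of how an encoder
could gate the alternative coding on a peer-advertised HTTP/2 SETTINGS
parameter. The SETTINGS_XPACK_HUFFMAN identifier, its value, and the
surrounding structure are all hypothetical and not defined by any
specification.

#include <stdbool.h>
#include <stdint.h>

/* Hypothetical identifier, for illustration only: no such HTTP/2
 * setting is defined anywhere. A new Huffman table would need
 * something of this kind to be advertised by the peer. */
#define SETTINGS_XPACK_HUFFMAN 0xf123u /* made-up value */

struct h2_conn {
    bool settings_received;   /* peer's SETTINGS frame fully parsed */
    bool peer_supports_xpack; /* peer advertised the alternative coding */
};

/* Called for each <id,value> pair while parsing the peer's SETTINGS frame. */
static void on_setting(struct h2_conn *c, uint16_t id, uint32_t value)
{
    if (id == SETTINGS_XPACK_HUFFMAN && value == 1)
        c->peer_supports_xpack = true;
}

/* The encoder may only switch tables once the peer's SETTINGS frame has
 * been seen, i.e. not before the first round trip; everything emitted
 * before that (typically the whole first request) must still use the
 * RFC 7541 coding. */
static bool may_use_xpack(const struct h2_conn *c)
{
    return c->settings_received && c->peer_supports_xpack;
}

Until may_use_xpack() returns true, the encoder has to keep emitting the
RFC 7541 coding, which is exactly why the first flight of requests cannot
benefit.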
> > PS: please avoid responding to yourself multiple times and top-posting,
> > that makes it difficult to respond to your messages, and likely
> > further reduces the willingness to respond.
>
> I will try my best, but I am not good at English, so please forgive me a
> little.

You're welcome, many of us are not English natives either :-)

Willy

Received on Saturday, 9 December 2023 09:53:53 UTC