Re: Multiple Huffman code tables

> Huh ? Not sure what you mean.

> Please stop rehashing this nonsense. I'm trying to help you make your
> proposal easier to review and understand. If you want to insult me all
> the time, go find someone else to review it.

Many of the problems you point out are not unique to my proposal; they
would occur even if compression ratios were improved in other ways.
Therefore, your points are largely irrelevant to my proposal. And I'm not
sure your review is essential.

> 50-100 bytes per what ? Per header ? per request ? Per 10kB of headers
> sent ? You just sent raw numbers without *any* explanation. I read 1.33
> vs 1.29 as the average compression ratios, which sound reasonable since
> large values are essentially made of base64 values hence only have 0.75
> bytes of entropy. For 100 bytes to be saved at such ratios, it would
> require roughly 4200 non-indexable bytes to be sent. Is that a nice
> improvement ? Maybe. Does anybody care about having to maintain a second
> implementation to save that now that the protocol is widely deployed ?
> I'm much less sure.
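The arithmetic in the quote above can be checked directly. Here is a quick sketch (the 1.33 and 1.29 ratios and the 100-byte target are the figures quoted above; the function names are illustrative, not from any implementation):

```python
# Back-of-the-envelope check of the quoted figures: with average
# compression ratios of 1.33 (proposed table) vs 1.29 (RFC 7541 table),
# how many non-indexable input bytes must be sent to save 100 bytes
# on the wire?

PROPOSED_RATIO = 1.33  # input bytes / output bytes, proposed table
CURRENT_RATIO = 1.29   # input bytes / output bytes, RFC 7541 table

def bytes_saved(n: float) -> float:
    """Wire bytes saved when n non-indexable input bytes are encoded."""
    return n / CURRENT_RATIO - n / PROPOSED_RATIO

def input_needed(target: float) -> float:
    """Input bytes required to save `target` bytes on the wire."""
    return target / (1 / CURRENT_RATIO - 1 / PROPOSED_RATIO)

print(round(input_needed(100)))  # → 4289, close to the ~4200 quoted
```

So at these ratios, roughly 4.3 kB of non-indexable header bytes are needed per 100 bytes saved, which matches the order of magnitude in the quote.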

> As asked above, per what unit, and under which scenario ? You're
> proposing something, you need to back it up with data.

Per header, or per single field. Below is one of them: the cookie value
that is sent when I'm logged into Google. It contains secrets, so the
details are omitted.

-606, 'SOCS=...'

It is painful to respond logically to an emotional opinion. If you want
to insult me all the time, go find someone else.

> The tables here are just maps between sets of bytes. Also you *say* that
> you can remove them but your example code has plenty, which is counter-
> intuitive. You just dumped your code here with raw data without any
> explanation about what is supposed to make it better.

I said it can be replaced, meaning it can be converted.

> The version of what ? ...

As mentioned above, this is not a problem specific to my proposal.
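To make the "tables are just maps between sets of bytes" point above concrete, here is a minimal sketch of table-driven Huffman encoding in the HPACK style (RFC 7541, Section 5.2). The table below is invented for illustration and is NOT the RFC 7541 code; the point is that the encoder logic is fixed, so "replacing the table" means swapping this byte-to-(code, bit-length) map for a different one:

```python
# Toy Huffman table: byte -> (code, bit length). Made up for
# illustration only -- these are not the RFC 7541 codes.
TOY_TABLE = {
    ord('a'): (0b00, 2),
    ord('b'): (0b01, 2),
    ord('c'): (0b10, 2),
    ord('d'): (0b110, 3),
    ord('e'): (0b111, 3),
}

def huffman_encode(data: bytes, table: dict) -> bytes:
    """Concatenate the per-byte codes MSB-first into an octet string."""
    bits = 0     # accumulated bits, most-significant first
    nbits = 0    # number of valid bits in `bits`
    out = bytearray()
    for byte in data:
        code, length = table[byte]
        bits = (bits << length) | code
        nbits += length
        while nbits >= 8:
            nbits -= 8
            out.append((bits >> nbits) & 0xFF)
    if nbits:
        # Pad the final partial octet with 1s, as RFC 7541 does
        # (the padding corresponds to a prefix of the EOS symbol).
        out.append(((bits << (8 - nbits)) | ((1 << (8 - nbits)) - 1)) & 0xFF)
    return bytes(out)

print(huffman_encode(b"abcde", TOY_TABLE).hex())  # → 1b7f
```

Nothing in `huffman_encode` depends on the particular code assignments, which is why the discussion reduces to whether a different table (and the negotiation it requires) is worth the savings.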

On Sat, Dec 9, 2023 at 22:13 Willy Tarreau <w@1wt.eu> wrote:

> On Sat, Dec 09, 2023 at 09:42:54PM +0900, ?? wrote:
> > > I'm not saying that the *implementation* is complex. However, for a
> > low-level
> > > protocol change to be effective, it must be widely adopted, and
> > modifications
> > > applied to most major stacks. And for an implementation, having to
> > support two
> > > variants instead of one necessarily adds a little bit of complexity
> (even
> > in
> > > interoperability testing), so there really needs to be a good argument
> for
> > > this.
> >
> > It appears that you are refusing to change the Huffman code.
>
> Huh ? Not sure what you mean.
>
> > > That depends: 2.5% of what ? Here we're speaking about 2.5% of something
> > > already tiny. If I read it well, we're suggesting that the *first*
> > > occurrence of a 133-byte header is reduced to 129 bytes. When that
> > > happens in a series of 100 requests, that's only 0.04 bytes saved per
> > > request on average. Don't get me wrong, I'm not saying it's nothing,
> I'm
> > > saying that all factors must be considered. As I explained, more
> savings
> > > could be gained by revisiting the HPACK opcode encoding, that will save
> > > bytes for each and every header field for each and every request, not
> > > just the first one. And keep in mind that some implementations do not
> > > even compress outgoing headers because the savings are not considered
> > > worth the cost (particularly on the response direction).
> >
> > It is the compression ratio of the compression algorithm (before/after).
> > It is not the number of bytes. The number of bytes reduced by them is
> > 50-100 bytes at a time. Is that too little?
>
> 50-100 bytes per what ? Per header ? per request ? Per 10kB of headers
> sent ? You just sent raw numbers without *any* explanation. I read 1.33
> vs 1.29 as the average compression ratios, which sound reasonable since
> large values are essentially made of base64 values hence only have 0.75
> bytes of entropy. For 100 bytes to be saved at such ratios, it would
> require roughly 4200 non-indexable bytes to be sent. Is that a nice
> improvement ? Maybe. Does anybody care about having to maintain a second
> implementation to save that now that the protocol is widely deployed ?
> I'm much less sure.
>
> > The number of bytes is reduced in the response as well.
>
> OK but in practice nobody cares about these ones since they come with
> tens to hundreds of kB of extra data.
>
> > > HPACK compression is extremely effective on the uplink from the client
> > > to the server, where it gains most savings by using the dynamic table
> > > and compresses to a single-byte most repetitive header fields,
> including
> > > large cookies. Huffman here is just a nice extra bonus but not a major
> > > difference.
> >
> > Are improvements to standardized extras prohibited? They should not be.
>
> I don't understand why you're stating this. I suspect it will be difficult
> to discuss based on technical grounds... What is important to understand
> is that you cannot improve standards by breaking them, so it must always
> be done in a backwards-compatible way (i.e. the need for discovering the
> support on the other side).
>
> > As
> > mentioned above, this extra has been reduced by 50-100 bytes. This should
> > be an improvement worth proposing.
> >
> > > That's basically what most of us are already doing I think, e.g:
> >
> > No. Your code defines a table. I say that such tables can be removed altogether.
>
> The tables here are just maps between sets of bytes. Also you *say* that
> you can remove them but your example code has plenty, which is counter-
> intuitive. You just dumped your code here with raw data without any
> explanation about what is supposed to make it better.
>
> > > That's not what I'm saying. I'm saying that for a client to use your
> > > implementation, it must first know that the server will support it, and
> > > it cannot know this before receiving its SETTINGS frame, hence it's not
> > > usable before the first round-trip, which is where most of the huffman
> > > savings matter.
> >
> > It appears that you are refusing to change the Huffman code.
>
> Please stop rehashing this nonsense. I'm trying to help you make your
> proposal easier to review and understand. If you want to insult me all
> the time, go find someone else to review it.
>
> > Do we need a signal other than the version?
>
> The version of what ? It seems that you need an explanation of how HTTP
> works. First a client connects to a server and advertises the protocols
> it's willing to speak using ALPN. The server responds with its ALPN
> string as well. From this point the client knows it can use H2 to speak
> to the server, it sends its preface, SETTINGS frame, and a bunch of
> requests in a conservative way (i.e. assuming the server is OK with
> default settings). Then the server sends its SETTINGS, SETTINGS ACK,
> and starts processing the received requests.
>
> Here, assuming your client wants to use a new version of the huffman
> encoder, it would need to advertise its support using a SETTINGS frame,
> and couldn't use it until it sees the server's SETTINGS frame that
> indicates that it supports it. It's *only* at this point that it will
> be able to switch to the new version. A whole round trip will have been
> lost, with up to 14 kB of data uploaded at once. Once you've spent your
> initial time in the first round trip there's much less to gain later,
> because the first round trip is where you're trying to reduce the amount
> of data to make sure not to waste a round trip.
>
> > > Now, feel free to prove me wrong with real world examples where you
> > > observe significant changes on the volume of bytes sent by a client
> > > before and after your changes, with an emphasis on the first 10 MSS
> > > (14kB) which is where a first round trip will be needed, but at first
> > > glance I'm pretty sure this will be fairly marginal.
> >
> > I cannot know the scenario in your head.
>
> There's no scenario in my head, I'm speaking about a client sending many
> requests over a just established TCP connection, and using compression
> to save bytes and try to save time by avoiding a round trip, which is
> the whole point of headers compression.
>
> > But as mentioned above, this extra
> > has been reduced by 50-100 bytes at a time. Is that too little?
>
> As asked above, per what unit, and under which scenario ? You're
> proposing something, you need to back it up with data.
>
> > > You're welcome, many of us are not english natives either :-)
> >
> > I did not say I am not a native English speaker. I said I am not good at
> > English :-)
>
> I know, but usually that comes together.
>
> Willy
>

Received on Saturday, 9 December 2023 14:31:24 UTC