- From: Ben Schwartz <bemasc@meta.com>
- Date: Thu, 9 Oct 2025 21:47:29 +0000
- To: Yaroslav Rosomakho <yrosomakho@zscaler.com>, Demi Marie Obenour <demiobenour@gmail.com>
- CC: Kazuho Oku <kazuhooku@gmail.com>, Amos Jeffries <squid3@treenet.co.nz>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
- Message-ID: <DS0PR15MB567498BDA1788EA1C19FF37DB3EEA@DS0PR15MB5674.namprd15.prod.outlook.com>
I don't think CPU savings are a compelling argument, since they are due to implementation details and could be resolved in other ways. Bandwidth savings for proxy protocols like CONNECT-TCP also seem less compelling, since efficient transfer already requires processing data in chunks > 1 KB (often much bigger as seen by the proxy daemon), leaving < 1% potential savings.

I'm more intrigued by the extreme cases of forwarding latency-sensitive data that is generated a few bytes at a time. This has become common lately in text-generator applications like ChatGPT, which stream "tokens" of <10 bytes each, one at a time. In these situations, the proportional overhead of the DATA frames is significant (>20%), even if the absolute total data transfer is small. We can also see similar usage in proposals to run remote shells over HTTP/3 [1]. Solutions specific to the Capsule Protocol would not obviously help in these cases.

--Ben

[1] https://blog.apnic.net/2024/02/02/towards-ssh3-how-http-3-improves-secure-shells/

________________________________
From: Yaroslav Rosomakho <yrosomakho@zscaler.com>
Sent: Thursday, October 9, 2025 4:19 PM
To: Demi Marie Obenour <demiobenour@gmail.com>
Cc: Kazuho Oku <kazuhooku@gmail.com>; Amos Jeffries <squid3@treenet.co.nz>; ietf-http-wg@w3.org <ietf-http-wg@w3.org>
Subject: Re: [External⚠] Re: Unbound DATA frames in HTTP/3 proposal

On Thu, Oct 9, 2025 at 11:09 AM Demi Marie Obenour <demiobenour@gmail.com> wrote:

> I would like to see proper benchmarks, though.

This would certainly depend on the details and constraints of the implementation stack.
Two contrasting examples on a TCP-to-HTTP/3 CONNECT proxy illustrate the range:

When the proxy uses a well-integrated library or set of libraries that stream data transparently through kernel sockets and library layers, with 64 KB TCP segmentation offload and large enough buffers, the impact of HTTP/3 DATA framing is minimal. The savings are limited to a few bytes on the wire every few dozen packets.

At the other extreme, when data is forwarded entirely in user space, performing zero-copy transfers from a correctly ordered TCP packet into an HTTP/3 CONNECT stream, the situation changes. If the implementation can reuse the received TCP header buffer to encode the UDP and QUIC headers, the additional HTTP/3 DATA framing can determine whether a memory copy of the data is required. Depending on the destination connection ID length and the size of the varint-encoded integers, the requirement to add HTTP/3 DATA framing could push the combined UDP/QUIC/HTTP/3 overhead beyond the size of the original TCP header, resulting in a memory copy of the proxied data. Depending on various other factors, this may reduce throughput by ~5-10%. Some zero-copy architectures allow reserving per-packet headroom to mitigate this, but some don't.

I'm sure a similar range exists with other applications encapsulating data into HTTP/3 streams.

Best Regards,
Yaroslav
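
For reference, the DATA frame overhead figures discussed in this thread (<1% for chunks over 1 KB, >20% for tokens under 10 bytes) can be reproduced with a rough sketch like the one below. It assumes each chunk is sent as its own HTTP/3 DATA frame, counts only the frame header (a type varint plus a length varint), and expresses overhead relative to the payload size. QUIC STREAM frame and packet overhead are deliberately ignored, and the function names are illustrative, not taken from any implementation.

```python
DATA_FRAME_TYPE = 0x00  # HTTP/3 DATA frame type


def varint_len(value: int) -> int:
    """Bytes needed to encode value as a QUIC variable-length integer."""
    if value < 0 or value >= 1 << 62:
        raise ValueError("value out of range for a QUIC varint")
    if value < 1 << 6:
        return 1
    if value < 1 << 14:
        return 2
    if value < 1 << 30:
        return 4
    return 8


def data_frame_header_len(payload_len: int) -> int:
    """Size of the DATA frame header: type varint plus length varint."""
    return varint_len(DATA_FRAME_TYPE) + varint_len(payload_len)


def overhead_ratio(payload_len: int) -> float:
    """Header bytes divided by payload bytes (assumption: one frame per chunk)."""
    return data_frame_header_len(payload_len) / payload_len


if __name__ == "__main__":
    # Hypothetical chunk sizes: a small token, a 1 KB chunk, a 64 KB chunk.
    for size in (8, 1024, 65536):
        header = data_frame_header_len(size)
        print(f"{size:>6} B payload: {header} B header, {overhead_ratio(size):.2%}")
```

Under these assumptions, an 8-byte token carries a 2-byte header (25% of the payload), while a 1024-byte chunk carries a 3-byte header (about 0.3%).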
Received on Thursday, 9 October 2025 21:47:42 UTC