- From: Stefan Eissing <stefan.eissing@greenbytes.de>
- Date: Fri, 16 Jul 2021 09:51:01 +0200
- To: Willy Tarreau <w@1wt.eu>
- Cc: Poul-Henning Kamp <phk@phk.freebsd.dk>, Mikkel Fahnøe Jørgensen <mikkelfj@gmail.com>, Toerless Eckert <tte@cs.fau.de>, Mark Nottingham <mnot@mnot.net>, IETF QUIC WG <quic@ietf.org>, HTTP Working Group <ietf-http-wg@w3.org>
> On 16.07.2021 at 09:16, Willy Tarreau <w@1wt.eu> wrote:
>
> On Fri, Jul 16, 2021 at 06:34:32AM +0000, Poul-Henning Kamp wrote:
>> --------
>> Willy Tarreau writes:
>>
>>> Stefan made a good point about the problem that might result, with
>>> inbound load balancing between multiple listeners (typically what's
>>> achieved by L3 switches doing L3+L4 hash between multiple servers,
>>> and operating systems hashing the source+destination port to pick a
>>> different listening socket). Thus a suggestion might be to possibly
>>> save resources by using a small number of sockets, with "small" left
>>> to the discretion of the implementation.
>>
>> We should run the question "few or many UDP ports?" past some
>> friendly 100G and 400G device driver maintainers.
>>
>> The NICs I have been working with for the ESO ELT project all
>> included UDP ports in the hash they used to decide which CPU core
>> to deliver packets/interrupts to, and we had to spread across UDP
>> ports or all the traffic would hit one single core.
>>
>> I don't know if QUIC has registered with the NIC designers and device
>> driver writers yet, but given the opacity of QUIC packets, it is
>> very hard to see what else than the UDP port they can feed into
>> their hash, so I expect the answer to be "many".
>
> That's true, but queues are not the only parameter, as using many
> sockets also results in a massive cache miss ratio and many more
> syscalls. Even with TCP it's quite visible that using a few hundred
> connections tends to be vastly more efficient at 100 Gbps than
> using tens of thousands.
>
> Ideally we want all queues to be used with good balance while
> limiting the number of operations needed to transport the data.
>
> Probably some wording evoking that possibility offered by the
> protocol, plus a suggestion to take the various factors into account
> (L4 LB, NIC queues, listeners, syscalls, etc.), is sufficient to let
> implementors ask the relevant people for advice in their context.

It seems a good idea to actively pull in people knowledgeable in these
implementations and hear their recommendations/expectations.

As far as my limited knowledge goes, having a QUIC application share a
single source port for many QUIC connections does not pose any problems
for servers. The concern is more about NATs (and especially CGNATs)
coalescing many source address+port pairs into a few NAT address+ports
by using QUIC connection IDs. In the extreme, once these NATs use
connection IDs, they could theoretically live with a single NAT port.
That would not work well with any UDP load balancing.

The advice for QUIC-aware NAT implementations could then be something
like (IANAUBE; a rough sketch follows below):

- distribute QUIC connections across a minimum of N ports
- the N ports would ideally be random or a complete interval (they
  should not all give the same number modulo a prime)
- distribute new NATed connections in some random fashion or
  round-robin across your port pool

- Stefan
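To make the port-pool advice above concrete, here is a minimal sketch of
how a hypothetical QUIC-aware NAT could pick external source ports. It
assumes a contiguous interval of N ports (so the pool covers every residue
of whatever small modulus a downstream L4 hash or RSS function might use)
and assigns new NATed connections round-robin; the names `portPool`,
`newPortPool`, and `pick` are purely illustrative and not taken from any
real NAT implementation.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// portPool holds a contiguous interval [base, base+size) of external
// UDP ports reserved for NATed QUIC connections. (Illustrative sketch,
// not code from any actual NAT.)
type portPool struct {
	base uint16
	size uint16
	next uint32 // round-robin cursor, advanced atomically
}

func newPortPool(base, size uint16) *portPool {
	return &portPool{base: base, size: size}
}

// pick returns the external port to use for a new NATed QUIC connection,
// cycling through the whole interval so traffic spreads across every
// RSS queue or load-balancer bucket that hashes on the UDP source port.
func (p *portPool) pick() uint16 {
	n := atomic.AddUint32(&p.next, 1) - 1
	return p.base + uint16(n%uint32(p.size))
}

func main() {
	pool := newPortPool(40000, 16) // N = 16 ports: 40000..40015
	for i := 0; i < 5; i++ {
		fmt.Println("new connection mapped to external port", pool.pick())
	}
}
```

A random pick from the same interval would satisfy the advice just as
well; round-robin is used here only because it is the simplest way to
guarantee that the whole interval is actually exercised.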
Received on Friday, 16 July 2021 07:51:19 UTC