Re: Feedback on TCP Fast Open?

On Fri, Aug 2, 2013 at 10:34 PM, Willy Tarreau <w@1wt.eu> wrote:

> Hi Amos,
>
> On Sat, Aug 03, 2013 at 12:05:15PM +1200, Amos Jeffries wrote:
> > On 3/08/2013 11:19 a.m., Peter Lepeska wrote:
> > >"it's unclear how beneficial it would
> > >> be for us since we already have such gains for browser preconnect (our
> > >> browser feature that learns from past web browsing to speculatively
> > >> establish connections, typically just TCP connections but perhaps
> > >> doing a TLS or other handshakes too as needed)"
> > >
> > >It would benefit any time you encounter a host that preconnect has not
> > >learned about yet. Surely this provides a benefit.
> >
> > Yet assuming TFO does require that prior key exchange mentioned by Nico
> > (I have not yet read the TFO spec). That means that preconnect will have
> > also already been done, right?
> > So gains are 0 in that case.
>
> Not exactly, just that an earlier connection would have already been made
> to the same site, so exactly as for preconnect as William described it
> (since the browser decides to connect based on browsing history). So from
> this point it is similar.
>

I think there's some confusion here. Let me try to clarify. Both TFO and
preconnect rely on having visited a server before. When we know we want to
speculatively establish a connection, it's not obvious which approach gives
the best outcome. If we speculatively do the TCP connect(), our browser's
rendering engine may issue resource requests (HTTP GETs and whatnot)
*after* the connect() completes, in which case preconnect has already
hidden the handshake. But in the cases where the rendering engine wants to
fetch resources before the connect() completes, we may have been better off
skipping the normal preconnect and instead doing a TFO connect with the GET
in the payload of the SYN packet, which would let us issue the HTTP GET
earlier than we could in the preconnect case. So, it's not immediately
obvious which situation is more likely. Also, TFO is not guaranteed to
succeed, due to middleboxes and some normal operating situations. Since
it's not known to be reliable, we may conservatively want to prefer
preconnect over TFO. TFO does have an extra benefit: we generally try to
err on the conservative side for speculative preconnects, and thus we may
not preconnect enough sockets, which leaves room for TFO to help.
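
To make the timing trade-off above concrete, here is a rough sketch of a
toy model. It assumes a fixed round-trip time, zero server processing time,
a preconnect that starts at t=0, and a TFO fallback that costs exactly one
extra round trip. The function names and the numbers are illustrative only;
nothing here is taken from Chromium.

```python
def response_time_preconnect(rtt, t_request):
    """Time at which the response arrives when we preconnect at t=0.

    The handshake takes one RTT, so the GET leaves at whichever is later:
    the moment the request exists or the moment connect() completes.
    """
    handshake_done = rtt
    send_at = max(t_request, handshake_done)
    return send_at + rtt


def response_time_tfo(rtt, t_request, tfo_accepted=True):
    """Time at which the response arrives when the GET rides in the SYN.

    If the server (or a middlebox) drops the SYN payload, we assume the
    data is resent after a normal handshake, costing one extra RTT.
    """
    if tfo_accepted:
        return t_request + rtt
    return t_request + 2 * rtt


# Request shows up before the preconnect would have finished: TFO wins.
early = (response_time_preconnect(100, 30), response_time_tfo(100, 30))
# Request shows up long after the preconnect finished: no difference.
late = (response_time_preconnect(100, 250), response_time_tfo(100, 250))
# TFO attempted but rejected: worse than having preconnected.
rejected = response_time_tfo(100, 30, tfo_accepted=False)
```

In this model TFO only helps when the request becomes available less than
one RTT after the point where we would have started the preconnect, and a
rejected TFO attempt is strictly worse than the preconnect.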

Also, TFO doesn't play as nicely with our socket late binding code in
Chromium. See
https://insouciant.org/tech/connection-management-in-chromium/#socket_late_binding
for an explanation of this feature. Late binding basically means we delay
the binding of an HTTP transaction to a concrete socket until the socket is
known to be ready to accept an HTTP transaction. The naive thing to do is,
for each HTTP transaction, tie it to a socket that is stuck in connect(),
and when that completes, you issue the HTTP GET. But we treat all connected
sockets equally and assign HTTP transactions to connected sockets in HTTP
transaction priority order (e.g. HTML is higher priority than JPGs). Now,
if you think about it, TFO doesn't play well with this late binding
approach, since it eagerly binds the application payload to the socket,
even though the TFO connect may fail and fall back to a normal connect. If
I knew it was going to fail, it might have been better to wait and send the
HTTP GET over another socket that became available slightly later but that
I knew was going to succeed (e.g. a reused persistent connection, or a new
normally connected TCP socket).
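
As a rough illustration of late binding as described above, here is a
minimal sketch. The class and method names are hypothetical and are not
Chromium's; it assumes a lower number means higher priority and that any
connected socket can serve any pending transaction.

```python
import heapq
from itertools import count


class LateBindingPool:
    """Bind pending transactions to sockets only once a socket is ready."""

    def __init__(self):
        self._pending = []      # heap of (priority, arrival order, name)
        self._order = count()   # tie-breaker keeps FIFO order per priority
        self._idle = []         # connected sockets with no transaction yet
        self.assignments = []   # (transaction name, socket id), in order

    def request(self, name, priority):
        """Queue a transaction; it is not tied to any socket yet."""
        heapq.heappush(self._pending, (priority, next(self._order), name))
        self._dispatch()

    def socket_ready(self, socket_id):
        """Called when connect() finishes or a persistent socket frees up."""
        self._idle.append(socket_id)
        self._dispatch()

    def _dispatch(self):
        # Hand the highest-priority pending transaction to the next
        # ready socket, regardless of which one was requested first.
        while self._pending and self._idle:
            _, _, name = heapq.heappop(self._pending)
            self.assignments.append((name, self._idle.pop(0)))


pool = LateBindingPool()
pool.request("photo.jpg", priority=2)
pool.request("index.html", priority=0)
pool.socket_ready("sock-A")   # first ready socket gets index.html
pool.socket_ready("sock-B")   # next ready socket gets photo.jpg
```

A TFO connect cannot go through a pool like this unchanged, because the
payload has to be chosen at the moment the SYN is sent, before anyone knows
whether that socket will be the first one ready.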

This is all rather involved, so all I'm saying is that it's hard to
determine when TFO is useful in the normal http:// use case. However, it's
definitely possible we can leverage it to improve matters. And in the
https:// use case, I think it's a clear win.


> > Its main benefit seems to be allowing for prefetching to be skipped or
> > short-circuited if it is used as the initial step of such prefetch.
>
> The benefit I'm seeing mainly is not to maintain idle connections to
> servers which do not know if they can kill them or not. However we don't
> know if the TFO SYN will pass, which is a real issue (hence my proposal to
> use TFO in preconnect and to send a harmless request such as GET favicon).
>

Ah, the intranet use case is definitely an interesting one. I've only been
thinking about the client case over public networks so far, but I agree
there's probably potential to reduce persistent connection overhead within
a private network where you don't have to worry about TFO interop issues
due to middleboxes. The public network (internet) use case is definitely
harder and there are some tradeoffs with the harmless request approach.


> > >It also has lower cost since it will result in fewer (zero) connection
> > >mistakes since it's not doing anything speculatively. Don't get me
> > >wrong I'm a big fan of the benefits of speculative prefetching in
> > >general, but only in the case where the underlying protocol can't
> > >solve the problem without mistakes.
> > >
> > >TCP Fast Open seems great as long as it doesn't introduce any other
> > >problems (such as increased DoS vulnerability).
> >
> > It is clearly lowering the capacity barrier SYN-flood DDoS need to reach
> > in exchange for 1 RTT on legitimate TCP setup.
> > The big question though is; overall which are more common: DDoS
> > SYN-flood packets or legitimate SYN?
>
> It depends. I've been involved in dealing with multi-10G SYN flood attacks
> on some large infrastructures which surprisingly did not get that many
> connections normally. So if we do the math, 30-40 Mpps for 4 hours is
> about 500 billion SYN, which is about 200k SYN/s over one month. So such
> an attack for 4 hours every month produces 10 times more SYNs than clients
> running at 20k conn/s. But what matters really is the user experience.
>
> That said, I'm not that much afraid of SYN floods with TFO at the moment,
> for multiple reasons :
>   - attackers still don't send TFO and will probably not do for a long
>     time since it reduces the reach probability
>   - normal SYN floods are already enough
>   - one of the benefits of TFO is that you don't have to respond to
>     validate the packet. So you have everything in it to decide whether
>     it's valid or not. OK it takes CPU time, but not much more than
>     building a SYN cookie and sending it back.
>
> I also remember when experimenting with TFO on Linux that sometimes it
> stopped accepting it when I was flooding it, I suspect that if the failure
> rate is too high, it might decide to disable it, but I'm not sure about
> this since it was a very early implementation (so it might as well have
> been a bug).
>
> Best regards,
> Willy
>
>
>
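
For reference, the back-of-the-envelope SYN flood figures quoted above can
be checked with a few lines. This sketch assumes the midpoint of the quoted
30-40 Mpps range and a 30-day month; both are assumptions made here for the
arithmetic, not figures from the original message.

```python
ATTACK_RATE_PPS = 35e6        # assumed midpoint of the quoted 30-40 Mpps
ATTACK_HOURS = 4
DAYS_PER_MONTH = 30           # assumed month length
LEGIT_CONN_PER_SEC = 20_000   # quoted legitimate connection rate

# Total SYNs sent during one 4-hour attack.
attack_syns = ATTACK_RATE_PPS * ATTACK_HOURS * 3600

# The same number of SYNs averaged over a whole month.
month_seconds = DAYS_PER_MONTH * 24 * 3600
avg_attack_syn_per_sec = attack_syns / month_seconds

# How the monthly-averaged attack rate compares to legitimate traffic.
ratio = avg_attack_syn_per_sec / LEGIT_CONN_PER_SEC

print(f"{attack_syns:.3g} SYNs per attack")
print(f"{avg_attack_syn_per_sec:,.0f} SYN/s averaged over a month")
print(f"{ratio:.1f}x the legitimate rate")
```

Under these assumptions the quoted numbers hold up: roughly 500 billion
SYNs, a bit under 200k SYN/s averaged over the month, and close to 10 times
the legitimate connection rate.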

Received on Saturday, 3 August 2013 16:27:29 UTC