Re: Feedback on TCP Fast Open?

William,

Thanks for the explanation, especially the part about socket late binding.
I now better appreciate some of the wrinkles.

On another note, it would be great if TFO also included a "jump start"
feature that would allow the TFO sender to start at a higher cwnd than the
usual initial cwnd for new connections.

Peter


On Sat, Aug 3, 2013 at 12:27 PM, William Chan (陈智昌)
<willchan@chromium.org> wrote:

> On Fri, Aug 2, 2013 at 10:34 PM, Willy Tarreau <w@1wt.eu> wrote:
>
>> Hi Amos,
>>
>> On Sat, Aug 03, 2013 at 12:05:15PM +1200, Amos Jeffries wrote:
>> > On 3/08/2013 11:19 a.m., Peter Lepeska wrote:
>> > >"it's unclear how beneficial it would
>> > >> be for us since we already have such gains for browser preconnect
>> > >> (our browser feature that learns from past web browsing to
>> > >> speculatively establish connections, typically just TCP connections
>> > >> but perhaps doing a TLS or other handshakes too as needed)"
>> > >
>> > >It would benefit any time you encounter a host that preconnect has not
>> > >learned about yet. Surely this provides a benefit.
>> >
>> > Yet assuming TFO does require that prior key exchange mentioned by Nico
>> > (I have not yet read the TFO spec), that means that preconnect will
>> > also have already been done, right?
>> > So gains are 0 in that case.
>>
>> Not exactly, just that an earlier connection would have already been made
>> to the same site, so exactly as for preconnect as William described it
>> (since the browser decides to connect based on browsing history). So from
>> this point it is similar.
>>
>
> I think there's some confusion here. Let me try to clarify. Both TFO and
> preconnect rely on having visited a server before. When we know we
> speculatively want to establish a connection, there are some possibilities
> for the optimal outcome. If we speculatively do the TCP connect(), our
> browser web rendering engine may issue resource requests (HTTP GETs and
> what not) *after* the connect() completes. But in the cases where the
> rendering engine wants to fetch resources before the connect() completes,
> we may have been better off holding off on the normal pre-connect() and
> instead try to do a TFO connect with the GET in the payload of the SYN
> packet, which would let us issue the HTTP GET before we would have been
> able to in the preconnect case. So, it's not immediately obvious which
> situation is more likely. Also, TFO is not guaranteed to succeed, due to
> middleboxes and some normal operating situations. Since it's not known to
> be reliable, it means that we may conservatively want to prefer preconnect
> over TFO. TFO does have an extra benefit in that we generally try to err on
> the conservative side for speculative preconnects, and thus we may not
> preconnect enough sockets, which leaves room for TFO to provide extra
> benefit.
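
The preconnect-versus-TFO trade-off described above can be made concrete with
a toy timing model. This is only a sketch under simplifying assumptions that
are not in the original message: a single fixed RTT, zero server think time,
and a TFO failure that falls back gracefully (the data is resent after the
normal handshake) rather than hitting a SYN retransmission timeout. The
function names are hypothetical.

```python
def preconnect_response_time(connect_start, request_time, rtt):
    """Time at which the response arrives when a speculative connect()
    was started at connect_start and the GET is ready at request_time."""
    connected_at = connect_start + rtt          # SYN / SYN-ACK takes one RTT
    send_at = max(request_time, connected_at)   # GET must wait for the connect
    return send_at + rtt                        # GET out, response back


def tfo_response_time(request_time, rtt, tfo_accepted=True):
    """Time at which the response arrives when the connect is deferred
    until the GET is ready and the GET rides in the SYN payload."""
    if tfo_accepted:
        return request_time + rtt               # SYN+GET out, response back
    # Assumed graceful fallback: normal handshake first, then the GET.
    return request_time + 2 * rtt


RTT = 100  # milliseconds, illustrative value only

# Request becomes ready before the speculative connect() has completed.
early_preconnect = preconnect_response_time(0, 30, RTT)
early_tfo = tfo_response_time(30, RTT)
early_tfo_fallback = tfo_response_time(30, RTT, tfo_accepted=False)

# Request becomes ready after the speculative connect() has completed.
late_preconnect = preconnect_response_time(0, 150, RTT)
late_tfo = tfo_response_time(150, RTT)
```

In this model TFO only wins when the request is ready before the speculative
connect() finishes, ties once the connect has completed, and loses a full RTT
whenever it has to fall back.
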
>
> Also, TFO doesn't play as nicely with our socket late binding code in
> Chromium. See
> https://insouciant.org/tech/connection-management-in-chromium/#socket_late_binding
> for an explanation of this feature. Late binding basically means we delay
> the binding of a HTTP transaction to a concrete socket until the socket is
> known to be ready to accept a HTTP transaction. The naive thing to do is,
> for each HTTP transaction, tie it to a socket that is stuck in connect(),
> and when that completes, you issue the HTTP GET. But we treat all connected
> sockets equally and assign HTTP transactions to connected sockets in HTTP
> transaction priority order (e.g. HTML is higher priority than JPGs). Now,
> if you think about it, TFO doesn't play well with this late binding
> approach, since it eagerly binds the application payload to the socket,
> even if the TFO connect may fail and gracefully fall back to a normal
> connect. If I knew it was going to fail, it may have been better to have
> waited to send the HTTP GET over another socket that became available
> slightly later, but that I knew was going to succeed (e.g. a reused
> persistent connection, or a new normally connected TCP socket).
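
The late binding idea above can be sketched in a few lines of Python. This is
a minimal illustration, not Chromium's actual implementation: the class name,
method names, and the numeric priority scheme (lower number means higher
priority) are all assumptions made for the example.

```python
import heapq
import itertools


class LateBindingPool:
    """Hypothetical pool: requests wait in a priority queue and are bound
    to a socket only once that socket reports it is connected."""

    def __init__(self):
        self._pending = []                 # heap of (priority, seq, url)
        self._seq = itertools.count()      # tie-breaker keeps FIFO order

    def submit(self, url, priority):
        # Lower number = higher priority (assumption for this sketch).
        heapq.heappush(self._pending, (priority, next(self._seq), url))

    def on_socket_ready(self, socket_name):
        """Called when any socket finishes connecting (or becomes idle).
        Returns the (socket, url) binding, or None if nothing is waiting."""
        if not self._pending:
            return None
        _, _, url = heapq.heappop(self._pending)
        return (socket_name, url)


pool = LateBindingPool()
pool.submit("photo.jpg", priority=2)    # requested first, low priority
pool.submit("index.html", priority=1)   # requested second, high priority

# Whichever socket becomes ready first gets the highest-priority request,
# regardless of which request triggered that socket's connect().
first = pool.on_socket_ready("socket-B")
second = pool.on_socket_ready("socket-A")
```

Sending the GET in a TFO SYN is the opposite choice: the request is tied to
one socket before anyone knows whether that socket will be the first usable
one.
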
>
> This is all rather complicated, so all I'm saying is that it's hard to
> determine when TFO is useful in the normal http:// use case. However,
> it's definitely possible we can leverage it to improve matters. And in the
> https:// use case, I think it's a clear win.
>
>
>> > Its main benefit seems to be allowing for prefetching to be skipped or
>> > short-circuited if it is used as the initial step of such prefetch.
>>
>> The benefit I'm seeing mainly is not to maintain idle connections to
>> servers which do not know if they can kill them or not. However we don't
>> know if the TFO SYN will pass, which is a real issue (hence my proposal to
>> use TFO in preconnect and to send a harmless request such as GET favicon).
>>
>
> Ah, the intranet use case is definitely an interesting one. I've only been
> thinking about the client case over public networks so far, but I agree
> there's probably potential to reduce persistent connection overhead within
> a private network where you don't have to worry about TFO interop issues
> due to middleboxes. The public network (internet) use case is definitely
> harder and there are some tradeoffs with the harmless request approach.
>
>
>> > >It also has lower cost since it will result in fewer (zero) connection
>> > >mistakes since it's not doing anything speculatively. Don't get me
>> > >wrong I'm a big fan of the benefits of speculative prefetching in
>> > >general, but only in the case where the underlying protocol can't
>> > >solve the problem without mistakes.
>> > >
>> > >TCP Fast Open seems great as long as it doesn't introduce any other
>> > >problems (such as increased DoS vulnerability).
>> >
>> > It is clearly lowering the capacity barrier that a SYN-flood DDoS needs
>> > to reach, in exchange for saving 1 RTT on legitimate TCP setup.
>> > The big question, though, is: overall, which are more common, DDoS
>> > SYN-flood packets or legitimate SYNs?
>>
>> It depends. I've been involved in dealing with multi-10G SYN flood attacks
>> on some large infrastructures which surprisingly did not get that many
>> connections normally. So if we do the math, 30-40 Mpps for 4 hours is
>> about 500 billion SYN, which is about 200k SYN/s over one month. So such
>> an attack for 4 hours every month produces 10 times more SYNs than clients
>> running at 20k conn/s. But what matters really is the user experience.
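
The arithmetic above can be checked with a short calculation. The 35 Mpps
figure is an assumed midpoint of the quoted 30-40 Mpps range, and a 30-day
month is assumed; the variable names are illustrative only.

```python
ATTACK_RATE_PPS = 35e6            # assumed midpoint of 30-40 Mpps
ATTACK_HOURS = 4                  # one attack per month, as in the text
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumed 30-day month
LEGIT_RATE = 20_000               # legitimate connections per second

# Total SYNs sent during one 4-hour attack (roughly 500 billion).
attack_syns = ATTACK_RATE_PPS * ATTACK_HOURS * 3600

# The same number of SYNs spread evenly over a whole month (roughly 200k/s).
monthly_equivalent_rate = attack_syns / SECONDS_PER_MONTH

# How many times more SYNs the attack produces than legitimate clients.
ratio = monthly_equivalent_rate / LEGIT_RATE

print(f"attack SYNs: {attack_syns:.3g}")
print(f"equivalent monthly rate: {monthly_equivalent_rate:,.0f} SYN/s")
print(f"ratio to legitimate traffic: {ratio:.1f}x")
```

With these assumptions the result lands close to the rounded figures quoted
above: about 500 billion SYNs, about 200k SYN/s, and roughly a factor of 10.
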
>>
>> That said, I'm not that much afraid of SYN floods with TFO at the moment,
>> for multiple reasons:
>>   - attackers still don't send TFO and will probably not do so for a long
>>     time since it reduces the reach probability
>>   - normal SYN floods are already enough
>>   - one of the benefits of TFO is that you don't have to respond to
>>     validate the packet. So you have everything in it to decide whether
>>     it's valid or not. OK, it takes CPU time, but not much more than
>>     building a SYN cookie and sending it back.
>>
>> I also remember when experimenting with TFO on Linux that sometimes it
>> stopped accepting it when I was flooding it. I suspect that if the
>> failure rate is too high, it might decide to disable it, but I'm not sure
>> about this since it was a very early implementation (so it might as well
>> have been a bug).
>>
>> Best regards,
>> Willy
>>
>>
>>
>

Received on Saturday, 3 August 2013 18:39:48 UTC