Re: Feedback on TCP Fast Open? from Willy Tarreau on 2013-08-03 (ietf-http-wg@w3.org from July to September 2013)

From: Willy Tarreau <w@1wt.eu>
Date: Sat, 3 Aug 2013 23:34:52 +0200
To: "William Chan (陈智昌)" <willchan@chromium.org>
Cc: Amos Jeffries <squid3@treenet.co.nz>, HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <20130803213452.GD888@1wt.eu>

Hi William,

On Sat, Aug 03, 2013 at 09:27:00AM -0700, William Chan (陈智昌) wrote:
> On Fri, Aug 2, 2013 at 10:34 PM, Willy Tarreau <w@1wt.eu> wrote:
> 
> > Hi Amos,
> >
> > On Sat, Aug 03, 2013 at 12:05:15PM +1200, Amos Jeffries wrote:
> > > On 3/08/2013 11:19 a.m., Peter Lepeska wrote:
> > > >"it's unclear how beneficial it would
> > > >> be for us since we already have such gains for browser preconnect (our
> > > >> browser feature that learns from past web browsing to speculatively
> > > >> establish connections, typically just TCP connections but perhaps doing a
> > > >> TLS or other handshakes too as needed)"
> > > >
> > > >It would benefit any time you encounter a host that preconnect has not
> > > >learned about yet. Surely this provides a benefit.
> > >
> > > Yet assuming TFO does require the prior key exchange mentioned by Nico
> > > (I have not yet read the TFO spec), that means preconnect will also
> > > already have been done, right?
> > > So gains are 0 in that case.
> >
> > Not exactly, just that an earlier connection would have already been made
> > to the same site, so exactly as for preconnect as William described it
> > (since the browser decides to connect based on browsing history). So from
> > this point it is similar.
> >
> 
> I think there's some confusion here. Let me try to clarify. Both TFO and
> preconnect rely on having visited a server before.

Up to this point I can confirm there was no confusion; that's precisely
what I understood.

> When we know we
> speculatively want to establish a connection, there are some possibilities
> for the optimal outcome. If we speculatively do the TCP connect(), our
> browser web rendering engine may issue resource requests (HTTP GETs and
> what not) *after* the connect() completes. But in the cases where the
> rendering engine wants to fetch resources before the connect() completes,
> we may have been better off holding off on the normal pre-connect() and
> instead try to do a TFO connect with the GET in the payload of the SYN
> packet, which would let us issue the HTTP GET before we would have been
> able to in the preconnect case. So, it's not immediately obvious which
> situation is more likely. Also, TFO is not guaranteed to succeed, due to
> middleboxes and some normal operating situations. Since it's not known to
> be reliable, it means that we may conservatively want to prefer preconnect
> over TFO.

Agreed, and that's why I proposed to use TFO *for* the preconnect: it
could be used at least to get some stats about the success rate, and if
it fails it's harmless since no one uses the connection yet. It also saves
resources: you save one packet in each direction over the wire, which is
always good, especially in mobile networks when the pipe is already used
by outgoing GETs for totally unrelated things.
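
To make the idea concrete, here is a minimal sketch of a preconnect that
tries TFO first and falls back to a regular connect() while counting
outcomes. It assumes Linux, where sendto() with MSG_FASTOPEN on an
unconnected TCP socket initiates the connection. The function name
tfo_preconnect, the stats counters and the sock_factory parameter are
hypothetical names chosen for illustration; this is not what any browser
actually does.

```python
import socket

# Linux value of MSG_FASTOPEN, used only if this Python build lacks the constant.
MSG_FASTOPEN = getattr(socket, "MSG_FASTOPEN", 0x20000000)


def tfo_preconnect(addr, payload, stats, sock_factory=socket.socket):
    """Open a speculative connection, trying TFO first.

    Returns a connected socket, or None if both attempts failed.
    `stats` is a hypothetical dict of counters used to estimate how often
    the TFO path is usable.
    """
    sock = sock_factory(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # On Linux, sendto() with MSG_FASTOPEN on an unconnected TCP socket
        # starts the connection and hands `payload` to the kernel, which puts
        # it in the SYN only if it holds a valid cookie for this server.
        sock.sendto(payload, MSG_FASTOPEN, addr)
        # This counts calls that did not fail. It does not prove the data
        # travelled in the SYN, since the kernel may silently fall back to
        # a regular handshake.
        stats["tfo_call_ok"] += 1
        return sock
    except OSError:
        sock.close()

    # Fallback: a plain connect(). Nobody uses the connection yet, so
    # retrying here is harmless as long as the payload is harmless too.
    sock = sock_factory(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.connect(addr)
        sock.sendall(payload)
        stats["fallback_ok"] += 1
        return sock
    except OSError:
        sock.close()
        stats["failed"] += 1
        return None
```

The payload would be something harmless, such as the favicon request
discussed in this thread. Getting a real success rate would require asking
the kernel whether the SYN data was accepted, which this sketch does not do.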

> TFO does have an extra benefit in that we generally try to err on
> the conservative side for speculative preconnects, and thus we may not
> preconnect enough sockets, which leaves room for TFO to provide extra
> benefit.

Interesting mix, indeed (provided the TFO success rate is high enough,
of course).

> Also, TFO doesn't play as nicely with our socket late binding code in
> Chromium. See
> https://insouciant.org/tech/connection-management-in-chromium/#socket_late_binding for
> an explanation of this feature. Late binding basically means we delay
> the binding of a HTTP transaction to a concrete socket until the socket is
> known to be ready to accept a HTTP transaction. The naive thing to do is,
> for each HTTP transaction, tie it to a socket that is stuck in connect(),
> and when that completes, you issue the HTTP GET. But we treat all connected
> sockets equally and assign HTTP transactions to connected sockets in HTTP
> transaction priority order (e.g. HTML is higher priority than JPGs). Now,
> if you think about it, TFO doesn't play well with this late binding
> approach, since it eagerly binds the application payload to the socket,
> even if the TFO connect may fail and gracefully fallback to a normal
> connect. If I knew it was going to fail, it may have been better to have
> waited to send the HTTP GET over another socket that became available
> slightly later, but I knew was going to succeed (e.g. a reused persistent
> connection, or a new normally connected TCP socket).
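
The late binding described above can be sketched roughly as follows. This
is only an illustration of the idea, not Chromium's code: the class name
LateBindingPool, its methods and the numeric priorities are all
hypothetical. Transactions wait in a priority queue, and whichever socket
becomes ready first is given the highest-priority waiting transaction.

```python
import heapq
import itertools


class LateBindingPool:
    """Bind transactions to sockets only once a socket is connected."""

    def __init__(self):
        self._pending = []             # heap of (priority, seq, transaction)
        self._idle = []                # connected sockets nobody needs yet
        self._seq = itertools.count()  # tie-breaker keeps FIFO order per priority

    def submit(self, transaction, priority):
        """Queue a transaction; lower `priority` values are served first.

        Returns an idle connected socket if one is available, else None.
        """
        if self._idle:
            return self._idle.pop()
        heapq.heappush(self._pending, (priority, next(self._seq), transaction))
        return None

    def socket_ready(self, sock):
        """Called when any socket finishes connecting or becomes reusable.

        Returns the highest-priority waiting transaction for this socket,
        or None after parking the socket as idle.
        """
        if self._pending:
            _, _, transaction = heapq.heappop(self._pending)
            return transaction
        self._idle.append(sock)
        return None
```

With TFO the payload has to be chosen when the connection is started, so
the choice made in socket_ready() would have to happen before anyone knows
whether that socket will be the first one to become usable.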

Hmmm good point, I didn't think about this. But now it reminds me that in
haproxy I have not yet implemented the connection to the server using TFO
for a similar reason: it changes the sequencing of events (even if it works
differently from what you described). My issue was that while connect
errors are retryable, send errors are not (idempotence etc.) since they're
always performed on a valid connection. And dealing with a failed send()
which is used to connect is something that will need some tricky changes.
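
As a rough illustration of why the combined step is awkward, here is a
small sketch of a retry decision. It is not haproxy's logic: the function
may_retry and the step labels are hypothetical, and the policy shown
(replay only idempotent methods once request bytes may have left) is just
one conservative assumption.

```python
# Methods defined as idempotent by the HTTP specification.
IDEMPOTENT_METHODS = {"GET", "HEAD", "PUT", "DELETE", "OPTIONS", "TRACE"}


def may_retry(failed_step, method):
    """Decide whether a failed request may be replayed on another connection.

    failed_step is one of these hypothetical labels:
      "connect"  - plain connect() failed, no request bytes were sent
      "send"     - send() failed on an established connection
      "tfo_send" - a send that also performs the connect (TFO) failed
    """
    if failed_step == "connect":
        return True
    if failed_step in ("send", "tfo_send"):
        # The server may have received and acted on the request, so only
        # replay methods that are safe to repeat.
        return method.upper() in IDEMPOTENT_METHODS
    raise ValueError(f"unknown step: {failed_step!r}")
```

The point is that a failed "tfo_send" looks like a connect error from the
caller's side but has to be treated like a send error, because the request
may already have reached the server.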

> This is all rather complicated and so all I'm saying is it's complicated to
> determine when TFO is useful in the normal http:// use case. However, it's
> definitely possible we can leverage it to improve matters.

Yes, I think it's not as obvious as it initially looks.

> And in the https:// use case, I think it's a clear win.

Indeed.

> > > Its main benefit seems to be allowing for prefetching to be skipped or
> > > short-circuited if it is used as the initial step of such prefetch.
> >
> > The benefit I'm seeing mainly is not to maintain idle connections to
> > servers which do not know if they can kill them or not. However we don't
> > know if the TFO SYN will pass, which is a real issue (hence my proposal to
> > use TFO in preconnect and to send a harmless request such as GET favicon).
> >
> 
> Ah, the intranet use case is definitely an interesting one. I've only been
> thinking about the client case over public networks so far, but I agree
> there's probably potential to reduce persistent connection overhead within
> a private network where you don't have to worry about TFO interop issues
> due to middleboxes.

In fact the issue can happen there as well, but then there is an admin who
will deploy the setting to disable the feature on all PCs and the issue is
gone everywhere at once.

> The public network (internet) use case is definitely harder and there
> are some tradeoffs with the harmless request approach.

Again, please consider whether fetching /favicon.ico would help to establish
a pre-connection. It would cost exactly the same as the current pre-connect,
be friendlier to the servers and be harmless in case of failure.

Best regards,
Willy
Received on Saturday, 3 August 2013 21:36:49 UTC