Shouldn't we add a new CLOSING state in H2 ? from Willy Tarreau on 2019-01-24 (ietf-http-wg@w3.org from January to March 2019)

From: Willy Tarreau <w@1wt.eu>
Date: Thu, 24 Jan 2019 15:06:56 +0100
To: ietf-http-wg@w3.org
Message-ID: <20190124140656.GC20924@1wt.eu>
Hi,

I've recently got an interesting problem reported about haproxy's H2
implementation which could possibly affect other implementations to
some extents. At least I think it is in part related to the issue
about the dependency tree being racy here :

   https://github.com/httpwg/http-extensions/issues/751

The problem is that from time to time some people are getting some TCP
resets at the end of a transfer during an haproxy process restart. I
analysed the situation a little bit and figured what can cause this.

In short, the client sends a HEADERS frame with ES flag set. The stream
is seen as half-closed(remote). Let's say it's a GET for a large object.
Haproxy passes it to the server, receives the response and starts to
stream the response to the client. During the transfer, haproxy gets a
signal indicating that it's going to be reloaded and must terminate its
connections ASAP. It then sends a GOAWAY with ID=1 to the client along
with the data which continue to be streamed, informing the client that
it doesn't want to receive new requests on this connection. Once it
finishes to transfer the large response, the last DATA frame carries
the ES flag, and the stream switches to CLOSED.

Since the connection no longer has any stream and already sent a GOAWAY
a long time ago to perform a graceful shutdown, it can naturally close.

Except that the client continues to send WINDOW_UPDATE frames back...
These frames will eventually meet a closed TCP connection, resulting
in a TCP reset often causing some payload truncation, especially with
the data still lying in the system's socket buffers (hence why it mosly
affects large objects).

We could imagine a lot of dirty time-based workarounds for this, but
they would only steer the problem to another area.

On TCP an exactly similar situation exists and is covered by the LAST_ACK
state : the connection remains in LAST_ACK until the FIN is acknowledged.

In my opinion a good approach here would consist in implementing a CLOSING
state before the CLOSED state as long as we know we have data in flight,
and only switching to CLOSED once enough WINDOW_UPDATE were received.

The frames causing this problem are essentially DATA frames, because these
ones lead to WINDOW_UPDATE frames being sent back. These ones are almost
ACKs. I'm saying "almost" because they can also be used to bump the window
at once. But at least we know we're expected to continue to receive such
frames until the window has not grown by at least the number of bytes
sent since the last update. And in all situations where the stream window
doesn't change it would even be an exact match.

This could also help address some of the races that exist in error handling
for frames received on closed streams since in general the stream is not
available anymore to know whether it was closed with ES or RST_STREAM for
example.

A possible improvement to the approach above could be to add a new setting
mentioning that WINDOW_UPDATE=0 is supported, and we'd use this to indicate
that we've received the ES bit or RST_STREAM for the associated stream. It
could possibly even be extended to the connection itself, by indicating that
we've reached the end of the last stream advertised in GOAWAY since in this
case we know the other side is not going to send anything anymore.

Does anyone here think there would be any interest in working on such a
thing ? I'm personally interested in trying to stuff something like this
into haproxy. But while the passive monitoring of WINDOW_UPDATE might work
without cooperation from the other side, something more robust like an
extension for WU=0 would need support from other agents, so if nobody
is interested it's not worth working on it. That's why I'm interested in
gathering opinions.

Thanks,
Willy
Received on Thursday, 24 January 2019 14:07:20 UTC