Re: On the abuse of chunking for interactive usages

I think you're generally right.

See also:

Suggestions for text welcome.

On 11/05/2011, at 7:26 AM, Willy Tarreau wrote:

> Hi,
> a few days ago, a user of haproxy explained to me that he was experiencing
> extremely long delays between two servers when communicating through haproxy
> while the two servers had no issue when connected directly.
> He was kind enough to provide logs and network traces with a bit of explanation
> on the data exchanges.
> What is happening is that one server connects to the other one using a PUT
> request, makes use of chunked encoding to send data and the second one sends
> a chunked encoded response in turn. The protocol is specified here :
> The issue comes from the fact that the protocol assumes that messages are
> fully interactive and that whatever chunk is emitted on one side is
> immediately received on the other side and conversely. So each chunk
> serves as an ACK for the other one, making the workload consist in
> many very small TCP segments (around 10-20 bytes of payload each).
> Obviously this can only work on local networks with extremely low RTT.
> The issue comes when a gateway is inserted in the middle. Haproxy was built
> on the assumption that messages live their own life and that there is no
> direct relation between a chunk on one side and a chunk on the other side.
> And in order not to flood the receiver with TCP PUSH packets, it aggregates
> as much data as possible in each segment (MSG_MORE, equivalent to TCP_CORK)
> until the last chunk is seen.
> What was causing the massive slowdown is that for each 10-byte payload seen,
> haproxy was telling the system "hey, please hold on for a while, something
> else is coming soon". The system waits for 200ms, and seeing nothing else
> come, finally sends the chunk. The same happens in the other direction,
> resulting in only one req/resp being exchanged every 400 ms.
> The workaround, which the user confirmed fixed the issue for him, consists
> in sending all chunks as fast as possible (TCP_NODELAY). But doing this
> by default makes very inefficient use of mobile networks for normal uses,
> especially with compressed traffic which generally is chunked. The issue
> is now that each chunk will be sent with the TCP PUSH flag which the client
> has to immediately ACK, resulting in a massive slowdown due to uplink
> congestion during downloads.
> I can also improve the workaround so that haproxy asks the system to wait
> only when there are incomplete chunks left, but still this will not cover
> the mobile case in a satisfying way. So I'm now tempted to add an option
> to let the user decide whether he makes (ab)use of chunking or not.
> My concern comes from this specific use of chunking. I see no reason why
> this would be valid. I know it will not work at many places. Some proxies
> (such as nginx IIRC) buffer the complete request before passing it on. And
> in fact many other ones might want to analyse the beginning of the data
> before deciding to let it pass through. Also I don't see why we should
> accept to turn each chunk into a TCP segment of its own, this seems
> contrary to the principle of streamed messages.
> My understanding has always been that the only feature that an intermediary
> could guarantee is that once all the request body has been transferred, it
> will let all the response body pass.
> Am I wrong somewhere ? Shouldn't we try to remind implementers that there
> is no guarantee of any type of interactivity between two opposite streams
> being transferred over the same connection ? I'm worried by the deviations
> from the original use. In fact the project above seems to have tried to
> implement websocket before it was available. But the fact that some people
> do this probably means the spec makes people think this is something that can be
> expected to work.
> Any insights are much appreciated. I've not yet committed to a fix, and I'm
> willing to consider opinions here to find the fairest solution for this
> type of usage without unduly impacting normal users.
> Thanks,
> Willy
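
For readers who want to see the trade-off described above in concrete terms,
here is a minimal sketch in Python. It is not haproxy's code. The helper names
(encode_chunk, send_flags, send_chunk, exchanges_per_second) are hypothetical
and exist only for illustration; MSG_MORE is Linux-specific, so the sketch
falls back to 0 where the constant is missing. The 200 ms figure is the delay
reported in the message above, not a value measured here.

```python
import socket


def encode_chunk(payload: bytes) -> bytes:
    """Frame payload as one HTTP/1.1 chunk: hex size, CRLF, data, CRLF."""
    return b"%x\r\n%s\r\n" % (len(payload), payload)


def send_flags(more_coming: bool, interactive: bool) -> int:
    """Pick send() flags for one chunk.

    Aggregating mode: hint the kernel that more data follows (MSG_MORE),
    so small chunks can be coalesced into fuller segments.
    Interactive mode: never hint, so each chunk can leave immediately.
    """
    if more_coming and not interactive:
        # MSG_MORE only exists on Linux; 0 means "no hint" elsewhere.
        return getattr(socket, "MSG_MORE", 0)
    return 0


def send_chunk(sock: socket.socket, payload: bytes,
               more_coming: bool, interactive: bool = False) -> None:
    """Send one chunk on a connected TCP socket (hypothetical helper)."""
    if interactive:
        # Disable Nagle so small writes are not held back.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    sock.sendall(encode_chunk(payload), send_flags(more_coming, interactive))


def exchanges_per_second(delay_per_direction_s: float) -> float:
    """Request/response pairs per second when each direction is delayed."""
    return 1.0 / (2.0 * delay_per_direction_s)
```

With a 200 ms hold in each direction, exchanges_per_second(0.2) works out to
about 2.5 request/response pairs per second, which matches the "one req/resp
every 400 ms" observation. The interactive flag corresponds to the kind of
per-user option suggested above: aggregation stays the default, and only
deployments that rely on chunk-level interactivity opt out of it.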

Mark Nottingham

Received on Wednesday, 11 May 2011 05:28:34 UTC