- From: Eggert, Lars <lars@netapp.com>
- Date: Mon, 15 Apr 2013 21:03:53 +0000
- To: Gabriel Montenegro <Gabriel.Montenegro@microsoft.com>
- CC: Roberto Peon <grmocg@gmail.com>, "Simpson, Robby (GE Energy Management)" <robby.simpson@ge.com>, Eliot Lear <lear@cisco.com>, Robert Collins <robertc@squid-cache.org>, Jitu Padhye <padhye@microsoft.com>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>, "Brian Raymor (MS OPEN TECH)" <Brian.Raymor@microsoft.com>, Rob Trace <Rob.Trace@microsoft.com>, "Dave Thaler" <dthaler@microsoft.com>, Martin Thomson <martin.thomson@skype.net>, Martin Stiemerling <martin.stiemerling@neclab.eu>
Hi,

I had commented in an off-list discussion on this issue and was asked to summarize what I said to the list. So here we go.

I fully understand why the idea to bypass slow-start and instead start with the window used during the last connection instantiation sounds nice. But: it has been thought of a dozen times before and has huge issues.

This has the potential to generate large line-rate bursts into the network, which can cause loss bursts and force TCP into timeout-based recovery, which has a huge impact on throughput. (Much more so than slow-starting with a smaller window.) That is because you normally have no idea whether the path conditions are at all comparable between when you cached that CWND and when you want to reuse it. So when you burst and create a series of losses - for yourself and other flows on the bottleneck! - they all go into a timeout of a few hundred ms at least and then slow-start.

The TCP WG has been working on the Google "IW10" proposal (allowing TCP to start with an initial window of 10 segments rather than 1-3). That seems to mitigate much of the need for caching the CWND, since new connections wouldn't need to start with very small windows. A large part of the discussion around that proposal was exactly on the question of how large the initial window can be without significantly increasing the danger of line-rate bursts. There has been a pretty in-depth analysis by multiple folks into whether 10 is safe or not, and the consensus seems to be that it should be. Just caching and reusing any arbitrarily large CWND is certainly not safe.

The issue itself has been thought about for much longer, cf. http://tools.ietf.org/html/draft-hughes-restart-00 from 2002, which talks about the issue of what the window should be after a connection has been idle for a while and wants to resume sending.
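
As a rough illustration of the burst concern described above, here is a toy sketch in Python. It is not a model of real TCP: it assumes a single drop-tail bottleneck queue, a whole window arriving back-to-back before the queue drains, and made-up numbers. The names `burst_drops`, `slow_start_rounds`, `cached_cwnd` and `free_slots_now` are hypothetical and appear in no draft or implementation.

```python
def burst_drops(window_segments, free_queue_slots):
    """Segments lost when a whole window arrives back-to-back at a
    drop-tail queue that has only free_queue_slots of room left
    (assumption: nothing drains while the burst arrives)."""
    return max(0, window_segments - free_queue_slots)


def slow_start_rounds(initial_window, target_window):
    """Round trips of window doubling needed to grow from
    initial_window to at least target_window (idealized slow-start)."""
    rounds = 0
    cwnd = initial_window
    while cwnd < target_window:
        cwnd *= 2
        rounds += 1
    return rounds


# Hypothetical numbers: the window was cached when the path was idle,
# but the bottleneck queue now has far less room.
cached_cwnd = 120
iw10 = 10
free_slots_now = 30

print("drops when reusing cached CWND:", burst_drops(cached_cwnd, free_slots_now))
print("drops when starting at IW10:", burst_drops(iw10, free_slots_now))
print("round trips from IW10 to the old window:", slow_start_rounds(iw10, cached_cwnd))
```

With these made-up numbers, reusing the cached window loses 90 of its 120 segments in one burst, while a ten-segment start loses none and needs 4 doublings to get back to the old window. Whether a few extra round trips cost less than a timeout depends on the path, which is exactly what the sender cannot know in advance.
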
Another related work item in TCP is http://tools.ietf.org/html/draft-ietf-tcpm-newcwv-00, which attempts to specify what TCP should do during periods where it didn't send at a rate that used up the current window, which can also lead to bursting when traffic demands increase.

I'm mentioning this because a lot of the work of the TCP WG revolves around mitigating these bursts in order to avoid stalls due to timeout-based recovery, and having HTTP go off and define knobs that would actively counteract that work seems, ahem, counterproductive.

I'm all for making HTTP and TCP work better together. Limiting the number of parallel connections, seeing if we can increase the initial window safely, and other similar things are all great examples of what we should be doing more of. But the TCP and HTTP folks will need to work together on this - we can't afford to get this wrong.

Lars

PS: I'm not on the WG list, so please CC me if you'd like to respond.
Received on Monday, 22 April 2013 08:06:36 UTC