- From: Patrick McManus <mcmanus@ducksong.com>
- Date: Thu, 3 Mar 2016 15:43:38 -0500
- To: Willy Tarreau <w@1wt.eu>
- Cc: Joe Touch <touch@isi.edu>, HTTP Working Group <ietf-http-wg@w3.org>
- Message-ID: <CAOdDvNokUDxmfy87VrQNLoQvQknP6L3h6fLbuFeVpOiDN4szAQ@mail.gmail.com>
Hi Willy, Joe,

This message is a bit of a diversion from the discussion so far. Sorry about that.

On Thu, Mar 3, 2016 at 1:44 PM, Willy Tarreau <w@1wt.eu> wrote:

> I've seen people
> patch their kernels to lower the TIME_WAIT down to 2 seconds to address
> such shortcomings! Quite frankly, this workaround *is* causing trouble!

Really? That's fascinating to me. Can you provide background or citations on what kind of trouble has been attributed to this and the scenario where it was done? You don't need to go through the theoretical - I know what TW could conceptually catch - but the assertion about the shorter timeout causing field problems is something I'd love to understand better.

For TW to be useful protection, it also has to be paired with retransmission and some application states that will be impacted by the screwup, which reduces its utility, particularly for HTTP. I read the above statement as saying that TW is indeed useful in the field at the application layer - but maybe it is referencing some side effect I'm not thinking of rather than the vulnerability of not using it.

Those kinds of post-mortem war stories where it is seen in the field are pretty interesting and help inform the discussion about whether the cure is worse than the disease. My inclination has generally been that TW doesn't help a lot in practice, has some limitations, and causes pain (as well documented in this thread). So it would be interesting to look at the fallout of a situation it could have helped with.

This feels a bit like the musing over the subpar utility of the TCP checksum on high-bandwidth networks. For that one, the answer, at least in the HTTP space, is 'use HTTPS for integrity and sort out the rare error at the application level'. I'm wondering if that's the right advice in the TIME_WAIT space for HTTP as well. We're really still talking about integrity.
Go ahead and turn it off - just make sure you're running a higher-level protocol that won't confuse old data with new data.

(Fun paper: http://conferences2.sigcomm.org/imc/2015/papers/p303.pdf showed that at the tail of a ping survey, 1% of replies from 1% of addresses needed >= 145 seconds to arrive. And that's just delay - not retransmission. A truly protective TW is a very large number.)
Received on Thursday, 3 March 2016 20:44:03 UTC