Re: [tcpm] TCP Tuning for HTTP - update

On 8/17/2016 11:08 AM, Willy Tarreau wrote:
> On Wed, Aug 17, 2016 at 08:14:08AM -0700, Joe Touch wrote:
>> There are many other sites - and books - that already indicate how to
>> configure systems efficiently.
>>
>> So if your argument is that a man page summary is needed, sure - but
>> again, is a new one needed? And why is this then needed as an RFC?
> The difference is that a man page is OS-specific while an RFC gets a
> unique number and serves as a reference. 
That means the OS-specific stuff here needs to be removed...

> It can be cited in new RFCs
> to justify certain choices. 
Hmm. Like the refs I gave could be cited in this doc to justify *its*
choices? :-)
...
>>> Also, I don't know if there have been any update, but these documents use
>>> SunOS 4.1.3 running on a sparc 20 as a reference. While I used to love
>>> working on such systems 20 years ago, they predate the web era and systems
>>> have evolved a lot since to deal with high traffic. ...
>> Yes, and discussing those issues would be useful - but not in this
>> document either.
> Why ? Lots of admins don't understand why the time_wait timeout remains
> at 240 seconds on Solaris with people saying "if you want to be conservative
> don't touch it but if you want to be modern simply shrink it to 30 seconds
> or so". People need to understand why the advice has changed over 3 decades.

The advice hasn't really changed - the advice was given in the 99 ref,
which includes some cases where it can still be appropriate to decrease
that timer.

Again, though - the 99 ref explains the implications of both decisions -
why it's 240 seconds and what happens when it's lowered.
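
A rough sketch of the arithmetic behind that tradeoff (not taken from the
99 ref): TIME_WAIT is 2*MSL, and RFC 793 suggests an MSL of 2 minutes,
which gives 240 seconds. For one client address talking to one server
address and port, each source port stays unavailable for the TIME_WAIT
period, so the sustained rate of new connections is bounded by the number
of usable ports divided by that period. The port range below is an assumed
example, and the bound ignores any stack-specific mechanism that lets a
pair be reused earlier.

```python
# Rough arithmetic only. The usable port range is an assumption for
# illustration, not a value from this thread.
MSL_SECONDS = 120                      # MSL suggested in RFC 793 (2 minutes)
TIME_WAIT_SECONDS = 2 * MSL_SECONDS    # 2*MSL = 240 seconds


def max_new_connections_per_second(usable_ports, time_wait_seconds):
    """Upper bound on sustained new connections per second between one
    client address and one server address:port, if every closed
    connection holds its source port for the full TIME_WAIT period."""
    return usable_ports / time_wait_seconds


ASSUMED_USABLE_PORTS = 65535 - 1024 + 1   # hypothetical range 1024..65535

rate_at_240 = max_new_connections_per_second(ASSUMED_USABLE_PORTS, TIME_WAIT_SECONDS)
rate_at_30 = max_new_connections_per_second(ASSUMED_USABLE_PORTS, 30)

print(f"TIME_WAIT 240 s: about {rate_at_240:.1f} new connections/s")
print(f"TIME_WAIT  30 s: about {rate_at_30:.1f} new connections/s")
```

Shortening the timer raises that bound, at the cost of weakening the
protection against old duplicate segments that the 2*MSL wait provides.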

>
>>> So you need to expect that only researchers and maybe TCP stack developers
>>> will find your work useful these days, server admins can hardly use this
>>> anymore. However it is very possible that some TCP stacks have taken benefit
>>> of your work to reach the level of performance they achieve right now, I
>>> don't know. Thus I think that Daniel's work completes quite well what you've
>>> done in that it directly addresses people's concerns without requiring the
>>> scientific background.
>> Let me see if I get your complete argument:
>>
>>     - the appropriate refs are 20 years old
>>     - server admins need a doc
>>
>> What exactly do server admins need regarding Nagle (which is configured
>> inside the app already), socket sizing (configured inside the app), etc?
> Lots of things : 
>   - time_wait tuning (which everybody gives different advice on; I've
>     even seen firewall vendors recommend to shrink it to one second because
>     it allowed their product to perform better in benchmarks)
Again, the 99 ref gives that detail.
>   - TCP timestamps: what they provide, what are the risks (some people in
>     banking environments refuse to enable them so that they cannot be used
>     as an oracle to help in timing attacks).
That's already covered in the security considerations of RFC 7323. How
is HTTP different, if at all, from any other app?
>   - window scaling : how much is needed.
Same issue here, same ref - how is HTTP different?

>   - socket sizing : contrary to what you write, there's a lot of tuning
>     on the web where people set the default buffer sizes to 16MB without
>     understanding the impacts when dealing with many sockets
There's a whole book that encompasses that and some related issues:
http://www.onlamp.com/pub/a/onlamp/2005/11/17/tcp_tuning.html

Some advice is also given in Sec 6.3.3 of this:
J. Sterbenz, J. Touch, /High Speed Networking: A Systematic Approach to
High-Bandwidth Low-Latency Communication/
<http://www.wiley.com/WileyCDA/WileyTitle/productCd-0471330361.html>,
Wiley, May 2001.
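
As a rough illustration of why the 16MB default matters with many sockets,
and of how window scaling relates to buffer size, the arithmetic can be
sketched as below. The socket count, link speed, and RTT are assumed
example values, and the memory figure is only an upper bound (buffers are
limits, not preallocated memory). The cap of 14 on the shift is the limit
from RFC 7323.

```python
# Back-of-the-envelope sizing. All inputs in the example section are
# assumptions for illustration, not measurements.

def worst_case_buffer_bytes(num_sockets, rcvbuf_bytes, sndbuf_bytes):
    """Upper bound on buffer memory if every socket filled both buffers."""
    return num_sockets * (rcvbuf_bytes + sndbuf_bytes)


def bdp_bytes(bandwidth_bits_per_s, rtt_ms):
    """Bandwidth-delay product in bytes (integer arithmetic)."""
    return bandwidth_bits_per_s * rtt_ms // 8000


def window_scale_shift(window_bytes):
    """Smallest shift such that a 16-bit window scaled by it covers
    window_bytes; RFC 7323 caps the shift at 14."""
    shift = 0
    while (65535 << shift) < window_bytes and shift < 14:
        shift += 1
    return shift


MIB = 1024 * 1024

# Assumed example: 10,000 sockets with 16 MiB send and receive buffers.
total = worst_case_buffer_bytes(10_000, 16 * MIB, 16 * MIB)
print(f"worst case: {total / (1024 * MIB):.1f} GiB")

# Assumed example path: 1 Gbit/s with a 100 ms RTT.
bdp = bdp_bytes(1_000_000_000, 100)
print(f"BDP: {bdp} bytes, window scale shift: {window_scale_shift(bdp)}")
```

A buffer near the bandwidth-delay product of the paths actually served is
usually a more defensible starting point than a blanket 16MB.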

>   - SACK : why it's better. DSACK what it adds on top of SACK.
That's in the SACK docs, which aren't cited. Again, how is HTTP
different from any app?
>   - ECN : is it needed ? does it really work ? where does it cause issues ?
That's in the ECN docs, which aren't cited. Again, how is HTTP different
from any app?
>   - SYN cookies : benefits, risks
That's in RFC 4987, which at least IS cited. Again, how is HTTP
different from any app?

>   - TCP reuse/recycling : benefits, risks
Not sure what you mean here. There are a lot of docs on the issues with
persistent-HTTP vs per-connection HTTP.

>   - dealing with buffer bloat : tradeoffs between NIC-based acceleration
>     and pacing
Bufferbloat typically involves large *uplink* transfers and how they
interact with other uplink connections. Neither TCP nor HTTP is involved
in this really.

>   - what are orphans and why you should care about them in HTTP close mode
Orphaned TCP connections or orphaned HTTP processes?

>   - TCP fastopen : how does it work, what type of workload is improved,
>     what are the risks (ie: do not enable socket creation without cookie
>     by default just because you find it reduces your server load)

Another doc that exists.

>   - whether to choose a short or a large SYN backlog depending on your
>     workload (ie: do you prefer to process everything even if the dequeuing
>     is expensive or to drop early in order to recover fast).
>
> ... and probably many others that don't immediately come to mind. None
> of these was a real issue 20 years ago.

See above. Many were known around that time, but weren't documented in
detail (it took a while for a proper ref to SYN cookies, and the book I
wrote with Sterbenz came about because we'd seen wheels being
rediscovered for 15 years).

>  All of them became issues for
> many web server admins who just copy-paste random settings from various
> blogs found on the net who just copy the same stupidities over and over
> resulting in the same trouble being caused to each of their readers.

This doc is all over the place.

If you want a doc to advise web admins, do so. But most of the items
above aren't under admin control; they're buried in app and OS
implementations, and most have already evolved to do the right thing.
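
For instance, Nagle and socket sizing are set per socket by the
application rather than by an admin. A minimal sketch using the standard
Python socket module follows; the helper name and the 256 KiB sizes are
hypothetical, and the OS is free to round or cap the requested buffer
sizes, so the values read back may differ from those requested.

```python
import socket


def configure_socket(sock, nodelay=True, rcvbuf=None, sndbuf=None):
    """Hypothetical helper: apply per-socket settings an application
    controls, then report what the OS actually put in effect."""
    # TCP_NODELAY = 1 disables the Nagle algorithm on this socket.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1 if nodelay else 0)
    if rcvbuf is not None:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf)
    if sndbuf is not None:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, sndbuf)
    return {
        "nodelay": bool(sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)),
        "rcvbuf": sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF),
        "sndbuf": sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF),
    }


# No connection is needed to set or read these options.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    effective = configure_socket(s, nodelay=True, rcvbuf=262144, sndbuf=262144)
    print(effective)
```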

I agree that a summary of a *focused set* of these might be useful *as a
review* (note that review discussions usually include lots of refs). The
key question is "what is the focus":

    - HTTP/TCP interactions
    - server administration advice
    - ?

IMO, RFCs should focus on issues specific to the protocols and their
deployment - not general knowledge that exists in courses and textbooks.

>> I.e., at the most this is a man page (specific to an OS). At the least,
>> this isn't useful at all.
> As you can see above, nothing I cited was OS-specific but only workload
> specific. That's why I think that an informational RFC is much more suited
> to this than an OS-specific man page. The OS man page may rely on the RFC
> to propose various tuning profiles for different workloads however.

You have a good point that this is general info, but OS issues are not
in the scope of the IETF and there are courses and books that already
provide good advice on efficiently running web (and other) servers.

Joe

Received on Wednesday, 17 August 2016 18:32:45 UTC