Re: p1-message-07 S 7.1.4 from Jamie Lokier on 2009-07-21 (ietf-http-wg@w3.org from July to September 2009)

From: Jamie Lokier <jamie@shareable.org>
Date: Tue, 21 Jul 2009 13:50:03 +0100
To: Adrien de Croy <adrien@qbik.com>
Cc: Mark Nottingham <mnot@mnot.net>, Henrik Nordstrom <henrik@henriknordstrom.net>, HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <20090721125003.GF20756@shareable.org>

Adrien de Croy wrote:
> 
> is this for backbone and infrastructure traffic you mean?
> 
> In which case, removing spam would be a good start.  Effectively at 
> least double the width of all pipes.

Last time I checked, the big backbones kept traffic at about 5% of the
available bandwidth.  That means 95% unused.  It's for a reason.

It turns out that as you get closer to filling a pipe, the average and
peak _delays_ increase enormously and it's not possible to do things
like internet telephony and video chat...
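
A rough sketch of that effect, assuming the link behaves like a textbook
M/M/1 queue (an idealization that is not claimed in this message; real
backbone traffic is burstier). Under that assumption the mean delay grows
as 1 / (1 - utilization), so it is about 1.05x the unloaded value at 5%
load and about 100x at 99% load. The function name is illustrative only.

```python
def mm1_delay_factor(utilization: float) -> float:
    """Mean time in an M/M/1 system, relative to the unloaded service time.

    Assumes Poisson arrivals and exponential service times; this is a
    modelling assumption, not a measurement of any real backbone.
    """
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return 1.0 / (1.0 - utilization)


if __name__ == "__main__":
    # Delay stays nearly flat at low load and blows up close to 100%.
    for rho in (0.05, 0.50, 0.90, 0.99):
        print(f"{rho:.0%} full: {mm1_delay_factor(rho):.2f}x unloaded delay")
```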

In principle, QoS (quality of service) is a whole world of mechanisms
to permit the full bandwidth of a pipe to be used, while allocating
other factors such as latency and loss statistics to the connections
which need them.

But QoS is really very difficult to make work.  It's easier and
cheaper to have big pipes with 95% spare capacity.

It's a statistical thing...

> I would have thought slow-startup algorithms would also work against the 
> advantage of opening too many connections.

Not really, because TCP slow-start runs independently for each
connection.  Slow-start lets you build up the data rate (by growing
the congestion window) until you start getting packet loss.  If you
have many connections, they all build up until they get packet loss,
and then the total throughput rate is of a similar magnitude to using a
single connection - except you out-compete other people with fewer
connections.
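
The out-competing effect can be sketched with a deliberately simple
model: assume every TCP connection crossing a shared bottleneck ends up
with roughly the same rate (true only approximately, and only for
connections with similar round-trip times). The function name and the
numbers are hypothetical.

```python
def share_of_bottleneck(my_connections: int, other_connections: int) -> float:
    """Fraction of a shared bottleneck one user gets under per-connection
    fair sharing (an idealized assumption about TCP congestion control)."""
    if my_connections < 0 or other_connections < 0:
        raise ValueError("connection counts must be non-negative")
    total = my_connections + other_connections
    if total == 0:
        raise ValueError("at least one connection is required")
    return my_connections / total


if __name__ == "__main__":
    # Nine other users hold one connection each on the same bottleneck.
    for mine in (1, 6):
        print(f"{mine} connection(s): {share_of_bottleneck(mine, 9):.0%} of the link")
```

With one connection against nine others the user gets a tenth of the
link; with six connections the same user gets 6/15, i.e. 40%, without
the link carrying any more traffic in total.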

Two of the problems with using lots of TCP connections that I'm aware of:

  1. As already said, there's an incentive to use more connections
     than other people to get more bandwidth than other people.  As
     well as being unfair, it leads to a tragedy of the commons where the
     network performs poorly for everyone because everyone is trying to
     compete.

     That's more or less why TCP slow-start was invented.  Congestion
     collapse was the result of everyone having an incentive to
     retransmit their TCP packets too often, because doing so let them
     outperform other people.  With slow-start, they cooperate better.

  2. With lots of connections, you don't get any more throughput over
     a link which is all your own.  But you do get slightly worse
     throughput and worse average delays from the interference between
     each connection.  The delays are the main reason not to do it
     over your personal dedicated link, and why HTTP-over-SCTP (and
     its poor approximations, pipelining/multiplexing over TCP) would
     tend to be better than lots of connections.

Finally, a reason to avoid _lots_ of connections is the same reason why
we have the TCP slow-start algorithm:

  3. Congestion collapse.  Same reason you don't let applications
     "force" TCP to retry packets at high speed.

> Also, download managers 
> generally do multiple simultaneous range requests.  The more parts you 
> request, the more request/response overhead reduces your throughput, so 
> there's an incentive not to go over the top there as well.

Roughly speaking, people use download managers which open lots of
connections to get higher throughput with large files.

The fact that this works at all is an indication that something, somewhere
is broken.  Under normal circumstances, a single TCP connection will
attain close to the maximum throughput for a route.  Needing multiple
connections to download a single file is a sign that you're working
around artificial limits, such as policy limits or broken link bonding.
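
For reference, the "multiple simultaneous range requests" mentioned in
the quoted text amount to splitting the file into byte ranges and
sending one request per range. A minimal sketch of the splitting step
follows; the function name is hypothetical, and only the
`Range: bytes=first-last` header syntax (inclusive offsets) comes from
HTTP itself.

```python
def split_ranges(size: int, parts: int) -> list[str]:
    """Return Range header values covering `size` bytes in `parts` pieces.

    HTTP byte ranges are inclusive at both ends, so a piece of `length`
    bytes starting at `start` ends at `start + length - 1`.
    """
    if size <= 0 or parts <= 0:
        raise ValueError("size and parts must be positive")
    parts = min(parts, size)          # never create empty ranges
    base, extra = divmod(size, parts)
    values = []
    start = 0
    for i in range(parts):
        # The first `extra` pieces absorb the remainder, one byte each.
        length = base + (1 if i < extra else 0)
        end = start + length - 1
        values.append(f"bytes={start}-{end}")
        start = end + 1
    return values


if __name__ == "__main__":
    for value in split_ranges(10, 3):
        print("Range:", value)
```

Each extra range costs one more request and one more set of response
headers, which is the overhead the quoted text refers to.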

For small things (text-only web pages! :-) where response time is more
important than throughput, lots of connections has the opposite
effect.

But with web pages referencing lots of resources, because of the way
HTTP does not overlap things, a few more connections improve total
response time.  It still makes the response time for the main HTML
page longer though!  (But you don't see that.)

-- Jamie

Received on Tuesday, 21 July 2009 12:50:42 UTC