Re: why not multiple, short-lived HTTP/2 connections? from Mike Belshe on 2014-06-25 (ietf-http-wg@w3.org from April to June 2014)

From: Mike Belshe <mike@belshe.com>
Date: Wed, 25 Jun 2014 08:47:44 -0700
To: Peter Lepeska <bizzbyster@gmail.com>
Cc: HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <CABaLYCu_iP5V_8g83St0aFWYYVxHP1QpEK3gwQ_o5V0LVCw6LQ@mail.gmail.com>
On Wed, Jun 25, 2014 at 7:56 AM, <bizzbyster@gmail.com> wrote:

> Thanks for all the feedback. I'm going to try to reply to Mike, Greg,
> Willy, and Guille in one post since a few of you made the same or similar
> points. My apologies in advance for the very long post.
>
> First, you should understand that I am building a browser and web server
> that use the feedback loop described here (
> http://caffeinatetheweb.com/baking-acceleration-into-the-web-itself/) to
> provide the browser with a set of hints inserted into the html that allow
> it to load the page much faster. I prefer subresource hints to server push
> because A) it works in coordination with the browser cache state and B)
> hints can be supplied for resources found on third party domains. But my
> hints also go beyond just supplying the browser with a list of URLs to
> fetch:
> http://lists.w3.org/Archives/Public/public-web-perf/2014Jun/0044.html.
> They also include an estimate of the size of the objects for instance.
>
> Okay so this is all relevant because it means that I often know the large
> number of objects (sometimes 50+) I need to fetch from a given server up
> front and therefore have to figure out the optimal way to retrieve these
> objects. Unlike Mike's tests, my tests have shown that a pool with multiple
> connections is faster than a single one, perhaps because my server hints
> allow me to know about a much larger number of URLs up front and because I
> often have expected object sizes.
>

With appropriate hand-crafted and customized server hints, I'm not
surprised that you can outpace a single connection in some scenarios.

But, the answer to "which is faster" will not be a boolean yes/no - you
have to look at all sorts of network conditions, including link speed, RTT,
and packet loss.

I chose how to optimize by studying how networks are evolving over time:
   a) we know that bandwidth is going up fairly rapidly to end users
   b) we know that RTT is not changing much, and on some links is going up
   c) packet loss is getting better, but is very difficult to pin down &
      even harder to appropriately model (tail drops vs random, etc)

So I chose to optimize assuming that (b) will continue to hold true.  In
your tests, you should try jacking the RTT up to 100ms+.  Average RTT to
Google is ~100ms (my data slightly old here).  If you're on a super fast
link, then sure, initcwnd will be your bottleneck, because RTT is not a
factor.  What RTTs did you simulate?  I'm guessing you were using a
high-speed local network?

Overall, the single connection has some drawbacks on some networks, but by
and large it works better while also providing real server efficiencies and
finally giving the transport the opportunity to do its job better.  When we
split onto zillions of connections, we basically sidestep all of the
transport layer's goodness.  (This is a complex topic too, however).


> If I need to fetch 6 small objects (each the size of a single full packet)
> from a server that has an initcwnd of 3,
>

Why use a server with an initcwnd of 3?  Default Linux distros ship with 10
today (and have done so for like 2 years).




> I can request 3 objects on each of two connections and download those
> objects in a single round trip. This is not a theoretical idea -- I have
> tested this and I get the expected performance. In general, a pool of cold
> HTTP/2 connections is much faster than a single cold connection for
> fetching a large number of small objects, especially when you know the size
> up front. I will share the data and demo as soon as I'm able to.
>

Many have tested this heavily too, so I believe your results.  My own test
data fed into
https://developers.google.com/speed/articles/tcp_initcwnd_paper.pdf
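
The arithmetic in the example above (6 single-packet objects, initcwnd of 3
versus 10, one connection versus two) can be sketched roughly as follows.
This is a minimal illustration, not anyone's measured result: it assumes
idealized slow start (cwnd doubles every round trip, no loss), an even split
of packets across connections, and it ignores handshake round trips. The
function name round_trips is hypothetical.

```python
import math


def round_trips(packets: int, initcwnd: int, connections: int = 1) -> int:
    """Round trips needed to deliver `packets` full-size packets.

    Assumptions (idealized, for illustration only): no packet loss, cwnd
    doubles every round trip, packets are split evenly across connections,
    and connection setup (TCP/TLS handshake) is not counted.
    """
    if initcwnd <= 0 or connections <= 0:
        raise ValueError("initcwnd and connections must be positive")
    if packets <= 0:
        return 0
    # The slowest connection carries the largest share of the packets.
    per_connection = math.ceil(packets / connections)
    cwnd = initcwnd
    sent = 0
    trips = 0
    while sent < per_connection:
        sent += cwnd   # one full congestion window per round trip
        cwnd *= 2      # slow start: window doubles after each round trip
        trips += 1
    return trips


if __name__ == "__main__":
    print(round_trips(6, initcwnd=3, connections=1))   # 2
    print(round_trips(6, initcwnd=3, connections=2))   # 1
    print(round_trips(6, initcwnd=10, connections=1))  # 1
```

Under these assumptions, two connections with initcwnd 3 and one connection
with initcwnd 10 both finish in a single round trip; real networks with loss
and queuing will behave differently.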


> Since I know that multiple connections is faster, I can imagine a solution
> that web performance optimizers will resort to if browsers only support one
> connection per host: domain sharding! Let's avoid this by removing the
> SHOULD NOT from the spec.
>

I think we need to get into the nitty gritty benchmarking details if we
want to claim that a single connection is faster.  I highly doubt this is
true for all network types.



>
> "Servers must keep open idle connections, making load balancing more
> complex and creating DOS vulnerability." A few of you pointed out that the
> server can close them. That's true. I should not have said "must". But
> Mark's Ops Guide suggests that browsers will aggressively keep open idle
> connections for performance reasons, and that servers should support this
> by not closing these connections. And also servers should keep those
> connections fast by disabling slow start after idle. In my opinion,
> browsers should keep connections open only as long as they have the
> expectation of imminent requests to issue on those connections, which is
> essentially the way that mainstream browsers handle connection lifetimes
> for HTTP/1.1 connections today. We should not create an incentive for
> browsers to hold on to connections for longer than this and to encourage
> servers to support longer lived idle connections than they already do today.
>

What we really need is just a better transport.  We should have 'forever
connections'.  The idea that the endpoints need to maintain state to keep
connections open is so silly; session resumption without a round-trip is
very doable.  I believe QUIC does this :-)


>
> Some of you pointed out that a single connection allows us to get back to
> fair congestion control. But TCP slow start and congestion control are
> designed for transferring large objects. They unfairly penalize
> applications that need to fetch a large number of small objects. Are we
> overflowing router buffers today b/c we are using 6 connections per host? I
> agree that reducing that number is a good thing, which HTTP/2 will
> naturally enable. But I don't see any reason to throttle web browsers down
> to a single slow started connection. Also again, web browser and site
> developers will work around this artificial limit. In the future we will
> see 50+ Mbps last mile networks as the norm. This makes extremely fast page
> load times possible, if only we can mitigate the impact of latency by
> increasing the concurrency of object requests. I realize that QUIC may
> eventually solve this issue but in the meantime we need to be able to use
> multiple TCP connections to squeeze the most performance out of today's web.
>

A tremendous amount of research has gone into this, and you're asking good
questions to which nobody knows the exact answers.  It's not that we don't
know the answers for lack of trying - it's because there are so many
combinations of network equipment, speeds, configs, etc, in the real world
that all real world data is a mix of errors.  Given the research that has
gone into it, I wouldn't expect these answers to come crisply or quickly.

I agree we'll have 50+Mbps in the not-distant future.  But so far, there is
no evidence that we're figuring out how to bring RTTs down.  Hence, more
bandwidth doesn't matter much:
https://docs.google.com/a/chromium.org/viewer?a=v&pid=sites&srcid=Y2hyb21pdW0ub3JnfGRldnxneDoxMzcyOWI1N2I4YzI3NzE2
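
A rough way to see why RTT dominates once bandwidth is high is a
back-of-the-envelope model where load time is the number of serialized
round trips times the RTT, plus the raw transfer time. The sketch below is
only that: the function name, the 1 MB page size, and the 20 round trips are
hypothetical numbers chosen for illustration, not data from the linked
document.

```python
def load_time_s(total_bytes: int, bandwidth_bps: float,
                rtt_s: float, round_trips: int) -> float:
    """Crude page load estimate: latency-bound part + bandwidth-bound part.

    Assumptions (illustrative only): round trips are fully serialized,
    the link is otherwise idle, and there is no packet loss.
    """
    if bandwidth_bps <= 0:
        raise ValueError("bandwidth_bps must be positive")
    latency_part = round_trips * rtt_s
    transfer_part = (total_bytes * 8) / bandwidth_bps
    return latency_part + transfer_part


if __name__ == "__main__":
    page = 1_000_000  # hypothetical 1 MB page
    trips = 20        # hypothetical number of serialized round trips
    # 10 Mbps at 100 ms RTT
    print(f"{load_time_s(page, 10e6, 0.100, trips):.2f}")  # 2.80
    # 5x the bandwidth, same RTT
    print(f"{load_time_s(page, 50e6, 0.100, trips):.2f}")  # 2.16
    # same bandwidth, half the RTT
    print(f"{load_time_s(page, 10e6, 0.050, trips):.2f}")  # 1.80
```

With these made-up inputs, a fivefold bandwidth increase saves less time
than halving the RTT, which is the shape of the argument being made here.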

Mike




>
> Thanks for reading through all this,
>
> Peter
>
> On Jun 24, 2014, at 3:55 PM, Mike Belshe <mike@belshe.com> wrote:
>
>
>
>
> On Tue, Jun 24, 2014 at 10:50 AM, <bizzbyster@gmail.com> wrote:
>
>> I've raised this issue before on the list but it's been a while and
>> reading Mark's ops guide doc (
>> https://github.com/http2/http2-spec/wiki/Ops) I'm reminded that
>> requiring the use of a single connection for HTTP/2 ("Clients SHOULD NOT
>> open more than one HTTP/2 connection") still makes no sense to me. Due
>> to multiplexing, HTTP/2 will naturally use FEWER connections than HTTP/1,
>> which is a good thing, but requiring a single connection has the following
>> drawbacks:
>>
>>
>>    1. Servers must keep open idle connections, making load balancing
>>    more complex and creating DOS vulnerability.
>>
>>
> As others have mentioned, you don't have to do this.
>
>>
>>    1. Servers must turn off *tcp_slow_start_after_idle* in order for
>>    browsers to get good performance, again creating DOS vulnerability.
>>
> You also don't have to do this; it will drop back to initcwnd levels if
> you do, just as though you had opened a fresh connection.
>
>
>>
>>    1. The number of simultaneous GET requests I'm able to upload in the
>>    first round trip is limited to the compressed amount that can fit in a
>>    single initcwnd. Yes compression helps with this but if I use multiple
>>    connections I will get the benefit of compression for the requests on the
>>    same connection, in addition to having multiple initcwnds!
>>
> It turns out that a larger initcwnd just works better anyway - there was
> a tremendous amount of evidence supporting going up to 10, and that was
> accepted at the transport level already.
>
>
>>
>>    1. The amount of data I'm able to download in the first round trip is
>>    limited to the amount that can fit in a single initcwnd.
>>
> It turns out the browser doesn't really know how many connections to open
> until that first resource is downloaded anyway.  Many out-of-band tricks
> exist.
>
>
>>
>>    1. Head of line blocking is exacerbated by putting all objects on a
>>    single connection.
>>
> Yeah, this is true.  But overall, it's still faster and more efficient.
>
>
>
>
>>
>> Multiple short-lived HTTP/2 connections gives us all the performance
>> benefits of multiplexing without any of the operational or performance
>> drawbacks. As a proxy and a browser implementor, I plan to use multiple
>> HTTP/2 connections when talking to HTTP/2 servers because it seems like the
>> right thing to do from a performance, security, and operational perspective.
>>
>
> When I tested the multi-connection scenarios they were all slower for me.
> As expected, in cases of severe packet loss the difference was difficult
> to discern.  But overall, the reduced server resource use and the
> efficiency outweighed the negatives.
>
> Mike
>
>
>
>>
>> I know it's very late to ask this but can we remove the "SHOULD NOT"
>> statement from the spec? Or, maybe soften it a little for those of us who
>> cannot understand why it's there?
>>
>> Thanks,
>>
>> Peter
>>
>
>
>
Received on Wednesday, 25 June 2014 15:48:12 UTC