W3C home > Mailing lists > Public > ietf-http-wg@w3.org > April to June 2014

Re: why not multiple, short-lived HTTP/2 connections?

From: <bizzbyster@gmail.com>
Date: Wed, 25 Jun 2014 10:56:00 -0400
Cc: HTTP Working Group <ietf-http-wg@w3.org>
Message-Id: <692E94AC-1413-4C2A-ADDF-9D945233094A@gmail.com>
To: Mike Belshe <mike@belshe.com>
Thanks for all the feedback. I'm going to try to reply to Mike, Greg, Willy, and Guille in one post since a few of you made the same or similar points. My apologies in advance for the very long post.

First, you should understand that I am building a browser and web server that use the feedback loop described here (http://caffeinatetheweb.com/baking-acceleration-into-the-web-itself/) to provide the browser with a set of hints, inserted into the HTML, that allow it to load the page much faster. I prefer subresource hints to server push because A) they work in coordination with the browser's cache state and B) hints can be supplied for resources found on third-party domains. But my hints also go beyond just supplying the browser with a list of URLs to fetch: http://lists.w3.org/Archives/Public/public-web-perf/2014Jun/0044.html. For instance, they also include an estimate of each object's size.

Okay, so this is all relevant because it means that I often know up front the large number of objects (sometimes 50+) I need to fetch from a given server, and therefore have to figure out the optimal way to retrieve them. Unlike Mike's tests, my tests have shown that a pool of multiple connections is faster than a single one, perhaps because my server hints let me know about a much larger number of URLs up front, and because I often have expected object sizes. If I need to fetch 6 small objects (each the size of a single full packet) from a server that has an initcwnd of 3, I can request 3 objects on each of two connections and download all of them in a single round trip. This is not a theoretical idea -- I have tested it and I get the expected performance. In general, a pool of cold HTTP/2 connections is much faster than a single cold connection for fetching a large number of small objects, especially when you know the sizes up front. I will share the data and demo as soon as I'm able to.
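The round-trip arithmetic above can be sketched as a toy model. This is my own illustrative function, assuming single-packet objects, an idealized slow start that doubles cwnd every round trip, and no loss:

```python
import math

def rtts_to_fetch(num_objects, connections, initcwnd):
    """Round trips needed to fetch num_objects single-packet objects,
    spread evenly over `connections` cold connections, each starting
    with a window of `initcwnd` packets that doubles every RTT."""
    per_conn = math.ceil(num_objects / connections)  # objects per connection
    rtts, delivered, cwnd = 0, 0, initcwnd
    while delivered < per_conn:
        delivered += cwnd  # one window's worth of packets arrives
        cwnd *= 2          # idealized slow-start growth
        rtts += 1
    return rtts

print(rtts_to_fetch(6, 1, 3))  # single connection: 2 round trips
print(rtts_to_fetch(6, 2, 3))  # two connections: 1 round trip
```

Under the same assumptions, 50 small objects on one connection with initcwnd 3 would take 5 round trips, but only 2 round trips spread over six connections.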

Since I know that multiple connections are faster, I can imagine the workaround web performance optimizers will resort to if browsers only support one connection per host: domain sharding! Let's avoid this by removing the SHOULD NOT from the spec.

"Servers must keep open idle connections, making load balancing more complex and creating DOS vulnerability." A few of you pointed out that the server can close them. That's true; I should not have said "must". But Mark's Ops Guide suggests that browsers will aggressively keep idle connections open for performance reasons, that servers should support this by not closing those connections, and that servers should keep those connections fast by disabling slow start after idle. In my opinion, browsers should keep connections open only as long as they expect to issue imminent requests on them, which is essentially how mainstream browsers handle connection lifetimes for HTTP/1.1 today. We should not create an incentive for browsers to hold on to connections longer than this, nor encourage servers to support longer-lived idle connections than they already do today.
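For reference, the server-side knob in question on Linux is a single sysctl; this is the setting operators would disable so that a connection parked idle does not fall back to initcwnd on its next request:

```shell
# Linux: don't reset cwnd toward initcwnd after the connection sits idle
# (RFC 2861 behavior). The default is 1; 0 disables the reset.
sysctl -w net.ipv4.tcp_slow_start_after_idle=0
```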

Some of you pointed out that a single connection allows us to get back to fair congestion control. But TCP slow start and congestion control are designed for transferring large objects; they unfairly penalize applications that need to fetch a large number of small objects. Are we overflowing router buffers today because we are using 6 connections per host? I agree that reducing that number is a good thing, which HTTP/2 will naturally enable. But I don't see any reason to throttle web browsers down to a single slow-started connection. And again, web browser and site developers will simply work around this artificial limit. In the future, 50+ Mbps last-mile networks will be the norm. This makes extremely fast page load times possible, if only we can mitigate the impact of latency by increasing the concurrency of object requests. I realize that QUIC may eventually solve this issue, but in the meantime we need to be able to use multiple TCP connections to squeeze the most performance out of today's web.
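To make the last point concrete, here is a back-of-the-envelope calculation with my own illustrative numbers (not measurements), showing that on a fast last-mile link the sequential round trips, not the bandwidth, dominate page load time:

```python
def load_time_ms(page_bytes, link_mbps, round_trips, rtt_ms):
    """Crude model: serialization delay plus a fixed number of
    sequential round trips; ignores slow start and server think time."""
    serialization_ms = page_bytes * 8 * 1000 / (link_mbps * 1_000_000)
    return serialization_ms + round_trips * rtt_ms

# A 1 MB page on a 50 Mbps link takes only 160 ms to serialize,
# but 5 sequential round trips at 100 ms RTT add another 500 ms.
print(load_time_ms(1_000_000, 50, 5, 100))
```

In this sketch roughly three quarters of the load time is pure latency, which is exactly the cost that more request concurrency can hide.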

Thanks for reading through all this,

Peter

On Jun 24, 2014, at 3:55 PM, Mike Belshe <mike@belshe.com> wrote:

> 
> 
> 
> On Tue, Jun 24, 2014 at 10:50 AM, <bizzbyster@gmail.com> wrote:
> I've raised this issue before on the list but it's been a while and reading Mark's ops guide doc (https://github.com/http2/http2-spec/wiki/Ops) I'm reminded that requiring the use of a single connection for HTTP/2 ("Clients SHOULD NOT open more than one HTTP/2 connection") still makes no sense to me. Due to multiplexing, HTTP/2 will naturally use FEWER connections than HTTP/1, which is a good thing, but requiring a single connection has the following drawbacks:
> 
> Servers must keep open idle connections, making load balancing more complex and creating DOS vulnerability.
> 
> As others have mentioned, you don't have to do this. 
> Servers must turn off tcp_slow_start_after_idle in order for browsers to get good performance, again creating DOS vulnerability.
> You also don't have to do this; if you leave it on, the connection will drop back to initcwnd levels after idle, just as though you had opened a fresh connection.
>  
> The number of simultaneous GET requests I'm able to upload in the first round trip is limited to the compressed amount that can fit in a single initcwnd. Yes, compression helps with this, but if I use multiple connections I will get the benefit of compression for the requests on the same connection, in addition to having multiple initcwnds!
> It turns out that a larger initcwnd just works better anyway - there was a tremendous amount of evidence supporting going up to 10, and that change was already accepted at the transport level.
>  
> The amount of data I'm able to download in the first round trip is limited to the amount that can fit in a single initcwnd.
> It turns out the browser doesn't really know how many connections to open until that first resource is downloaded anyway.  Many out-of-band tricks exist.
>  
> Head of line blocking is exacerbated by putting all objects on a single connection.
> Yeah, this is true.  But overall, it's still faster and more efficient.
> 
> 
>  
> 
> Multiple short-lived HTTP/2 connections gives us all the performance benefits of multiplexing without any of the operational or performance drawbacks. As a proxy and a browser implementor, I plan to use multiple HTTP/2 connections when talking to HTTP/2 servers because it seems like the right thing to do from a performance, security, and operational perspective.
> 
> When I tested the multi-connection scenarios they were all slower for me.  In cases of severe packet loss it was difficult to discern a difference, as expected.  But overall, the reduced server resource use and the efficiency outweighed the negatives.
> 
> Mike
> 
>  
> 
> I know it's very late to ask this but can we remove the "SHOULD NOT" statement from the spec? Or, maybe soften it a little for those of us who cannot understand why it's there?
> 
> Thanks,
> 
> Peter
> 


Received on Wednesday, 25 June 2014 14:56:31 UTC
