Re: why not multiple, short-lived HTTP/2 connections?

I'm very much in favor of QUIC. But it is not widely deployed and, as you say, I have no control over when it will be. In the meantime, we can make complex web pages load much faster than they do today. Why wait for QUIC?

Peter


On Jul 1, 2014, at 12:28 AM, William Chan (陈智昌) <willchan@chromium.org> wrote:

> On Mon, Jun 30, 2014 at 11:45 AM, <bizzbyster@gmail.com> wrote:
> Comments inline.
> 
> On Jun 30, 2014, at 2:21 PM, William Chan (陈智昌) <willchan@chromium.org> wrote:
> 
>> On Mon, Jun 30, 2014 at 11:14 AM, <bizzbyster@gmail.com> wrote:
>> "But simply opening up more connections is not the right solution. You can easily hit the other problem of too much congestion leading to way worse performance. "
>> 
>> That is true. But assuming you are able to detect that a browser is running over a 20 Mbps FIOS link, and that due to network wscale stripping it will never exceed a 64 KB window on a given TCP connection, are you saying that as a Chrome developer you would still recommend using just one connection? I have trouble understanding that.
>> 
>> As browser developers, we'll do so if we feel it's the best option. But I think we're still a good ways from that. Thankfully most of the internet does not have this wscale stripping problem, and we're discussing ways to encourage access network operators in the relevant regions to fix their deployments (talk to them, shame them, etc.). On almost all OSes, we can't actually detect that wscale stripping happened; it's not exposed to the application. And by the time we detect it has happened, we've already lost round trips.
> 
> Right. I'd like to see us figure out technologies that allow the Internet to self-heal. Servers should learn which access networks have bad properties and automatically perform actions to address them. In this case it'd be great if the protocol allowed some way for the server to tell the browser: hey, through no fault of your own you are running with one hand tied behind your back, so use two! As you know, I'm working on ideas (http://caffeinatetheweb.com/baking-acceleration-into-the-web-itself/) that will enable such a feedback loop. Shaming network operators as a way to fix a problem is something Google has in its toolbox. For the rest of us, we need technology to solve these problems.
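> 
> Purely for illustration (the header name and syntax below are hypothetical, not taken from any spec or from my write-up above), the hint could be as small as an advisory response header that the browser is free to ignore:
> 
>     Conn-Advice: max-connections=2; reason="wscale-stripped"
> 
> The point is only that the server, which sees the same symptom across many clients on the same access network, is well placed to feed that knowledge back to the browser.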
> 
>> 
>> Instead of mandating one connection because there are cases where multiple connections increase congestion and hurt performance, let's detect those cases and choose a single connection when that is what gives us the fastest page load time.
>> 
>> It's not mandated. It's a SHOULD, and rightly so.
> 
>>  
>> 
>> In other words, let's let the browser figure out the optimal number in each case. If we don't, then we encourage domain sharding for sites that decide they care more about their high bandwidth users.
>> 
>> And the optimal number SHOULD be 1. In almost all cases.
> 
> Over high-bandwidth links where I can keep multiple connections busy, because I know a large number of the resources I need up front and so avoid slow-start restarts after idle, one is never the optimal number. That's why SHOULD seems wrong to me. I'd like to encourage browsers to dynamically find the optimal number, not mandate a number that is so often going to lead to a slower user experience and, in the long term, encourage domain sharding.
> 
> I feel like the one thing I've said repeatedly that you have ignored (probably because you feel it's out of your control) is the proposal that we *fix the transport*. You keep advocating for application-level workarounds that you know are suboptimal. You feel like we browser vendors should be investing in a bunch of detection heuristics for these edge cases (and, fair enough, edge cases might be the *normal* case for certain users). The problem is that these detection heuristics aren't easy to implement, they're fragile, they take a while to trigger, and it's hard to determine how many connections to open without causing other problems. And multiple connections have their own downsides, which I've explained elsewhere. This is why we'd rather invest our energies in fixing the transport issues instead. We believe it's the right long-term solution.
> 
>  
> 
>>  
>> 
>> Thanks,
>> 
>> Peter
>> 
>> On Jun 30, 2014, at 1:40 PM, William Chan (陈智昌) <willchan@chromium.org> wrote:
>> 
>>> On Mon, Jun 30, 2014 at 9:58 AM, Patrick McManus <mcmanus@ducksong.com> wrote:
>>> 
>>> 
>>> 
>>> On Mon, Jun 30, 2014 at 12:04 PM, <bizzbyster@gmail.com> wrote:
>>> All,
>>> 
>>> Another huge issue is that for some reason I still see many TCP connections that do not advertise support for window scaling in the SYN packet. I'm really not sure why this is, but, for instance, the WPT test instances are running Windows 7 and yet they do not advertise window scaling, so TCP connections max out at a 64 KB send window. I've seen this in tests run from multiple different WPT test locations.
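>>> 
>>> As a rough way to check this from a client-side capture (a sketch only: it assumes scapy is installed and that "trace.pcap" is the tcpdump output from a run like the ones below):
>>> 
>>>     from scapy.all import rdpcap, TCP
>>> 
>>>     for pkt in rdpcap("trace.pcap"):
>>>         if TCP in pkt and pkt[TCP].flags & 0x02:              # SYN bit set
>>>             names = [opt[0] for opt in pkt[TCP].options]      # e.g. ['MSS', 'NOP', 'WScale', ...]
>>>             if "WScale" not in names:
>>>                 print("SYN without window scaling:", pkt[TCP].sport, "->", pkt[TCP].dport)
>>> 
>>> Any SYN reported here belongs to a connection that can never advertise more than 64 KB of receive window.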
>>> 
>>> It's true that TCP window scaling can be a problem. We definitely see this issue in a number of places around the world, most prominently in APAC at certain ISPs (due to network wscale stripping, UGH!). But simply opening up more connections is not the right solution. You can easily hit the other problem of too much congestion leading to way worse performance. I talk about these multiple connection and congestion issues at https://insouciant.org/tech/network-congestion-and-web-browsing/ and provide several example traces of problematic congestion. Fundamentally, this is a transport issue and we should be fixing the transport. Indeed, we're working on this at Google, both with our Make TCP Fast team and our QUIC team.
>>>  
>>> 
>>> The impact of this is that high latency connections max out at very low throughputs. Here's an example (with tcpdump output so you can examine the TCP flow on the wire) where I download data from a SPDY-enabled web server in Virginia from a WPT test instance running in Sydney: http://www.webpagetest.org/result/140629_XG_1JC/1/details/. Average throughput is not even 3 Mbps despite the fact that I chose a 20 Mbps FIOS connection for my test. Note that when I disable SPDY on this web server, I render the page almost twice as fast because I am using multiple connections and therefore overcoming the per connection throughput limitation: http://www.webpagetest.org/result/140629_YB_1K5/1/details/.
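>>> 
>>> A back-of-the-envelope calculation (the ~230 ms Sydney-to-Virginia round trip is my assumption, not taken from the trace) shows why the link speed barely matters here: with window scaling off, at most ~64 KB can be in flight per connection, so per-connection throughput is capped at roughly window / RTT.
>>> 
>>>     window_bytes = 64 * 1024      # max receive window without window scaling
>>>     rtt_seconds = 0.23            # assumed Sydney <-> Virginia round-trip time
>>>     max_mbps = window_bytes * 8 / rtt_seconds / 1e6
>>>     print(f"~{max_mbps:.1f} Mbps per connection")   # ~2.3 Mbps, consistent with the <3 Mbps above
>>> 
>>> Several such capped connections running in parallel can together use a much larger share of the 20 Mbps link, which is consistent with the faster non-SPDY run.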
>>> 
>>> I don't know the root cause (Windows 7 definitely sends the window scaling option in the SYN in other tests) and have sent a note to the webpagetest.org admin, but in general there are reasons why even Windows 7 machines sometimes appear not to use window scaling, causing single-connection SPDY to perform really badly even beyond the slow-start phase.
>>> 
>>> 
>>> I think this is a WPT issue you should take up off-list because, IIRC, the issue would just be in the application. It's not an OS or infrastructure thing we'll need to cope with.
>>> 
>>> I agree it's probably specific to WPT. Here's the cloudshark trace for the same WPT run (http://www.webpagetest.org/result/140630_HY_SGK/) using a different Chrome instance (from Dulles, VA): https://www.cloudshark.org/captures/0bd0a7aa3a49?filter=tcp.flags.syn%3D%3D1. As you can see, the window scaling option is on there, and the packet trace was taken at the client. That lends credence to the hypothesis that the problem is local to the Sydney WPT Chrome instance in your test run.
>>>  
>>> 
>>> IIRC, when I last looked at it, if you set an explicit SO_RCVBUF on your socket before opening it on Windows 7, the stack would choose the smallest scaling factor able to accommodate your desired window (so if you set it to 64 KB or less, scaling would be disabled). Of course, there is no way to renegotiate scaling, so that sticks with you for the life of the connection no matter what you set RCVBUF to along the way. I believe the correct fix is "don't do that," and any new protocol implementation should be able to take that into consideration.
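>>> 
>>> A minimal sketch of that pitfall (this reflects the Windows 7 behavior as recalled above, not something verified here), shown with Python's socket API:
>>> 
>>>     import socket
>>> 
>>>     s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
>>>     # Forcing a small receive buffer before connecting lets the stack pick the
>>>     # smallest window scale that fits it; for <= 64 KB that is a scale of 0,
>>>     # i.e. no window scaling, and it cannot be renegotiated later.
>>>     s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 64 * 1024)
>>>     s.connect(("example.com", 443))
>>>     # The "don't do that" fix: leave SO_RCVBUF at the OS default (or set it
>>>     # generously) before connect, so a useful window scale goes out in the SYN.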
>>> 
>>> but maybe my info is dated.
>>> 
>>> -P
>>> 
>>> 
>>> 
>> 
>> 
> 
> 

Received on Tuesday, 1 July 2014 18:49:31 UTC