- From: Kazuho Oku <kazuhooku@gmail.com>
- Date: Fri, 2 Sep 2016 06:33:34 +0900
- To: Tom Bergan <tombergan@chromium.org>
- Cc: HTTP Working Group <ietf-http-wg@w3.org>
Hi,

Thank you for your response.

2016-09-02 1:38 GMT+09:00 Tom Bergan <tombergan@chromium.org>:
> Thanks for the feedback and link to that workshop talk! A few comments
> inline.
>
> On Wed, Aug 31, 2016 at 9:57 PM, Kazuho Oku <kazuhooku@gmail.com> wrote:
>>
>> Consider the case where a large HTML document that loads a CSS file
>> is sent over the wire. In a typical implementation, the server will
>> pass a block of HTML much larger than INITCWND to the TCP stack
>> before recognizing the request for the CSS. So the client would need
>> to wait for multiple RTTs before starting to receive the CSS.
>
> Unrelated to your above comment -- I think servers should use a higher
> initcwnd with H2, and I know that some servers do this. The
> experiments in our doc used Linux's default initcwnd (10 packets). If
> you compare that to H1, where browsers use 6 concurrent connections,
> the effective initcwnd for H1 is 60 packets (well, not exactly, since
> the browser only makes one request initially, but as soon as the
> browser starts making additional requests, cwnd effectively grows much
> faster than it would with a single connection).
>
>> That said, as discussed at the workshop, it is possible to implement
>> an HTTP/2 server that does not get affected by HoB between the
>> different streams (see
>> https://github.com/HTTPWorkshop/workshop2016/blob/master/talks/tcpprog.pdf).
>>
>> I would suggest that regardless of whether or not push is used,
>> server implementors should consider adopting such an approach to
>> minimize the impact of HoB.
>
> This is really interesting. To summarize: the idea is to use
> getsockopt to compute the number of available bytes in cwnd so that
> sizeof(kernel buffer) = cwnd. I rejected this idea without thinking
> about it much because it seemed like it would increase kernel/user
> round-trips and perform poorly in bursty conditions. But, your idea to
> restrict this optimization to cases where it matters most makes sense.
> Do you have performance measurements of this idea under heavy load?

Unfortunately not. I agree that it would be interesting to collect
metrics based on real workloads, both on the client side and the server
side.

OTOH let me note that since we enable the optimization only for
connections whose RTT is substantially higher than the time spent by a
single iteration of the event loop, we expect no performance penalty
when facing a burst; the server would simply fall back to the ordinary
way of writing.

> Are you using TCP_NOTSENT_LOWAT for cases where the optimization
> cannot be used?

No. I am not sure if restricting the amount of unsent data to a fixed
value is generally a good thing, or whether it has a practical impact
on performance. Personally, for connections that have left the
slow-start phase, I prefer an amount proportional to the current CWND
value, which IIRC is the default behavior of Linux.
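For concreteness, here is a minimal sketch of the approach being
discussed, assuming Linux (the helper name is mine, not code from any
particular server). TCP_INFO exposes the congestion window and the
number of unacknowledged packets; their difference, converted to bytes
via the MSS, is how much the server can hand to the kernel without the
data queuing behind previously written streams:

    /* Sketch of the "sizeof(kernel buffer) = cwnd" idea on Linux.
     * Not taken from any server; the helper name is invented. */
    #include <netinet/in.h>
    #include <netinet/tcp.h>   /* TCP_INFO, struct tcp_info */
    #include <sys/socket.h>
    #include <sys/types.h>

    /* Returns the number of bytes that fit into the unused portion
     * of CWND, or -1 on error. */
    static ssize_t bytes_sendable_within_cwnd(int fd)
    {
        struct tcp_info ti;
        socklen_t len = sizeof(ti);

        if (getsockopt(fd, IPPROTO_TCP, TCP_INFO, &ti, &len) != 0)
            return -1;
        /* tcpi_snd_cwnd and tcpi_unacked are counted in packets;
         * convert the unused window to bytes using the send MSS. */
        if (ti.tcpi_snd_cwnd <= ti.tcpi_unacked)
            return 0; /* CWND full; keep frames in userland for now */
        return (ssize_t)(ti.tcpi_snd_cwnd - ti.tcpi_unacked)
               * ti.tcpi_snd_mss;
    }

A server applying the optimization would generate at most this many
bytes of HTTP/2 frames per event-loop iteration, so that a request
(e.g. for the CSS) arriving midway can be reprioritized in userland
instead of waiting behind HTML already committed to the kernel buffer.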
>> It should also be noted that with QUIC such HoB would not be an
>> issue, since there would no longer be a send buffer within the
>> kernel.
>
> Yep, this is definitely an advantage of QUIC.
>
>> "Rule 2: Push Resources in the Right Order"
>>
>> My take is that the issue can / should be solved by clients sending
>> PRIORITY frames for pushed resources when they observe how the
>> resources are used, and that until then servers should schedule the
>> pushed streams separately from the client-driven prioritization tree
>> (built by using the PRIORITY frames).
>>
>> Please refer to the discussion in the other thread for details:
>> https://lists.w3.org/Archives/Public/ietf-http-wg/2016JulSep/0453.html
>
> To make sure I understand the idea: Suppose you send HTML, then push
> resources X and Y. You will continue pushing X and Y until you get
> requests from the client, at which point you switch to serving
> requests made by the client (which may or may not include X and Y, as
> the client may not know about those resources yet, depending on what
> you decided to push). These client requests are served via the
> client-driven priority tree.
>
> Is that right? If so, you've basically implemented rule #1

Actually not. My interpretation of rule #1 (or of the solution proposed
for it) was that it discusses the impact of TCP-level head-of-line
blocking, whereas rule #2 seemed to discuss the issues caused by pushed
streams not being appropriately prioritized against the pulled streams.
And the solution for rule #2 that I revisited here was for a server to
prioritize _some_ of the pushed streams outside the client-driven
priority tree.

I am not copy-pasting the scheme described in
https://lists.w3.org/Archives/Public/ietf-http-wg/2016JulSep/0453.html
for fear that doing so might lose context, but as an example, it would
go like this.

Suppose you are sending HTML (in response to a pull), as well as
pushing two asset files: a CSS file and an image. Of the two assets, it
is fair for a server to anticipate that the CSS is likely to block the
rendering of the HTML. Therefore, the server sends the CSS before the
HTML (but does not send a PRIORITY frame for the CSS, since the
PRIORITY frame is a tool for controlling client-driven prioritization).
OTOH the image is not likely to block the rendering. Therefore, it is
scheduled as specified by the HTTP/2 specification (so that it would be
sent after the HTML).

This out-of-client-driven-prioritization-tree scheduling should be
performed until the server receives a PRIORITY frame adjusting the
precedence of a pushed stream. At that point, the server should
reprioritize the pushed stream (i.e. the CSS) if it considers the
client's knowledge of how the streams should be prioritized superior to
its own.
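To illustrate (all names below are invented for the example; this is
not H2O's actual code), such a scheduler could be sketched as:

    /* Sketch: pushed streams marked "blocking" (e.g. CSS) are kept
     * in an out-of-tree FIFO and drained before anything in the
     * client-driven priority tree; a PRIORITY frame for a pushed
     * stream hands it over to the tree. */
    #include <stddef.h>

    typedef struct stream {
        int id;
        struct stream *next; /* intrusive list link */
    } stream_t;

    typedef struct {
        stream_t *blocking_pushes; /* out-of-tree queue */
        /* the ordinary HTTP/2 priority tree would live here */
    } scheduler_t;

    /* Assumed to be provided by the priority-tree implementation. */
    extern stream_t *tree_pop_highest(scheduler_t *s);
    extern void tree_insert(scheduler_t *s, stream_t *st);

    static stream_t *next_stream_to_send(scheduler_t *s)
    {
        if (s->blocking_pushes != NULL) { /* CSS etc. go first */
            stream_t *st = s->blocking_pushes;
            s->blocking_pushes = st->next;
            return st;
        }
        /* everything else (incl. images pushed after the HTML) */
        return tree_pop_highest(s);
    }

    static void on_priority_frame(scheduler_t *s, stream_t *st)
    {
        /* The client now knows how it wants the pushed stream
         * prioritized; defer to it by moving the stream into the
         * client-driven tree. */
        stream_t **p;
        for (p = &s->blocking_pushes; *p != NULL; p = &(*p)->next)
            if (*p == st) {
                *p = st->next;
                tree_insert(s, st);
                return;
            }
    }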
> -- the push lasts while the network is idle, then you switch to
> serving client requests afterwards. It's nice to see that we came to
> the same high-level conclusion :-). But, I like the way you've phrased
> the problem. Instead of computing a priori how much data you should
> push, which we suggested, you start pushing an arbitrary number of
> things, then you'll automatically stop pushing as soon as you get the
> next client request.
>
> One more clarification: what happens when the client loads two pages
> concurrently and the network is effectively never idle? I assume push
> won't happen in this case?
>
> Next, I think you're arguing that push order doesn't matter as long as
> you have a solution for HoB. I don't think this is exactly right.
> Specifically:
>
> - Head-of-line blocking (HoB) can happen due to network-level
> bufferbloat. The above solution only applies to kernel-level
> bufferbloat. You need some kind of bandwidth-based pacing to avoid
> network-level bufferbloat.

That's correct. OTOH I would like to point out that the issue is not
specific to push. A client issues requests in the order it notices the
URLs it should fetch, and it cannot update the priority of the links
found in LRP (link rel=preload) headers until it observes how each
resource is actually used. So if preload links include low-priority
assets, bufferbloat can (or will) cause issues for both pull and push.

> - If you're pushing X and Y, and you know the client will use X before
> Y, you should push in that order. The opposite order is sub-optimal
> and can eliminate the benefit of push in some cases, even ignoring
> HoB.

Agreed. And my understanding is that both Apache and H2O do this, based
on the content-type of the pushed response.

Just having two (or three) levels of precedence (send before HTML vs.
send along with HTML vs. send after HTML) is not as complex as what
HTTP/2's prioritization tree provides, but I think it is sufficient for
optimizing the time spent until first render.

Where Apache and H2O disagree is on how best to prioritize the blocking
assets (i.e. assets that need to be sent before the HTML, e.g. CSS).
My proposal (and what H2O does in that respect) is that a server should
schedule such pushed streams outside the prioritization tree (i.e. my
response for rule #2).
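As a sketch of what the content-type-based classification might look
like (my illustration, not the actual Apache or H2O logic; the names
are invented):

    /* Map the Content-Type of a pushed response to one of the
     * precedence levels mentioned above. */
    #include <string.h>

    typedef enum {
        PUSH_BEFORE_HTML, /* blocks rendering; e.g. CSS */
        PUSH_WITH_HTML,   /* interleave with the HTML */
        PUSH_AFTER_HTML   /* does not block rendering; e.g. images */
    } push_precedence_t;

    static push_precedence_t precedence_of(const char *content_type)
    {
        if (strncmp(content_type, "text/css", 8) == 0)
            return PUSH_BEFORE_HTML;
        if (strncmp(content_type, "image/", 6) == 0)
            return PUSH_AFTER_HTML;
        return PUSH_WITH_HTML;
    }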
>> As a server implementor, I have always dreamt of cancelling a push
>> after sending a PUSH_PROMISE. In case a resource we want to push
>> exists on a dedicated cache that cannot be reached synchronously from
>> the HTTP/2 server, the server needs to send the PUSH_PROMISE without
>> the guarantee that it will be able to push a valid response.
>>
>> It would be great if we could have an error code that can be sent
>> using RST_STREAM to notify the client that it should discard the
>> PUSH_PROMISE being sent, and issue the request by itself.
>
> Yes, +1. I've wanted this feature. It sucks that the client won't
> reissue the requests if they get an RST_STREAM. (At least, Chrome
> won't do this; I don't know about other browsers.)

-- 
Kazuho Oku

Received on Thursday, 1 September 2016 21:34:06 UTC