- From: Bryan McQuade <bmcquade@google.com>
- Date: Sun, 04 Dec 2016 12:45:10 +0000
- To: Kazuho Oku <kazuhooku@gmail.com>, Tom Bergan <tombergan@chromium.org>
- Cc: HTTP Working Group <ietf-http-wg@w3.org>
- Message-ID: <CADLGQyAm5yTr+RFEnZ=RhwRg6S42CY=OooRTw8DA9ta_YDiihA@mail.gmail.com>
Here's a new article on H2 push from today's perf calendar which goes into a
good bit of detail: http://calendar.perfplanet.com/2016/http2-push-the-details/

On Thu, Sep 1, 2016 at 5:39 PM Kazuho Oku <kazuhooku@gmail.com> wrote:
> Hi,
>
> Thank you for your response.
>
> 2016-09-02 1:38 GMT+09:00 Tom Bergan <tombergan@chromium.org>:
> > Thanks for the feedback and link to that workshop talk! A few comments
> > inline.
> >
> > On Wed, Aug 31, 2016 at 9:57 PM, Kazuho Oku <kazuhooku@gmail.com> wrote:
> >>
> >> Consider the case where a large HTML file that loads a CSS file is sent
> >> over the wire. In a typical implementation, the server will pass a block
> >> of HTML much larger than INITCWND to the TCP stack before recognizing
> >> the request for the CSS. So the client would need to wait for multiple
> >> RTTs before starting to receive the CSS.
> >
> > Unrelated to your above comment -- I think servers should use a higher
> > initcwnd with H2, and I know that some servers do this. The experiments
> > in our doc used Linux's default initcwnd (10 packets). If you compare
> > that to H1, where browsers use 6 concurrent connections, the effective
> > initcwnd for H1 is 60 packets (well, not exactly, since the browser only
> > makes one request initially, but as soon as the browser starts making
> > additional requests, cwnd effectively grows much faster than it would
> > with a single connection).
> >
> >> That said, as discussed at the workshop, it is possible to implement an
> >> HTTP/2 server that is not affected by HoB between the different streams
> >> (see
> >> https://github.com/HTTPWorkshop/workshop2016/blob/master/talks/tcpprog.pdf).
> >>
> >> I would suggest that regardless of whether or not push is used, server
> >> implementors should consider adopting such an approach to minimize the
> >> impact of HoB.
> >
> > This is really interesting. To summarize: the idea is to use getsockopt
> > to compute the number of available bytes in cwnd so that sizeof(kernel
> > buffer) = cwnd. I rejected this idea without thinking about it much
> > because it seemed like it would increase kernel/user round-trips and
> > perform poorly in bursty conditions. But, your idea to restrict this
> > optimization to cases where it matters most makes sense. Do you have
> > performance measurements of this idea under heavy load?
>
> Unfortunately not.
>
> I agree that it would be interesting to collect metrics based on real
> workloads, both on the client side and the server side.
>
> OTOH let me note that since we enable the optimization only for
> connections with an RTT substantially higher than the time spent by a
> single iteration of the event loop, we expect that there would be no
> performance penalty when facing a burst. The server would just switch
> to the ordinary way.
>
> > Are you using TCP_NOTSENT_LOWAT for cases where the optimization cannot
> > be used?
>
> No. I'm not sure if restricting the amount of unsent data to a fixed
> value is generally a good thing, or if it has a practical impact on
> performance.
>
> Personally, for connections that have left the slow-start phase, I
> prefer an amount calculated proportional to the current CWND value,
> which IIRC is the default behavior of Linux.
>
> >> It should also be noted that with QUIC such HoB would not be an issue,
> >> since there would no longer be a send buffer within the kernel.
> >
> > Yep, this is definitely an advantage of QUIC.
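To make the mechanism above concrete, here is a minimal sketch of the getsockopt-based idea, assuming Linux's TCP_INFO socket option; the helper name is invented, and this is an illustration rather than H2O's actual implementation:

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Sketch: ask the kernel how many bytes the congestion window can carry
 * right now, so the server never queues much more than cwnd worth of data
 * in the kernel send buffer (keeping HoB between streams to a minimum). */
static ssize_t bytes_available_in_cwnd(int fd)
{
    struct tcp_info ti;
    socklen_t len = sizeof(ti);

    if (getsockopt(fd, IPPROTO_TCP, TCP_INFO, &ti, &len) != 0)
        return -1; /* probe failed; fall back to ordinary buffered writes */

    /* snd_cwnd and unacked are reported in packets; convert to bytes
     * using the current sender MSS. */
    if (ti.tcpi_snd_cwnd <= ti.tcpi_unacked)
        return 0; /* window is full; wait for ACKs before writing more */

    return (ssize_t)(ti.tcpi_snd_cwnd - ti.tcpi_unacked) * ti.tcpi_snd_mss;
}
```

The TCP_NOTSENT_LOWAT alternative discussed in the exchange instead caps unsent data at a fixed value, e.g. `int lowat = 16384; setsockopt(fd, IPPROTO_TCP, TCP_NOTSENT_LOWAT, &lowat, sizeof(lowat));` on Linux, which is exactly the fixed-value behavior Kazuho says he is unsure about.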
> > > >> "Rule 2: Push Resources in the Right Order" > >> > >> My take is that the issue can / should be solved by clients sending > >> PRIORITY frames for pushed resources when they observe how the > >> resources are used, and that until then servers should schedule the > >> pushed streams separately from the client-driven prioritization tree > >> (built by using the PRIORITY frames). > >> > >> Please refer to the discussion in the other thread for details: > >> https://lists.w3.org/Archives/Public/ietf-http-wg/2016JulSep/0453.html > > > > > > To make sure I understand the idea: Suppose you send HTML, then push > > resources X and Y. You will continue pushing X and Y until you get > requests > > from the client, at which point you switch to serving requests made by > the > > client (which may or may not include X and Y, as the client may not know > > about those resources yet, depending on what you decided to push). These > > client requests are served via the client-driven priority tree. > > > > Is that right? If so, you've basically implemented rule #1 > > Actually not. > > My interpretation of rule #1 (or the solution proposed for rule #1) > was that it discusses the impact of TCP-level head-of-line blocking, > whereas rule #2 seemed to discuss the issues caused by pushed streams > not appropriately prioritized against the pulled streams. > > And the solution for rule #2 that I revisited here was for a server to > prioritize _some_ of the pushed streams outside the client-driven > priority tree. > > I am not copy-pasting the scheme described in > https://lists.w3.org/Archives/Public/ietf-http-wg/2016JulSep/0453.html > in fear that doing so might lose context, but as an example, it would > go like this. > > Suppose you are sending HTML (in response to a pull), as well as > pushing two asset files: one is CSS and one is an image. > > Among the two assets, it is fair for a server to anticipate that the > CSS is likely to block the rendering of the HTML. Therefore, the > server sends CSS before HTML (but does not send a PRIORITY frame for > the CSS, since PRIORITY frame is a tool for controlling client-driven > prioritization). OTOH an image is not likely to block the rendering. > Therefore, it is scheduled as specified by the HTTP/2 specification > (so that it would be sent after the HTML). > > This out-of-client-driven-priotization-tree scheduling should be > performed until a server receives a PRIORITY frame adjusting the > precedence of a pushed stream. At this point, a server should > reprioritize the pushed stream (i.e. CSS) if it considers client's > knowledge of how the streams should be prioritized superior to what > the server knows. > > > -- the push lasts > > while the network is idle, then you switch to serving client requests > > afterwards. It's nice to see that we came to the same high-level > conclusion > > :-). But, I like the way you've phrased the problem. Instead of > computing a > > priori out how much data you should push, which we suggested, you start > > pushing an arbitrary number of things, then you'll automatically stop > > pushing as soon as you get the next client request. > > > > One more clarification: what happens when the client loads two pages > > concurrently and the network is effectively never idle? I assume push > won't > > happen in this case? > > > > Next, I think you're arguing that push order doesn't matter as long as > you > > have a solution for HoB. I don't think this is exactly right. 
> > -- the push lasts while the network is idle, then you switch to serving
> > client requests afterwards. It's nice to see that we came to the same
> > high-level conclusion :-). But, I like the way you've phrased the
> > problem. Instead of computing a priori how much data you should push,
> > which we suggested, you start pushing an arbitrary number of things,
> > then you automatically stop pushing as soon as you get the next client
> > request.
> >
> > One more clarification: what happens when the client loads two pages
> > concurrently and the network is effectively never idle? I assume push
> > won't happen in this case?
> >
> > Next, I think you're arguing that push order doesn't matter as long as
> > you have a solution for HoB. I don't think this is exactly right.
> > Specifically:
> >
> > - Head-of-line blocking (HoB) can happen due to network-level
> > bufferbloat. The above solution only applies to kernel-level
> > bufferbloat. You need some kind of bandwidth-based pacing to avoid
> > network-level bufferbloat.
>
> That's correct.
>
> OTOH I would like to point out that the issue is not specific to push.
>
> A client issues requests in the order it notices the URLs that it
> should fetch. And it cannot update the priority of the links found in
> LRP headers until it observes how each resource is actually used.
>
> So if preload links include low-priority assets, bufferbloat can (or
> will) cause issues for both pull and push.
>
> > - If you're pushing X and Y, and you know the client will use X before
> > Y, you should push in that order. The opposite order is sub-optimal and
> > can eliminate the benefit of push in some cases, even ignoring HoB.
>
> Agreed.
>
> And my understanding is that both Apache and H2O do this, based on the
> content-type of the pushed response.
>
> Just having two (or three) levels of precedence (send before HTML vs.
> send after HTML vs. send along with HTML) is not as complex as what
> HTTP/2's prioritization tree provides, but I think it is sufficient for
> optimizing the time spent until first render.
>
> How best to prioritize the blocking assets (i.e. an asset that needs to
> be sent before the HTML, e.g. CSS) is where Apache and H2O disagree.
> My proposal (and what H2O does in that respect) is that a server should
> schedule such pushed streams outside the prioritization tree (i.e. my
> response to rule #2).
>
> >> As a server implementor, I have always dreamt of cancelling a push
> >> after sending a PUSH_PROMISE. In case a resource we want to push
> >> exists on a dedicated cache that cannot be reached synchronously from
> >> the HTTP/2 server, the server needs to send the PUSH_PROMISE without
> >> the guarantee that it will be able to push a valid response.
> >>
> >> It would be great if we could have an error code that can be sent
> >> using RST_STREAM to notify the client that it should discard the
> >> PUSH_PROMISE being sent, and issue the request by itself.
> >
> > Yes, +1. I've wanted this feature. It sucks that the client won't
> > reissue the requests if they get an RST_STREAM. (At least, Chrome won't
> > do this; I don't know about other browsers.)
>
> --
> Kazuho Oku
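For reference, the RST_STREAM wire layout (RFC 7540, Section 6.4) already has room for such a code. The sketch below serializes the frame; REFUSED_PUSH is a hypothetical value, since HTTP/2 currently defines no error code that tells the client to discard a promise and re-request the resource itself:

```c
#include <stddef.h>
#include <stdint.h>

#define H2_FRAME_RST_STREAM 0x03
#define REFUSED_PUSH 0xff /* hypothetical error code, not in RFC 7540 */

/* Sketch: serialize an RST_STREAM frame for a promised stream. Layout per
 * RFC 7540: 24-bit payload length, 8-bit type, 8-bit flags, 31-bit stream
 * id, then a 4-byte error code as the payload.
 * Usage: build_rst_stream(buf, promised_stream_id, REFUSED_PUSH); */
static size_t build_rst_stream(uint8_t out[13], uint32_t promised_stream_id,
                               uint32_t error_code)
{
    out[0] = 0; out[1] = 0; out[2] = 4;         /* payload length = 4 */
    out[3] = H2_FRAME_RST_STREAM;               /* frame type */
    out[4] = 0;                                 /* no flags defined */
    out[5] = (promised_stream_id >> 24) & 0x7f; /* reserved bit cleared */
    out[6] = (promised_stream_id >> 16) & 0xff;
    out[7] = (promised_stream_id >> 8) & 0xff;
    out[8] = promised_stream_id & 0xff;
    out[9]  = (error_code >> 24) & 0xff;        /* 32-bit error code */
    out[10] = (error_code >> 16) & 0xff;
    out[11] = (error_code >> 8) & 0xff;
    out[12] = error_code & 0xff;
    return 13;
}
```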
Received on Sunday, 4 December 2016 12:46:10 UTC