Re: Experiences with HTTP/2 server push

On Thu, Sep 1, 2016 at 9:38 AM, Tom Bergan <tombergan@chromium.org> wrote:

> Thanks for the feedback and link to that workshop talk! A few comments
> inline.
>
> On Wed, Aug 31, 2016 at 9:57 PM, Kazuho Oku <kazuhooku@gmail.com> wrote:
>>
>> Consider the case where a large HTML file that loads a CSS file is sent
>> over the wire. In a typical implementation, the server will pass a block
>> of HTML much larger than INITCWND to the TCP stack before it sees the
>> request for the CSS. So the client would need to wait multiple RTTs
>> before it starts receiving the CSS.
>>
>
> Unrelated to your above comment -- I think servers should use a higher
> initcwnd with H2, and I know that some servers do this. The experiments in
> our doc used Linux's default initcwnd (10 packets). If you compare that to
> H1, where browsers use 6 concurrent connections, the effective initcwnd for
> H1 is 60 packets (well, not exactly, since the browser only makes one
> request initially, but as soon as the browser starts making additional
> requests, cwnd effectively grows much faster than it would with a single
> connection).
>
>
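
To put rough numbers on that (assuming a typical 1460-byte MSS): a single H2
connection with initcwnd=10 can get about 10 * 1460 ~= 14.6 KB onto the wire
in the first round trip, while six H1 connections with initcwnd=10 each can
get roughly 60 * 1460 ~= 87 KB out in the idealized case. That factor of ~6
in the first RTT is a big part of why a higher initcwnd for H2 seems
reasonable to me.
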
>> That said, as discussed at the workshop, it is possible to implement an
>> HTTP/2 server that is not affected by HoB between the different streams
>> (see https://github.com/HTTPWorkshop/workshop2016/blob/master/talks/tcpprog.pdf).
>>
>> I would suggest that, regardless of whether or not push is used, server
>> implementors consider adopting such an approach to minimize the impact
>> of HoB.
>>
>
> This is really interesting. To summarize: the idea is to use getsockopt to
> compute the number of available bytes in cwnd so that sizeof(kernel buffer)
> = cwnd. I rejected this idea without thinking about it much because it
> seemed like it would increase kernel/user round-trips and perform poorly in
> bursty conditions. But, your idea to restrict this optimization to cases
> where it matters most makes sense. Do you have performance measurements of
> this idea under heavy load? Are you using TCP_NOTSENT_LOWAT for cases where
> the optimization cannot be used?
>
>
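
For anyone following along, here is roughly what I understand the approach to
be, as a sketch in C. It is Linux-specific, error handling is minimal, and the
function names and the SIOCOUTQ/TCP_INFO combination are my guess at one
reasonable way to do it, not necessarily what H2O actually does:

  #include <netinet/in.h>
  #include <netinet/tcp.h>
  #include <linux/sockios.h>   /* SIOCOUTQ */
  #include <sys/ioctl.h>
  #include <sys/socket.h>
  #include <sys/types.h>

  /* How many more bytes can we hand to the kernel right now without the
   * send queue holding more than roughly one cwnd of data? */
  static ssize_t writable_bytes(int fd)
  {
      struct tcp_info ti;
      socklen_t len = sizeof(ti);
      int queued;  /* sent-but-unacked + unsent bytes in the send queue */

      if (getsockopt(fd, IPPROTO_TCP, TCP_INFO, &ti, &len) != 0)
          return -1;
      if (ioctl(fd, SIOCOUTQ, &queued) != 0)
          return -1;

      size_t cwnd_bytes = (size_t)ti.tcpi_snd_cwnd * ti.tcpi_snd_mss;
      if ((size_t)queued >= cwnd_bytes)
          return 0;  /* a full cwnd is already queued; wait for ACKs */
      return (ssize_t)(cwnd_bytes - (size_t)queued);
  }

  /* Where polling like that is too expensive, TCP_NOTSENT_LOWAT
   * (Linux 3.12+) at least caps the *unsent* part of the buffer. */
  static int cap_unsent(int fd, int bytes)
  {
      return setsockopt(fd, IPPROTO_TCP, TCP_NOTSENT_LOWAT,
                        &bytes, sizeof(bytes));
  }

The server would then call writable_bytes() before generating each batch of
DATA frames, so a newly-arrived request for the CSS can jump ahead of HTML
bytes that have not yet been handed to the kernel.
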
>> It should also be noted that with QUIC such HoB would not be an issue
>> since there would no longer be a send buffer within the kernel.
>>
>
> Yep, this is definitely an advantage of QUIC.
>
> "Rule 2: Push Resources in the Right Order"
>>
>> My take is that the issue can / should be solved by clients sending
>> PRIORITY frames for pushed resources when they observe how the
>> resources are used, and that until then servers should schedule the
>> pushed streams separately from the client-driven prioritization tree
>> (built by using the PRIORITY frames).
>>
>> Please refer to the discussion in the other thread for details:
>> https://lists.w3.org/Archives/Public/ietf-http-wg/2016JulSep/0453.html
>
>
> To make sure I understand the idea: Suppose you send HTML, then push
> resources X and Y. You will continue pushing X and Y until you get requests
> from the client, at which point you switch to serving requests made by the
> client (which may or may not include X and Y, as the client may not know
> about those resources yet, depending on what you decided to push). These
> client requests are served via the client-driven priority tree.
>
> Is that right? If so, you've basically implemented rule #1 -- the push
> lasts while the network is idle, then you switch to serving client requests
> afterwards. It's nice to see that we came to the same high-level conclusion
> :-). But I like the way you've phrased the problem. Instead of computing a
> priori how much data you should push, which is what we suggested, you start
> pushing an arbitrary number of things, and then you automatically stop
> pushing as soon as you get the next client request.
>
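
For what it's worth, here is the shape of that scheduling rule as I picture
it. This is just a sketch; the struct and function are made up for
illustration and assume the stream list is already ordered by the
client-driven priority tree:

  #include <stdbool.h>
  #include <stddef.h>

  /* Hypothetical per-stream state; a real server has its own. */
  struct stream {
      struct stream *next;
      bool client_requested;  /* answers an explicit client request */
      bool has_data;          /* a DATA frame could be written right now */
  };

  /* Client-requested streams always win; pushed streams only fill
   * bandwidth that would otherwise go idle. */
  static struct stream *pick_next(struct stream *all_streams)
  {
      /* First pass: anything in the client-driven priority tree. */
      for (struct stream *s = all_streams; s != NULL; s = s->next)
          if (s->client_requested && s->has_data)
              return s;
      /* Second pass: pushed streams, only when the tree above is idle. */
      for (struct stream *s = all_streams; s != NULL; s = s->next)
          if (!s->client_requested && s->has_data)
              return s;
      return NULL;  /* connection is idle */
  }
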

On second thought: this doesn't solve the full problem. What if you push X,
but the client starts requesting other resources because it doesn't know
about X yet? e.g., this might happen if X is a script loaded via
document.write and the browser hasn't evaluated that code yet. The priority
tree has a default position for X (RFC 7540, Section 5.3.5), and the server
cannot know if the client is happy with this default priority for X or if
the client hasn't corrected that priority because it doesn't know about X
yet.
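
Relatedly, here is what I imagine the client side of Kazuho's suggestion
looks like, using nghttp2 purely as an example API. The function name, the
weight, and the choice of parent are made up; a browser would plug in
whatever its loader decides once it actually discovers X:

  #include <nghttp2/nghttp2.h>

  /* Once the client discovers that the pushed stream is, say, a
   * render-blocking script, move it out of the default position from
   * RFC 7540, Section 5.3.5 and give it a high weight under the stream
   * that needs it. */
  static int reprioritize_pushed_stream(nghttp2_session *session,
                                        int32_t pushed_stream_id,
                                        int32_t parent_stream_id)
  {
      nghttp2_priority_spec spec;

      nghttp2_priority_spec_init(&spec, parent_stream_id, 256,
                                 0 /* not exclusive */);
      return nghttp2_submit_priority(session, NGHTTP2_FLAG_NONE,
                                     pushed_stream_id, &spec);
  }

Until a frame like that arrives, the server is stuck guessing, which is
exactly the ambiguity above: it can't distinguish "the client is fine with
the default" from "the client hasn't seen X yet".
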


> One more clarification: what happens when the client loads two pages
> concurrently and the network is effectively never idle? I assume push won't
> happen in this case?
>
> Next, I think you're arguing that push order doesn't matter as long as you
> have a solution for HoB. I don't think this is exactly right. Specifically:
>
> - Head-of-line blocking (HoB) can happen due to network-level bufferbloat.
> The above solution only applies to kernel-level bufferbloat. You need some
> kind of bandwidth-based pacing to avoid network-level bufferbloat.
>
> - If you're pushing X and Y, and you know the client will use X before Y,
> you should push in that order. The opposite order is sub-optimal and can
> eliminate the benefit of push in some cases, even ignoring HoB.
>
>> As a server implementor, I have always dreamt of cancelling a push
>> after sending a PUSH_PROMISE. In case a resource we want to push
>> exists on a dedicated cache that cannot be reached synchronously from
>> the HTTP/2 server, the server needs to send the PUSH_PROMISE without
>> any guarantee that it will be able to push a valid response.
>>
>> It would be great if we could have an error code that can be sent
>> using RST_STREAM to notify the client that it should discard the
>> PUSH_PROMISE it received and issue the request by itself.
>>
>
> Yes, +1. I've wanted this feature. It sucks that the client won't reissue
> the request if it gets a RST_STREAM. (At least, Chrome won't; I don't know
> about other browsers.)
>
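
For reference, this is roughly what the dance looks like today, again using
nghttp2 just as an example API. nva/nvlen is the promised request's header
block (:method, :scheme, :authority, :path), and the error code is simply
the best generic one available, since nothing currently means "please
request this yourself":

  #include <nghttp2/nghttp2.h>
  #include <stddef.h>

  /* Send the PUSH_PROMISE optimistically, before the backend/cache lookup.
   * Returns the promised (even) stream ID, or a negative nghttp2 error. */
  static int32_t promise_resource(nghttp2_session *session,
                                  int32_t parent_stream_id,
                                  const nghttp2_nv *nva, size_t nvlen)
  {
      return nghttp2_submit_push_promise(session, NGHTTP2_FLAG_NONE,
                                         parent_stream_id, nva, nvlen, NULL);
  }

  /* If the lookup fails, all we can do is reset the promised stream. As
   * noted above, clients (Chrome at least) will not re-request the
   * resource on their own after seeing this. */
  static int cancel_promise(nghttp2_session *session,
                            int32_t promised_stream_id)
  {
      return nghttp2_submit_rst_stream(session, NGHTTP2_FLAG_NONE,
                                       promised_stream_id, NGHTTP2_CANCEL);
  }
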

Received on Thursday, 1 September 2016 17:26:40 UTC