Re: FF Priority experiment

Hey Greg,

Before delving into the details of your post I wanted to clarify something
that I think was generally misunderstood (the failure in communication no
doubt being mine) about the post you mentioned. My closing thoughts tried
to point out that a sender cannot implement a simplified version of
priority by globally comparing weights across streams without also
considering dependencies. It just doesn't do what you might think at first
glance. It isn't immediately obvious to a couple of implementers I've
worked with. I wasn't using that post to advocate particular server
implementations; I was just pointing out that dependency and weight aren't
separable for the receiver.
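
A rough illustration of why weights cannot be compared globally (a sketch,
not taken from the blog post: the streams A, B, C, their weights, and the
share() helper are all made up, and the model assumes a stream that has
nothing to send passes its whole allocation down to its children):

```python
from fractions import Fraction

# Hypothetical tree: stream -> (parent, weight). A parent of None is the root.
TREE = {
    "A": (None, 200),
    "B": (None, 10),
    "C": ("B", 250),  # largest weight overall, but it hangs off B
}

def share(stream, tree=TREE):
    """Fraction of the connection a stream receives, assuming its ancestors
    have nothing to send and all of its siblings are competing."""
    parent, weight = tree[stream]
    sibling_total = sum(w for (p, w) in tree.values() if p == parent)
    local = Fraction(weight, sibling_total)
    if parent is None:
        return local
    # A child can only divide up whatever its parent was allocated.
    return local * share(parent, tree)

print(share("A"), share("C"))
```

Under those assumptions C, despite its weight of 250, ends up with 1/21 of
the connection while A (weight 200) gets 20/21, because C's weight only
counts among B's children.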

On Thu, Jan 8, 2015 at 11:43 AM, Greg Wilkins <gregw@intalio.com> wrote:

>
> Essentially this approach is converting h2 relative priorities into
> absolute priorities.
>

I'm not quite sure what you mean by that. H2 expresses priority via both
dependencies and weights. The blog post uses them both.


> But instead of being absolute for all browsers and all connections, these
> 5 absolute priorities will be defined for each and every connection from a
> FF-37 browser.    Let's assume that all other browsers adopt a similar but
> varied approach, then for every connection accepted by the server, it can
> expect to have approximately 5 streams created to be held for the entire
> life span of the connection, simply to define it's own absolute priorities.
>

There isn't any particular reason that a node in a dependency tree needs
more than 5 bytes of information (plus pointers).
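
One way to read the 5 bytes figure, assuming it refers to the same fields an
h2 PRIORITY frame carries (an exclusive bit, a 31-bit stream dependency, and
an 8-bit weight sent as weight minus one). The function names below are
hypothetical and this is only a sketch of the packing, not any server's
actual node layout:

```python
import struct

def pack_priority(depends_on, weight, exclusive=False):
    """Pack dependency, weight (1..256) and the exclusive flag into 5 bytes."""
    if not 1 <= weight <= 256:
        raise ValueError("weight must be between 1 and 256")
    if not 0 <= depends_on < 2**31:
        raise ValueError("stream id must fit in 31 bits")
    word = (int(exclusive) << 31) | depends_on
    # "!IB": network byte order, 4-byte unsigned int plus 1 byte, no padding.
    return struct.pack("!IB", word, weight - 1)

def unpack_priority(data):
    """Inverse of pack_priority: returns (depends_on, weight, exclusive)."""
    word, wire_weight = struct.unpack("!IB", data)
    return word & 0x7FFFFFFF, wire_weight + 1, bool(word >> 31)

node = pack_priority(depends_on=3, weight=201)
print(len(node), unpack_priority(node))
```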


>
> In short if it is simplest for the browser to reason about priority with
> several absolute priority points, why do we have a relative priority
> mechanism in the protocol?  Why not just define 5, 10 or 256 priorities and
> save the create of fake streams to fill up the servers memory.   It does
> not matter that these values can't be compared between connections, because
> they never are anyway!
>
>
I think you've oversimplified the design. It allows the client to express
"Let group {L, F}, proceed in parallel with group U but within group {L, F}
L should always take precedence over F." Whether that's crazy-good or
crazy-bad is something time will tell - but it isn't accurate to map it to
a small set of priority points afaict.
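
That statement can be sketched as a tiny tree, here treating each group as a
single node for brevity (an assumption of this sketch, as are the PARENT
table and the eligible() helper; the rule used is that a stream may send only
when none of its ancestors has data ready):

```python
# Hypothetical tree: child -> parent, with None standing for the root.
# F depends on L; U sits beside L at the top level.
PARENT = {"L": None, "U": None, "F": "L"}

def eligible(ready, parent=PARENT):
    """Return the streams allowed to send now: those that are ready and
    have no ready ancestor in the tree."""
    allowed = set()
    for stream in ready:
        ancestor = parent[stream]
        # Walk upward past ancestors that have nothing to send.
        while ancestor is not None and ancestor not in ready:
            ancestor = parent[ancestor]
        if ancestor is None:
            allowed.add(stream)
    return allowed

print(eligible({"L", "F", "U"}))
print(eligible({"F", "U"}))
```

With L, F and U all ready, only L and U are eligible and would share
bandwidth by weight; F becomes eligible alongside U once L has nothing left
to send. A flat list of priority points has no direct way to say both things
at once.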

Anyhow, that's somewhat beside the point because the most convincing
argument for an arbitrary dependency tree is the aggregator (not my use
case from the blog) - or how to fairly combine multiple users. That's what
sold me on it as a design. I'm just trying to meaningfully apply what we
ended up with to what I know is a huge opportunity for responsiveness.

Fewer round trips in h2 will make it much more effective at utilizing
available bandwidth and we'll see those gains quickly. But the priority
mechanism is, I think, the real long-term opportunity for responsive feel, and
it's going to take multiple iterations from both servers and clients who
want to invest in that process. The nice thing is I think we built a
mechanism capable of expressing those iterations without pre-defining them
now. That will help us evolve. In the absence of the aggregation use case I
would have wanted to do that by extension, but not supporting aggregation
was just a bug that had to be solved.


> We are certainly not going to queue it, just so we can parse some more
> frames looking for higher priority requests to handle, as that will just
> fill up our memory, add latency and let our caches get cold before handling
> the requests.
>

I agree - priority is about choosing which of N things to transmit when you
have more than 1 ready to go. Nobody wants idle bandwidth. There is nothing
wrong with replying to a lower priority request before a higher one if the
higher priority response is not available. Is the text unclear on that?
Obviously, you also shouldn't over-fill your socket buffers or you won't be
able to react well when a higher priority item becomes available to
transmit. (This is a variation on the necessity of using modest frame sizes
and, more broadly, of bufferbloat(tm).)


>
> Now some time later, we will parse another request off the connection.
> Let's say this in a high priority request.  What are we to do with the
> lower priority request already being handled?  Suspend it?  block it by de
> prioritising it's output?
>

Another option is buffering its output for at least a little while. This is
what happens in H1 after all; it just happens in parallel socket buffers
there (which are eventually emptied without priority), but buffering is
buffering.

The extreme case is a low priority request for a 4GB iso followed by a
high priority request for 30KB of html. H2 defines multiplexing and
priority to deal with that and implementations ignore it at their peril.
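
A scaled-down sketch of that case (the iso is shrunk to ten frames so the
example runs instantly; the transmit() function, the numeric priorities, and
the 16KB frame size are all assumptions for illustration, not anything the
spec mandates):

```python
FRAME = 16_384  # assumed frame size in bytes, kept modest on purpose

def transmit(arrivals, frame=FRAME):
    """arrivals: list of (tick, name, priority, size_in_bytes), where a lower
    priority number is more urgent. One frame is written per tick; returns
    the stream name chosen for each frame written."""
    queue = sorted(arrivals)
    pending = {}  # name -> [priority, bytes_left]
    written = []
    tick = 0
    while queue or pending:
        # Admit every response that has become available by this tick.
        while queue and queue[0][0] <= tick:
            _, name, priority, size = queue.pop(0)
            pending[name] = [priority, size]
        if pending:
            # Re-evaluate priority before each frame, never per response.
            name = min(pending, key=lambda n: pending[n][0])
            pending[name][1] -= frame
            written.append(name)
            if pending[name][1] <= 0:
                del pending[name]
        tick += 1
    return written

log = transmit([(0, "iso", 9, 10 * FRAME), (3, "html", 1, 30 * 1024)])
print(log)
```

The iso starts immediately because nothing else is ready, so no bandwidth
sits idle. When the html shows up at tick 3 it takes the next two frames, and
the iso then resumes. The same loop with one enormous frame, or an over-full
socket buffer, could not react until the iso had finished.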


> The ideal situation for a server is that a single thread/core can parse
> handle parse handle parse handle all the requests from a single connection
> in order.
>

A server that implements that approach still probably generates output
faster than the network can absorb it, and that creates either a queue or
backpressure. The priority information lets the server know how it can
reorder the queue to give the end user the best experience. What the server
does with that is a quality of implementation item.


> Yes I know that means some extra latency, but if you want the server to
> order them then it is going to have to assemble the list anyway and that
> will involve latency and other issues.
>
>
I believe the best implementation does not add latency - it reorders the
output queue and parallelizes/prioritizes execution of the transactions to
the extent its quotas, priority information, and resources let it do so.

Received on Thursday, 8 January 2015 21:07:36 UTC