- From: Greg Wilkins <gregw@intalio.com>
- Date: Fri, 9 Jan 2015 13:07:01 +0100
- To: Patrick McManus <pmcmanus@mozilla.com>
- Cc: HTTP Working Group <ietf-http-wg@w3.org>
- Message-ID: <CAH_y2NHqGVGxiG5GhBZFRs5azcthg5PW++YucPg5A8VRS7Q2Rw@mail.gmail.com>
On 8 January 2015 at 22:07, Patrick McManus <pmcmanus@mozilla.com> wrote:

> Hey Greg,
>
> Before delving into the details of your post I wanted to clarify something
> that was I think generally misunderstood (the failure in communication no
> doubt being mine).

I was oversimplifying in my response, but yes, I understand that weights are only meaningful between siblings and not globally applicable. I was really just using your blog as an example to expand on my vague thoughts about server-side issues.

> On Thu, Jan 8, 2015 at 11:43 AM, Greg Wilkins <gregw@intalio.com> wrote:
>
>> Essentially this approach is converting h2 relative priorities into
>> absolute priorities.
>
> I'm not quite sure what you mean by that. H2 expresses priority via both
> dependencies and weights. The blog post uses them both.

I think you are describing how you will use 5 fixed nodes in the tree to hang your actual stream dependencies off. So I'm saying that those 5 nodes are essentially absolute priorities, expressed both in their weights and in their relationships to each other. This is a good thing, as it may allow a server to reverse-engineer a simple picture.

In this case, if all the streams from FF are given priorities relative to the 5 nodes, then that is something we might be able to use. We might have a simple mapping of those 5 nodes to priorities URGENT, HIGH, NORMAL, LOW and BACKGROUND, and then use that simple linear ordering when picking which streams to allow to use a limited transmit window.

The server doesn't really want a complete picture of priority. All it really wants is a simple and cheap way of comparing the priorities of the few streams that have made it through parsing and for which the application has produced some output. Relating those few streams back to 5 fixed nodes is a good way of doing that, but it would be better if those 5 nodes were fixed for all connections, not just one connection.
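To make concrete what I mean by a "simple linear ordering": a minimal sketch in Java, where the five bucket names, the `Stream` shape and the picking policy are all illustrative assumptions, not what FF or any server actually does:

```java
import java.util.*;

// Hypothetical five coarse buckets, one per fixed dependency node.
enum Bucket { URGENT, HIGH, NORMAL, LOW, BACKGROUND }

// Toy stream: its bucket is derived once, from which fixed node it hangs off.
class Stream {
    final int id;
    final Bucket bucket;
    Stream(int id, Bucket bucket) { this.id = id; this.bucket = bucket; }
}

class Picker {
    // Cheap linear choice among the few streams that actually have output
    // ready: no tree walk, just compare bucket ordinals.
    static Stream pick(List<Stream> ready) {
        Stream best = null;
        for (Stream s : ready)
            if (best == null || s.bucket.ordinal() < best.bucket.ordinal())
                best = s;
        return best;
    }
}
```

The point of the sketch is that the comparison is O(ready streams) with no per-connection tree state at all, which is about the cost budget a busy server has for this.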
> there isn't any particular reason that a node in a dependency tree needs
> more than 5 bytes of information (plus pointers).

Theoretically no, but reality is rarely very close to theory. We could store the dependency tree in a parallel data structure to the actual streams, but then we have problems with double work, atomic updates, memory barriers etc. In Java, if you want to write lock-free algorithms, it is very difficult to deal with 2 objects instead of 1. Or we could just create a fully capable stream object on the first frame. My point is that in a server, where every lock, every bad branch prediction, every map lookup and every memory barrier can be multiplied by hundreds of thousands, it is rarely just a matter of saying "oh, it is just 5 more nodes of 5 bytes each per connection".

> I think you've oversimplified the design.

On purpose :) The chances of the server being able to implement something simple are slim; something complex has got no chance.

> The extreme case is the low priority request is a 4GB iso and the
> subsequent high priority request is for 30KB of html. H2 defines
> multiplexing and priority to deal with that and implementations ignore it
> at their peril.

That's not an extreme case. I think of that as the simple case, and a server definitely has to let the 30KB response overtake the 4GB response. That is basic multiplexing and should happen regardless of priority! The extreme case is 8 x 0.5GB low-priority requests followed by a high-priority 30KB request. Currently a server is committed to 6-fold parallelism per client, as that is the de facto connection limit of browsers. The fact that h2 gives a multiplexed single connection does not mean that servers will be prepared to commit much more parallelism to an individual connection.
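To illustrate why the 4GB iso vs 30KB html case is "basic multiplexing" rather than priority: even a dumb round-robin frame interleaver lets the small response finish almost immediately. A toy Java model, where the 16KB frame size and the round-robin policy are assumptions of the sketch, not any real server's behaviour:

```java
import java.util.*;

// Toy response: just a name and a byte count still to send.
class Response {
    final String name;
    long remaining;
    Response(String name, long remaining) { this.name = name; this.remaining = remaining; }
}

class Interleaver {
    static final int FRAME = 16 * 1024; // illustrative frame size

    // Plain round-robin over active responses, one frame at a time;
    // returns names in the order the responses complete.
    static List<String> completionOrder(Deque<Response> active) {
        List<String> done = new ArrayList<>();
        while (!active.isEmpty()) {
            Response r = active.poll();
            r.remaining -= Math.min(FRAME, r.remaining);
            if (r.remaining == 0) done.add(r.name);
            else active.add(r); // back of the queue, no priority involved
        }
        return done;
    }
}
```

With a 4GB "iso" already in flight and a 30KB "html" arriving after it, the html still completes first after two frames: no priority information was needed, only fair interleaving.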
If 8 requests have been parsed and launched and are more than saturating the connection, the server is justified in using its memory and CPUs to process other connections rather than committing more resources on the chance that a high-priority request might be following. It may not even bother parsing the 9th request until progress is made on the first 8 (or until it has idle threads that might work-steal).

Basically, a client that is rendering an HTML page and firing off h2 requests for associated resources as it parses them cannot expect the server to completely unmuddle any inverted priorities in the ordering of those requests. I think the client is going to have to do at least some basic ordering if it expects a busy server to have a chance of applying priorities.

>> The ideal situation for a server is that a single thread/core can parse
>> handle parse handle parse handle all the requests from a single connection
>> in order.
>
> A server that implements that approach still probably generates output
> faster than the network can absorb it - that makes either a queue or
> backpressure. The priority information lets the server know how it can
> reorder the queue to give the end user the best experience. What the server
> does with that is a quality of implementation item.

The point I'm making here is that a scalable server is going to use back pressure, so that the queuing of requests is not necessarily all at the output stage where priorities can be applied. In a batch of 20 requests you may have 5 queued to send output, 5 still in the application determining what the content is, 5 parsed but not yet dispatched to the application, and 5 unparsed in the original IO buffer. The server does not have a smorgasbord of 20 response-ready streams to pick from in priority order. In this case it has only 5 requests to pick from, and only 15 whose priority it actually knows.
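The batch-of-20 picture above can be modelled in a few lines, which makes it obvious how small the scheduler's real choice set is. The stage names and `Req` shape are hypothetical, purely for illustration:

```java
import java.util.*;

// The four stages a request can be stuck in, per the example above.
enum Stage { UNPARSED, PARSED, IN_APP, OUTPUT_READY }

// Toy request: priority is only meaningful once the request has been parsed.
class Req {
    final int id;
    final Stage stage;
    Req(int id, Stage stage) { this.id = id; this.stage = stage; }
}

class BatchView {
    // The scheduler can only apply priority among responses that are
    // actually queued for output; everything upstream is invisible to it.
    static List<Req> candidates(List<Req> batch) {
        List<Req> out = new ArrayList<>();
        for (Req r : batch)
            if (r.stage == Stage.OUTPUT_READY) out.add(r);
        return out;
    }

    // Requests whose priority the server even knows: anything past parsing.
    static int knownPriorities(List<Req> batch) {
        int n = 0;
        for (Req r : batch)
            if (r.stage != Stage.UNPARSED) n++;
        return n;
    }
}
```

For a batch of 20 with 5 requests in each stage, `candidates` returns 5 and `knownPriorities` returns 15, which is the whole point: the fine-grained tree describes 20 streams, but the server can act on 5.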
The unknown 5 requests are going to have to wait, regardless of their priority, for other requests to complete, or hope the server implements work stealing when idle. It is pretty much the same for any high-priority requests among the 5 queued for the application.

Basically, clients can't just throw batches of requests at the server in arbitrary priority order and expect the server to be able to unscramble the order just because they also gave a detailed, high-resolution picture of the relative priorities/weights of those requests. Throttling low-priority requests may be counter-productive, as they end up starving high-priority requests of scarce resources, of which bandwidth is only one.

cheers

--
Greg Wilkins <gregw@intalio.com> @ Webtide - *an Intalio subsidiary*
http://eclipse.org/jetty HTTP, SPDY, Websocket server and client that scales
http://www.webtide.com advice and support for jetty and cometd.
Received on Friday, 9 January 2015 12:07:30 UTC