FF Priority experiment from Greg Wilkins on 2015-01-08 (ietf-http-wg@w3.org from January to March 2015)

From: Greg Wilkins <gregw@intalio.com>
Date: Thu, 8 Jan 2015 17:43:10 +0100
To: HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <CAH_y2NF=jhWDc8+aJW_JXEcV_GVRg=Nzx_rwr=qZpXjnA6oxoA@mail.gmail.com>
Patrick,

As a h2 server developer, I am definitely reading your blog
http://bitsup.blogspot.it/2015/01/http2-dependency-priorities-in-firefox.html

I thought I'd post some feedback here as even if it is too late for this
draft I think it is good to capture the server side view of your
experiment.   Sorry this is a bit long.... but hard stuff to summarise.

If I have understood your blog correctly, your strategy is to create 5
streams with priority frames that are not used for request/response
streams, but are instead used as kind of fixed reference points against
which you can set the relative priorities of all other streams on the
connection.       I think that given the design of the priority mechanism
that it is a sensible approach to try and looks like it will greatly
simplify priority handling.

However,  I'd like to use your proposal to illustrate that the priority
mechanism does not so great from the server side.

Essentially this approach is converting h2 relative priorities into
absolute priorities.  But instead of being absolute for all browsers and
all connections, these 5 absolute priorities will be defined for each and
every connection from a FF-37 browser.    Let's assume that all other
browsers adopt a similar but varied approach, then for every connection
accepted by the server, it can expect to have approximately 5 streams
created to be held for the entire life span of the connection, simply to
define it's own absolute priorities.

For a large server, that maybe 50,000, or 500,000 or even 5,000,000 complex
stream objects created and held for some time, just to mimic a simple enum
of absolute priority.  Worse still, since they are held for a while, in
java they will migrate from young generation, to old generation, so garbage
collection of these objects is going to be problematic, requiring full stop
the world GC's rather than any no-pause incremental approach.

If these 5 fixed priority points were defined in the protocol, then none of
these objects and the frames to create them would be needed.   The wasted
effort is such that we will probably have to look at some way of creating
shared fake streams so that 5 objects can be shared by all connections.
Problem then is that we then need to come up with browser specific mappings
for FF-37, FF-38, IE-xx, CR-yy etc.

In short if it is simplest for the browser to reason about priority with
several absolute priority points, why do we have a relative priority
mechanism in the protocol?  Why not just define 5, 10 or 256 priorities and
save the create of fake streams to fill up the servers memory.   It does
not matter that these values can't be compared between connections, because
they never are anyway!

Now after these base priorities have been established (either by fake
streams or hopefully eventually by protocol mandate), the server will
eventually see some request frames.  Let's say the first request is for a
lower priority.   Having parsed the request, should the server wait to see
if the next request is higher priority? No - server really wants to handle
it ASAP, because it's cache is hot with the request headers, so ideally we
are going to execute it immediately and in parallel  continue to parsing
and handling subsequent requests.

We are certainly not going to queue it, just so we can parse some more
frames looking for higher priority requests to handle, as that will just
fill up our memory, add latency and let our caches get cold before handling
the requests.

Now some time later, we will parse another request off the connection.
Let's say this in a high priority request.  What are we to do with the
lower priority request already being handled?  Suspend it?  block it by de
prioritising it's output?     No. The  low priority request will already
have resources allocated to it: buffers, thread, database connection, stack
space, application locks etc. and the thread handling it will have a hot
cache.    From the servers point of view, it will only harm total
throughput if it de-prioritises the request as that will waste the
resources and again let the cache get cold.    If enough low priority
requests are de-prioritised, then we could have problems with database
connection pool exhaustion etc.

More over, lets say the client has sent a batch of 20 requests in random
priority order.    The absolutely last thing the server wants to do is to
dispatch all 20 requests and let an output throttle (which is what the h2
priority mechanism is) determine which will progress.     This will result
in a single connection having 20 threads from the same client, smearing
themselves over all the servers cores and dirtying the caches of all the
cores.  They will then contend for the same locks and data structures so
false sharing and parallel slow down will result.

The ideal situation for a server is that a single thread/core can parse
handle parse handle parse handle all the requests from a single connection
in order.  This would allow a single core and hot cache to handle all the
requests without any contention or false sharing.   The other cores on the
sever should be busy handling other connections.  Of course the world is
never ideal and some requests will take longer than others to handle and
serve, so that out of order processing must be allowed to happen.   So
servers are going to need to use multiple threads to handle the connection,
but potentially in a small sub pool or with work stealing rather than mass
dispatch.    Servers will be aiming to use <=4 threads to handle a
connection, so they might be able to keep it on the same socket or CPU.
Thus of a 20 request batch, it may be that only 1 to 4 requests have
actually been parsed and their priorities known.  In such a situation, the
small high priority 20th request is rarely going to overtake the large low
priority first request unless the server has lots of spare capacity to
steal lots of work.

Perhaps instead of expecting the server to create an ordering of all
requests in a batch, the client could itself refrain from sending requests
as it parses HTML.  Instead it could collect the list itself, order them
and then send a batch of requests in priority order.  Yes I know that means
some extra latency, but if you want the server to order them then it is
going to have to assemble the list anyway and that will involve latency and
other issues.

One good feature of the 5 "fixed" node proposal is that it will allow
priorities to work better for push.   The server will hopefully soon learn
that the 20 requests are associated and thus be able to push them rather
than wait for a batch of requests.   If every connection has the same 5
fixed nodes, then the push can use them to assign priorities to the push
resources and push them in priority order.    Of course the problem is that
if the "fixed" priority nodes are different from browser to browser, then
the server will have to again look at synthetic absolute priorities and
coming up with a per browser mapping back to their "fixed" priority nodes.

Anyway, enough rambling.    My objections are not specifically at your 5
node idea.  I think it is a reasonable way to use the proposed mechanism.
Rather I see your proposed usage as a example of how the current mechanism
is not suitable for scalable servers.

Jetty is currently intending on ignoring priority.   We'd really like to
give a client it's data in priority order, but typically we have many
thousands of other clients to worry about and we are more concerned with
fair share and maximum throughput between clients rather than micro
managing the progress between requests from the same client.

cheers


















-- 
Greg Wilkins <gregw@intalio.com>  @  Webtide - *an Intalio subsidiary*
http://eclipse.org/jetty HTTP, SPDY, Websocket server and client that scales
http://www.webtide.com  advice and support for jetty and cometd.
Received on Thursday, 8 January 2015 16:51:49 UTC