Concerns about HTTP/2 Priority from Chad Austin on 2014-11-03 (ietf-http-wg@w3.org from October to December 2014)

From: Chad Austin <caustin@gmail.com>
Date: Sun, 2 Nov 2014 22:19:37 -0800
To: HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <CA+dRvWiMXJszfq3DB=3Hz_0b0x2rrjd-ePxa62vh27=uH4JwfQ@mail.gmail.com>

Hi!

Sorry if you're tired of hearing about priority.  :)  Having spent some
time working out the pros, cons, and implications of the tree-based
priority model, I have serious concerns that shipping HTTP/2 priority as it
stands would be harmful.

In this email I will enumerate the disadvantages of the HTTP/2 priority
model that I see and propose a couple of options.

*# The Advantage of a Stream Dependency Model for Prioritization*

There is an advantage to using a DAG to specify priority: a DAG is a
natural and direct way to specify a partial ordering and does not require
the selection of arbitrary numeric values.  (e.g. HTML at priority 100, CSS
at priority 80, images at priority 20, etc.)

*# The Problems with HTTP/2 Priority*

*1. A tree cannot express some common priority schedules*

While the original SPDY/4 priority proposal [
https://groups.google.com/forum/#!topic/spdy-dev/-d9Auoun4HU/discussion ]
was a DAG, what got adopted in HTTP/2 is a tree.  A tree is not sufficient
to express some common priority schedules, like "all CSS and JS before any
images".

In HTTP/2, the only way to express "All JS/CSS before all images" is a
linked list [
http://lists.w3.org/Archives/Public/ietf-http-wg/2014OctDec/0164.html ],
which has the side effect of defining a strict ordering within the JS/CSS
group, implying that individual JS and CSS responses should be serialized
relative to each other.  For JS and CSS, finishing one response before
beginning the next is probably the right call* but there are asset types
where parallel transfer is beneficial (e.g. progressive images, VIPM 3D
meshes).  Thus, a linked list is less expressive than numeric priorities
here.
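To illustrate the side effect, here is a minimal Python sketch of a single-parent dependency tree.  The file names, the `chain_parents` and `depends_on` helpers, and the use of 0 for the root are all illustrative assumptions, not anything defined by the HTTP/2 draft.

```python
def chain_parents(stream_ids):
    """Return {stream: parent}, each stream depending on the previous one.

    The first stream depends on the root, represented here as 0.
    """
    parents = {}
    prev = 0
    for stream in stream_ids:
        parents[stream] = prev
        prev = stream
    return parents


def depends_on(parents, stream, ancestor):
    """True if `ancestor` appears on the path from `stream` up to the root."""
    node = parents.get(stream, 0)
    while node != 0:
        if node == ancestor:
            return True
        node = parents.get(node, 0)
    return False


# Hypothetical page: two critical resources and two images.
critical = ["app.js", "site.css"]
images = ["hero.jpg", "logo.png"]

parents = chain_parents(critical + images)
# Images may share a parent, so they can still be siblings of each other.
parents["logo.png"] = "site.css"

# Intended: every image waits for every critical resource.
images_wait = all(depends_on(parents, img, res)
                  for img in images for res in critical)

# Unintended: the critical resources are also ordered among themselves.
css_waits_for_js = depends_on(parents, "site.css", "app.js")

print(images_wait, css_waits_for_js)
```

With only one parent per stream, the only way to make both images wait for both critical resources is to put `site.css` below `app.js`, which is exactly the serialization described above.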

* though Ilya Grigorik argued for parallel high-priority resource transfer
here:
http://lists.w3.org/Archives/Public/public-whatwg-archive/2014Oct/0180.html
I suppose the argument is that, given an unknown distribution of response
sizes (e.g. 100 KB JS, 1 KB CSS), it's better to transfer in parallel so
that small responses reliably complete before larger responses, avoiding
the scenario where a small CSS file is blocked by a much larger JS file.

*_Prebuttal: Weights do not solve for prioritization_*

A common misconception is that weights can be used to fine-tune
priorities.  In fact, the WIP Firefox implementation of HTTP/2 simply maps
Firefox's internal numeric priority values to HTTP/2 weights.  However,
weights and priorities are different concepts, and an accurate and complete
server-side implementation of HTTP/2 as drafted would allocate some
resources to low-weighted streams if they're at the same priority as
high-weighted streams.

Recently someone pushed this argument further, and said that if HTML/JS/CSS
had weight 256 and images had weight 1, then negligible bandwidth is
allocated to images until HTML/JS/CSS are complete...  for a small number
of images, that's true.  But, then, let's say you had high-priority images
and low-priority images?  You'd have to give HTML/JS/CSS weight=256,
medium-priority images weight=16, and low-priority images weight=1.
(256/16 = 16/1).  But if there are 16 times as many medium-priority images
as HTML/JS/CSS streams, the medium-priority images collectively receive as
much bandwidth as the high-priority HTML/JS/CSS: fully half of the total.

My point is that using weights as priorities doesn't really work for
anything but the simplest scenarios.
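The arithmetic can be checked with a short Python sketch.  It assumes all streams are siblings and that the server divides bandwidth strictly in proportion to weight; the `bandwidth_shares` function and the stream counts are made up for illustration.

```python
from fractions import Fraction


def bandwidth_shares(groups):
    """Map each group label to its share of the total bandwidth.

    `groups` maps a label to (stream_count, weight_per_stream).  Assumes
    every stream is a sibling and bandwidth is split in proportion to weight.
    """
    total = sum(count * weight for count, weight in groups.values())
    return {label: Fraction(count * weight, total)
            for label, (count, weight) in groups.items()}


# Hypothetical page: 2 HTML/JS/CSS streams at weight 256 and
# 16 times as many medium-priority images (32) at weight 16.
shares = bandwidth_shares({
    "html_js_css": (2, 256),
    "medium_images": (32, 16),
})

for label, share in shares.items():
    print(label, share)
```

Both groups end up with the same aggregate weight (2 * 256 = 32 * 16 = 512), so the supposedly lower-priority images take half of the bandwidth.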

*_Prebuttal: "Well, you can use a custom prioritization scheme with an
HTTP/2 protocol extension"_*

But, then, what's the point of shipping a standard if it requires, in some
common cases, a protocol extension?  Ideally applications, browsers,
Akamai, nginx, Varnish, etc. would all interoperate, and requiring a
protocol extension makes that a lot harder.  This working group is the
appropriate place to solve for priority, and now is the appropriate time.

*2. Reprioritization is O(depth) and thus can be O(n)*

In HTTP/2, reprioritization is O(depth(new_parent) - depth(stream)) due to
the cycle check.  Thus, reprioritizing N nodes in a linked list is O(N^2).

While not the worst problem in the world, it's another subtle bit of
complexity that must be managed and protected against denial of service.
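As a rough illustration of the cost, here is a Python sketch of the ancestor walk that a cycle check needs.  The `Stream` class, `is_ancestor`, and `build_chain` are hypothetical names, and the sketch only counts the steps of the check; it does not perform the actual reparenting.

```python
class Stream:
    def __init__(self, stream_id, parent=None):
        self.stream_id = stream_id
        self.parent = parent  # None stands for the root (stream 0)


def is_ancestor(candidate, node):
    """Walk from `node` toward the root; return (found, steps_taken)."""
    steps = 0
    while node is not None:
        if node is candidate:
            return True, steps
        node = node.parent
        steps += 1
    return False, steps


def build_chain(n):
    """Build a linked list of n streams, each depending on the previous one."""
    chain = []
    parent = None
    for stream_id in range(1, n + 1):
        stream = Stream(stream_id, parent)
        chain.append(stream)
        parent = stream
    return chain


chain = build_chain(1000)

# Moving the head of the chain under the tail: the check has to walk
# from the tail all the way up before it finds the head.
found, steps = is_ancestor(chain[0], chain[-1])

# Cost of running the same check for every stream against the tail.
total_steps = sum(is_ancestor(s, chain[-1])[1] for s in chain)

print(found, steps, total_steps)
```

A single check walks the full depth of the chain, and repeating it for each of the N streams adds up to roughly N^2/2 steps, which is where the quadratic bound comes from.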

*3. I'm not sure HTTP/2 even solves the proxy use case*

When I asked why HTTP/2 specifies priority with tree dependencies, the
answer I was given was that it allows fairly multiplexing incoming HTTP/2
connections across a single backend HTTP/2 connection.  But does it?  [
http://lists.w3.org/Archives/Public/ietf-http-wg/2014OctDec/0309.html ]

Since HTTP/2 does not support dummy nodes in the priority tree (and I'll
explain later why that wouldn't even help), each of the multiple top-level
nodes in an inbound connection would have to become a top-level weighted
stream on the outbound connection.  The weights ought to be allocated fairly,
which means any *additional* inbound stream forces the proxy to reweight the
other streams from that connection.

Visually:

Inbound connection A: {A1, A2, A3} -> 0
Inbound connection B: {B1, B2} -> 0
Outbound connection: {A1[w=1/6], A2[w=1/6], A3[w=1/6], B1[w=1/4], B2[w=1/4]} -> 0

Now consider what happens when a new stream arrives from A.  Outbound A1,
A2, and A3 have to be reweighted.  Now consider what happens when B1
completes.  The proxy would have to notice that B1 is done, and B2 should
be reweighted to 1/2.  In the meantime, A is given a larger allocation of
backend resources.
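The bookkeeping can be sketched in Python.  This assumes the proxy wants each inbound connection to get an equal share of the backend connection, split evenly among that connection's top-level streams; `outbound_weights` is a hypothetical helper, and the weights are expressed as fractions of the whole rather than as HTTP/2 wire weights.

```python
from fractions import Fraction


def outbound_weights(inbound):
    """Map each top-level stream to its fair share of the outbound connection.

    `inbound` maps a connection name to the list of its top-level streams.
    Each connection with active streams gets an equal share, divided evenly
    among its streams.
    """
    active = {conn: streams for conn, streams in inbound.items() if streams}
    if not active:
        return {}
    per_connection = Fraction(1, len(active))
    return {stream: per_connection / len(streams)
            for streams in active.values()
            for stream in streams}


before = outbound_weights({"A": ["A1", "A2", "A3"], "B": ["B1", "B2"]})

# A new stream arrives on connection A.
after_new = outbound_weights({"A": ["A1", "A2", "A3", "A4"],
                              "B": ["B1", "B2"]})

# Alternatively, B1 completes.
after_done = outbound_weights({"A": ["A1", "A2", "A3"], "B": ["B2"]})

# Existing streams whose weight must be updated when A4 arrives.
reweighted = [s for s in before if before[s] != after_new[s]]
print(reweighted, after_done["B2"])
```

Every arrival or completion on one inbound connection changes the weight of all of that connection's other streams, and each change has to be communicated to the backend before the allocation is fair again.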

I question whether stream dependencies are a realistic solution to the proxy
use case.  I suspect a more direct protocol for specifying stream groups
and weights would be simpler, more accurate, and more expressive.

At the *very least*, I would love for somebody on this list to explain in
detail how the proxy use case is supposed to work.  I don't see it yet.
Osama Mazahir's explicit stream groups proposal is a more direct solution.

*4. There's no production implementation experience with stream dependency
priorities*

I think the issues I've enumerated above would be discovered when trying to
implement and profile HTTP/2 across browsers, CDNs, proxies, and so on.
But if HTTP/2 is destined to be a widespread standard, it might be too late
by then.

Was the theory that SPDY/4 would be the testing ground for these ideas, but
instead they got adopted in a limited form into HTTP/2?  Hard to say - I
haven't heard anything from Googlers since I started looking into this.

*# What now?*

I don't think HTTP/2 priority should ship as it is currently drafted.  What
are the other options?

*1. Ship HTTP/2 without priority and evolve priority as an extension.*

I'm not a fan of this option because priority is critically important.
Multiplexed streams without priority are slower than HTTP across N
connections.

*2. Replace the tree model with a DAG (multiple parents)*

A DAG, that is, allowing streams to have multiple parents, would solve the
"all HTML/JS/CSS in parallel, then all images in parallel" use case.
However, allowing multiple parents introduces a great deal of complexity
and chattiness into the protocol, including a potentially quadratic number
of bytes sent over the connection.  I don't recommend this path forward.

*3. Write a document describing how all of these use cases are supposed to
work*

Maybe I'm missing some key detail and my analysis above is completely
wrong.  If so, I'd love to see something that precisely describes how these
use cases are supposed to work.  How are browsers supposed to initiate
stream requests given existing pageload prioritization algorithms?  How
exactly are proxies supposed to work?

*4. Adopt Osama Mazahir's proposal*

In February, Osama Mazahir proposed an implementation of HTTP/2 priorities
that solves all of the use cases above.
http://lists.w3.org/Archives/Public/ietf-http-wg/2014JanMar/0396.html  For
a reason I haven't yet heard, it wasn't considered obviously better than
stream dependencies, and it lost in a coin flip in London:
http://msopentech.com/blog/2014/03/19/http2-nearing-completion/

I've heard one concern about numeric priorities: switching browser tabs
ought to reprioritize all existing requests.  This would cause N
reprioritization requests to be sent, where N is the number of active
requests.  However, I think that concern may be minor.  For almost all
realistic values of N, the frame(s?) to reprioritize N active streams could
fit in a single Ethernet MTU.
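A back-of-the-envelope check in Python, with every number an assumption: 14 bytes per reprioritization frame (a 9-octet frame header plus a 5-octet payload, the size of a PRIORITY frame in the dependency-based design; a numeric-priority frame could be a different size), a 1500-byte Ethernet MTU, and 40 bytes of IPv4 and TCP headers with no options or TLS overhead.

```python
# All values below are assumptions for a rough estimate.
ETHERNET_MTU = 1500          # bytes
IP_TCP_HEADERS = 40          # 20-byte IPv4 header + 20-byte TCP header
FRAME_HEADER = 9             # assumed frame header size in octets
FRAME_PAYLOAD = 5            # assumed reprioritization payload in octets


def frames_per_packet(mtu=ETHERNET_MTU, overhead=IP_TCP_HEADERS,
                      frame_size=FRAME_HEADER + FRAME_PAYLOAD):
    """Number of whole reprioritization frames that fit in one packet."""
    return (mtu - overhead) // frame_size


max_streams = frames_per_packet()
print(max_streams)
```

Under these assumptions, about a hundred streams can be reprioritized in a single packet, which is more active requests than most pages have in flight at once.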

My preferred solution for fixing HTTP/2 priority is to adopt Osama Mazahir's
proposal.  It's simple, direct, and browsers and SPDY currently
use numeric priorities anyway.

*# Why Priority is Critical*

Page load optimization is hitting diminishing returns.  Bandwidth is going
up, but connection latency is not really going down over time.  Priority is
a huge lever for improving page load experience by reducing round-trips and
making full use of the network pipe.  Getting priority right is critical --
HTTP/2 will rapidly become one of the most popular protocols in the world,
and getting priority right has enormous potential upside in latency and
efficiency.  I just don't see how the tree dependency model meets even
current use cases, which makes me think it's a bad idea to ship a protocol
based on it.

Thanks for reading,
Chad

--
Chad Austin
Technical Director, IMVU
http://chadaustin.me
Received on Monday, 3 November 2014 06:20:08 UTC