- From: Chad Austin <caustin@gmail.com>
- Date: Sun, 2 Nov 2014 22:19:37 -0800
- To: HTTP Working Group <ietf-http-wg@w3.org>
- Message-ID: <CA+dRvWiMXJszfq3DB=3Hz_0b0x2rrjd-ePxa62vh27=uH4JwfQ@mail.gmail.com>
Hi! Sorry if you're tired of hearing about priority. :) Having spent some time working out the pros, cons, and implications of the tree-based priority model, I have serious concerns that shipping HTTP/2 priority as it stands would be harmful. In this email I will enumerate the disadvantages of the HTTP/2 priority model that I see and propose a couple options. *# The Advantage of a Stream Dependency Model for Prioritization* There is an advantage to using a DAG to specify priority: a DAG is a natural and direct way to specify a partial ordering and does not require the selection of arbitrary numeric values. (e.g. HTML at priority 100, CSS at priority 80, images at priority 20, etc.) *# The Problems with HTTP/2 Priority* *1. A tree cannot express some common priority schedules* While the original SPDY/4 priority proposal [ https://groups.google.com/forum/#!topic/spdy-dev/-d9Auoun4HU/discussion ] was a DAG, what got adopted in HTTP/2 is a tree. A tree is not sufficient to express some common priority schedules, like "all CSS and JS before any images". In HTTP/2, the only way to express "All JS/CSS before all images" is a linked list [ http://lists.w3.org/Archives/Public/ietf-http-wg/2014OctDec/0164.html ], which has the side effect of defining a strict ordering within the JS/CSS group, implying that individual JS and CSS responses should be serialized relative to each other. For JS and CSS, finishing one response before beginning the next is probably the right call* but there are asset types where parallel transfer is beneficial (e.g. progressive images, VIPM 3D meshes). Thus, a linked list is less expressive than numeric priorities here. * though Ilya Grigorik argued for parallel high-priority resource transfer here: http://lists.w3.org/Archives/Public/public-whatwg-archive/2014Oct/0180.html I suppose the argument is that, given an unknown distribution of response sizes (e.g. 100 KB JS, 1 KB CSS), it's better to transfer in parallel so that small responses reliably complete before larger responses, avoiding the scenario where a small CSS file is blocked by a much larger JS file. *_Prebuttal: Weights do not solve for prioritization_* A common misconception is that weights can be used to fine-tune priorities. In fact, the WIP Firefox implementation of HTTP/2 simply maps Firefox's internal numeric priority values to HTTP/2 weights. However, weights and priorities are different concepts, and an accurate and complete server-side implementation of HTTP/2 as drafted would allocate some resources to low-weighted streams if they're at the same priority as high-weighted streams. Recently someone pushed this argument further, and said that if HTML/JS/CSS had weight 256 and images had weight 1, then negligible bandwidth is allocated to images until HTML/JS/CSS are complete... for a small number of images, that's true. But, then, let's say you had high-priority images and low-priority images? You'd have to give HTML/JS/CSS weight=256, medium-priority images weight=16, and low-priority images weight=1. (256/16 = 16/1). If there are 16 times as many medium-priority images as HTML/CSS/JS, fully half of the bandwidth will be allocated to the medium-priority images as the high-priority HTML/JS/CSS. My point is that using weights as priorities doesn't really work for anything but the simplest scenarios. *_Prebuttal: "Well, you can use a custom prioritization scheme with an HTTP/2 protocol extension"_* But, then, what's the point of shipping a standard if it requires, in some common cases, a protocol extension? Ideally applications, browsers, akamai, nginx, varnish, etc. would all interoperate and requiring a protocol extension makes that a lot harder. This working group is the appropriate place to solve for priority, and now is the appropriate time. *2. Reprioritization is O(depth) and thus can be O(n)* In HTTP/2, reprioritization is O(depth(new_parent) - depth(stream)) due to the cycle check. Thus, reprioritizing N nodes in a linked list is O(N^2). While not the worst problem in the world, it's another subtle bit of complexity that must be managed and protected against denial of service. *3. I'm not sure HTTP/2 even solves the proxy use case* When I asked why HTTP/2 specifies priority with tree dependencies, the answer I was given was that it allows fairly multiplexing incoming HTTP/2 connections across a single backend HTTP/2 connection. But does it? [ http://lists.w3.org/Archives/Public/ietf-http-wg/2014OctDec/0309.html ] Since HTTP/2 does not support dummy nodes in the priority tree (and I'll explain later why that wouldn't even help), multiple top-level nodes in each inbound stream would have to become a top-level weighted stream on the outbound connection. The weights ought to be allocated fairly, which means any *additional* inbound stream has to fairly reweight other streams from that connection. Visually: Inbound connection A: {A1, A2, A3} -> 0 Inbound connection B: {B1, B2} -> 0 Outbound connection: {A1[w=1/6],A2[w=1/6],A3[w=1/6],B1[w=1/4],B2[w=1/4]} -> 0 Now consider what happens when a new stream arrives from A. Outbound A1, A2, and A3 have to be reweighted. Now consider what happens when B1 completes. The proxy would have to notice that B1 is done, and B2 should be reweighted to 1/2. In the meantime, A is given a larger allocation of backend resources. I question whether stream dependencies is a realistic solution to the proxy use case. I suspect a more direct protocol for specifying stream groups and weights would be simpler, more accurate, and more expressive. In the *very least*, I would love somebody on this list to explain in detail how the proxy use case is supposed to work. I don't see it yet. Osama Mazahir's explicit stream groups proposal is a more direct solution. *4. There's no production implementation experience with stream dependency priorities* I think the issues I've enumerated above would be discovered when trying to implement and profile HTTP/2 across browsers, CDNs, proxies, and so on. But if HTTP/2 is destined to be a widespread standard, it might be too late by then. Was the theory that SPDY/4 would be the testing ground for these ideas, but instead they got adopted in a limited form into HTTP/2? Hard to say - I haven't heard anything from Googlers since I started looking into this. *# What now?* I don't think HTTP/2 priority should ship as it is currently drafted. What are the other options? *1. Ship HTTP/2 without priority and evolve priority as an extension.* I'm not a fan of this option because priority is critically important. Multiplexed streams without priority are slower than HTTP across N connections. *2. Replace the tree model with a DAG (multiple parents)* A DAG, that is, allowing streams to have multiple parents, would solve the "all HTML/JS/CSS in parallel, then all images in parallel" use case. However, allowing multiple parents introduces a great deal of complexity and chattiness into the protocol, including a potentially quadratic number of bytes sent over the connection. I don't recommend this path forward. *3. Write a document describing how all of these use cases are supposed to work* Maybe I'm missing some key detail and my analysis above is completely wrong. If so, I'd love to see something that precisely describes how these use cases are supposed to work. How are browsers supposed to initiate stream requests given existing pageload prioritization algorithms? How exactly are proxies supposed to work? 4. *Adopt Osama Mazahir's proposal* In February, Osama Mazahir proposed an implementation of HTTP/2 priorities that solves all of the use cases above. http://lists.w3.org/Archives/Public/ietf-http-wg/2014JanMar/0396.html For a reason I haven't yet heard, it wasn't considered obviously better than stream dependencies, and it lost in a coin flip in London: http://msopentech.com/blog/2014/03/19/http2-nearing-completion/ I've heard one concern about numeric priorities: switching browser tabs ought to reprioritize all existing requests. This would cause N reprioritization requests to be sent, where N is the number of active requests. However, I think that concern may be minor. For almost all realistic values of N, the frame(s?) to reprioritize N active streams could fit in a single Ethernet MTU. My preferred solution to fixing HTTP/2 priority is to adopt Osama's Mazahir's proposal. It's simple, direct, and browsers and SPDY currently use numeric priorities anyway. *# Why Priority is Critical* Page load optimization is hitting diminishing returns. Bandwidth is going up, but connection latency is not really going down over time. Priority is a huge lever for improving page load experience by reducing round-trips and making full use of the network pipe. Getting priority right is critical -- HTTP/2 will rapidly become one of the most popular protocols in the world, and getting priority right has enormous potential upside in latency and efficiency. I just don't see how the tree dependency model meets even current use cases, which makes me think it's a bad idea to ship a protocol based on it. Thanks for reading, Chad -- Chad Austin Technical Director, IMVU http://chadaustin.me
Received on Monday, 3 November 2014 06:20:08 UTC