http/2 & hpack protocol review from K.Morgan@iaea.org on 2014-05-05 (ietf-http-wg@w3.org from April to June 2014)

From: <K.Morgan@iaea.org>
Date: Mon, 5 May 2014 12:47:17 +0000
To: <ietf-http-wg@w3.org>
CC: <C.Brunhuber@iaea.org>
Message-ID: <0356EBBE092D394F9291DA01E8D28EC2011298855A@sem001pd.sg.iaea.org>

Below is a review of HTTP/2 and HPACK that I received on a private mailing list. With permission of the author, I am forwarding it to the WG. I am interested to know what you guys think of the author's suggestions. -keith

...

HTTP2 and the Internet of Things

Someone recently pointed me to the HTTP2 specification, specifically draft-ietf-httpbis-http2-12. My first reaction after a quick scan through it was "let me guess, this was driven entirely by browser vendors and large hosting providers". HTTP2 adds, and changes, a large number of things that have been causing problems for web hosters and browser users, but seems to completely ignore the Internet of things.

By Internet of things I don't mean the buzzword, but the staggering number of devices that use HTTP as a universal substrate for getting data from A to B.

No-one knows how many things are out there, but it's likely to be far, far higher than the number of browser users and web hosting providers (depending on who you ask and what source you use, the figure seems to be around 10-20 billion devices, so it's also significantly larger than the number of people on the planet).

These devices not only don't need stream multiplexing, flow control, prioritisation, unsolicited pushing of messages, Huffman-encoding of headers, and full orchestration and five part harmony, but don't have the resources to implement any of it. Alternatively, if they do try and implement all of the complexity introduced by HTTP 2, they'll invariably do the absolute minimum required to allow whatever browser the developers are using for testing to connect, and no more, since the developers' goal is to create a functioning Internet-of-things device and not a Google-scale web service.

In fact it's going to be impossible to create something on the level proposed by the HTTP 2 draft since we're talking about devices running on the likes of Cortex M3s, PICs, MSP430s, and ATmegas, which neither need, nor have the resources to implement, most of what's in the HTTP 2 draft. The only thing that these devices need, and can support, is a substrate for moving a blob of data from A to B. While something like this is indeed buried in HTTP 2 under layers and layers and layers and layers of complexity, screaming to get out, there doesn't seem to be any way to easily use it as such.

So it looks like HTTP 2 really needs (at least) two different profiles, one for web hosting/web browser users ("HTTP 2 is web scale!") and one for HTTP-as-a-substrate users. The latter should have (or more accurately should *not* have) multiple streams and multiplexing, flow control, priorities, reprioritisation and dependencies, mandatory payload compression, most types of header compression, and many others.

Now some of this can be faked (e.g. by setting SETTINGS_HEADER_TABLE_SIZE = 0, SETTINGS_ENABLE_PUSH = 0, SETTINGS_MAX_CONCURRENT_STREAMS = 1, SETTINGS_INITIAL_WINDOW_SIZE = 2^31-1, SETTINGS_COMPRESS_DATA = 0), but a lot of it can't, and in any case it shouldn't be necessary to change a whole pile of parameters just to get a basic HTTP service.
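As a rough illustration of the parameter pile just listed, here is a minimal Python sketch that records those five values by name and reports where a peer's settings differ. The names `SUBSTRATE_SETTINGS` and `deviations` are hypothetical, and the wire identifiers are left out on purpose since they are draft-specific.

```python
# Hypothetical table of the SETTINGS values listed above, keyed by name.
# Wire identifiers and frame encoding are deliberately omitted (draft-specific).
SUBSTRATE_SETTINGS = {
    "SETTINGS_HEADER_TABLE_SIZE": 0,            # no dynamic header table
    "SETTINGS_ENABLE_PUSH": 0,                  # no server push
    "SETTINGS_MAX_CONCURRENT_STREAMS": 1,       # a single stream at a time
    "SETTINGS_INITIAL_WINDOW_SIZE": 2**31 - 1,  # largest window, flow control never bites
    "SETTINGS_COMPRESS_DATA": 0,                # no compressed DATA frames
}


def deviations(peer_settings):
    """Return the sorted names whose value differs from the table above.

    A setting that the peer did not send at all counts as a deviation.
    """
    return sorted(
        name
        for name, wanted in SUBSTRATE_SETTINGS.items()
        if peer_settings.get(name) != wanted
    )
```

Even in this toy form the point stands: five separate knobs have to be set correctly before the connection behaves like a plain data pipe.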

What would be needed for the Internet-of-things profile is:

* A single stream, identifier = 1. HTTP 2 reserves stream 0 for control
  information which is why stream 1 is used, but for this profile it's assumed
  that, functionally, stream 1 == stream 0, so a stream error or explicit
  close (HEADERS flag END_STREAM or RST_STREAM frame) will close all streams.

* The only allowed frames are HEADERS and DATA, and specifically a single
  HEADERS frame followed by one or more DATA frames, with no trailer frames.
  There are no special-case frames like PING (man, this is reinventing IP over
  HTTP), PUSH_PROMISE, WINDOW_UPDATE (now it's reinventing TCP over HTTP),
  PRIORITY (to this end, the PRIORITY flag in a HEADERS frame should always be
  clear), GOAWAY, CONTINUATION (to this end, the END_HEADERS flag in a HEADERS
  frame should always be set), ALTSVC, BLOCKED, or KITCHENSINK.

* No requirement to support Gzip.

* No header compression apart from the use of indexing (the terminology is
  confusing here, section 4.3.1 calls the use of an index "indexing", sections
  4.3.2 and 4.3.3 call the same use of an index "without indexing", I'm going
  to assume that if an index field is present then it's called "indexed" even
  if the text says the index is "without indexing").

  In other words name:value pairs are always sent in one of three forms:

  - An index into the static table (section 4.2).

  - An index into the static table with attached value (section 4.3.2, literal
    header without indexing, indexed name).

  - The "new name" format if no appropriate table entry exists (section 4.3.2,
    literal header without indexing, new name).

  (As an aside, what's the difference between "Literal Header Field without
  Indexing", described as "A literal header field without indexing causes
  the emission of a header field without altering the header table", and
  "Literal Header Field never Indexed", described as "A literal header field
  never indexed causes the emission of a header field without altering the
  header table"?).

  Any data is sent as octets rather than Huffman encoding, i.e. the 'H' flag
  is always 0.

  (Good grief, the spec for compression is almost as long as the spec for the
  rest of HTTP 2. Apart from the unworkable complexity on limited devices and
  the fact that it looks like a -00 draft (confusion over what "indexing"
  means, duplication of 4.3.2 in 4.3.3), we're going to be patching vulns in
  implementations of this for the next twenty years).
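To make the three header forms above concrete, here is a minimal Python sketch of an encoder that emits only those forms, with the 'H' flag always 0. It is a sketch under stated assumptions, not a conforming implementation: the function names are hypothetical, and the first-octet bit patterns (a leading 1 with a 7-bit index for a fully indexed field, 0000 with a 4-bit index for a literal without indexing) are taken from later published HPACK text and may not match the draft under discussion. The prefix-integer encoding itself is the usual "N-bit prefix, then 7 bits per continuation octet" scheme.

```python
def encode_integer(value, prefix_bits, flags=0):
    """Encode a non-negative integer with an N-bit prefix.

    `flags` supplies the high bits of the first octet (the pattern bits).
    """
    limit = (1 << prefix_bits) - 1
    if value < limit:
        return bytes([flags | value])
    out = [flags | limit]
    value -= limit
    while value >= 128:
        out.append((value % 128) + 128)  # low 7 bits, continuation bit set
        value //= 128
    out.append(value)
    return bytes(out)


def encode_string(text):
    """Length-prefixed raw octets; the 'H' (Huffman) flag is always 0."""
    data = text.encode("ascii")
    return encode_integer(len(data), 7, 0x00) + data


def indexed(index):
    """Form 1: name and value both come from a static table entry."""
    if index < 1:
        raise ValueError("table indices start at 1")
    return encode_integer(index, 7, 0x80)  # assumed pattern: 1xxxxxxx


def literal_indexed_name(index, value):
    """Form 2: name from a static table entry, value sent as raw octets."""
    if index < 1:
        raise ValueError("table indices start at 1")
    return encode_integer(index, 4, 0x00) + encode_string(value)  # assumed 0000xxxx


def literal_new_name(name, value):
    """Form 3: no usable table entry, so name and value are both raw octets."""
    return bytes([0x00]) + encode_string(name) + encode_string(value)
```

A device restricted to these three forms needs no dynamic table, no reference set and no Huffman tables; the whole encoder is the handful of functions above plus a copy of the static table.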

This takes the enormous implementation complexity of HTTP 2 and profiles it for the Internet of things, allowing for the existing widespread use of HTTP-as-a-substrate, which is everything that's needed, and indeed possible, for vast numbers of devices.
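The frame rules of the profile are small enough to check in a few lines. The following Python sketch takes the rules literally (stream 1 only, a single HEADERS frame with END_HEADERS set and PRIORITY clear, then one or more DATA frames, nothing else). The function name `profile_violation` and the `(type, stream_id, flags)` tuple representation are hypothetical, chosen only for illustration; a real device would check the same conditions while parsing frame headers off the wire.

```python
ALLOWED_TYPES = {"HEADERS", "DATA"}


def profile_violation(frames):
    """Return a description of the first rule broken, or None if all hold.

    `frames` is a sequence of (type, stream_id, flags) tuples, where `type`
    is a frame type name and `flags` is a set of flag names.
    """
    frames = list(frames)
    if not frames:
        return "no frames at all"
    for position, (frame_type, stream_id, flags) in enumerate(frames):
        if frame_type not in ALLOWED_TYPES:
            return "frame type %s is not allowed" % frame_type
        if stream_id != 1:
            return "only stream 1 is allowed"
        if position == 0:
            if frame_type != "HEADERS":
                return "first frame must be HEADERS"
            if "END_HEADERS" not in flags:
                return "END_HEADERS must be set (no CONTINUATION)"
            if "PRIORITY" in flags:
                return "PRIORITY flag must be clear"
        elif frame_type != "DATA":
            return "only DATA may follow HEADERS (no trailers)"
    if len(frames) < 2:
        return "HEADERS must be followed by at least one DATA frame"
    return None
```

For example, `profile_violation([("HEADERS", 1, {"END_HEADERS"}), ("DATA", 1, {"END_STREAM"})])` returns `None`, while any PING, WINDOW_UPDATE or second HEADERS frame is reported as a violation.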

As an aside, how much of what's in the HTTP 2 draft is based on empirical data? I know that the design of HTTP 1.1 was based on a large number of conference papers and other publications spread over many years that looked at HTTP issues and evaluated performance, but it's not so clear that HTTP 2 has had the same amount of evaluation. In particular it's implementing stacked TCP, which is never a good idea (see "Why TCP Over TCP Is A Bad Idea", http://sites.inka.de/~W1011/devel/tcp-tcp.html, or the discussion of the "SSH Channel Handbrake" starting at http://www.ietf.org/mail-archive/web/tls/current/msg03363.html), and then adding even more complexity like prioritisation and dependencies and weighting and reprioritisation. Are the things in HTTP 2 based on any hard data - and by this I mean something more than just 'we measured SPDY on Google's servers and it was better' - or are they, if you'll excuse the expression, just a bunch of Google engineers hacking around?

Received on Monday, 5 May 2014 12:47:50 UTC