Re: http/2 & hpack protocol review from James M Snell on 2014-05-05 (ietf-http-wg@w3.org from April to June 2014)

From: James M Snell <jasnell@gmail.com>
Date: Mon, 5 May 2014 08:26:37 -0700
To: K.Morgan@iaea.org
Cc: "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>, C.Brunhuber@iaea.org
Message-ID: <CABP7RbfdW=BfLLN3eE-Vak5J88eDRtip3E+t1VWt_UT209sFFQ@mail.gmail.com>
First and foremost, it needs to be recognized that HTTP/2 has been
designed from the start to primarily meet the needs of a very specific
grouping of high volume web properties and browser implementations.
There is very little evidence that ubiquitous use of the protocol is
even a secondary consideration -- in fact, the "they can just keep
using HTTP/1.1" mantra has been repeated quite often throughout many
of the discussions on this list, usually as a way of brushing aside
many of the concerns that have been raised. So be it. It's clear at
this point that HTTP/2 is on a specific fixed path forward and that,
for the kinds of use cases required by IoT, alternatives will need to
be pursued.

FWIW, I am strongly -1 on the use of hpack in http/2. I believe that
there are (and have proposed/argued for) much less complicated
alternatives.

- James

On Mon, May 5, 2014 at 5:47 AM,  <K.Morgan@iaea.org> wrote:
> Below is a review of HTTP/2 and HPACK that I received on a private mailing
> list.  With permission of the author, I am forwarding it to the WG.  I am
> interested to know what you guys think of the author’s suggestions.  -keith
>
>
>
> ...
>
>
>
> HTTP2 and the Internet of Things
>
>
>
> Someone recently pointed me to the HTTP2 specification, specifically
> draft-ietf-httpbis-http2-12.  My first reaction after a quick scan through
> it was "let me guess, this was driven entirely by browser vendors and large
> hosting providers".  HTTP2 adds, and changes, a large number of things that
> have been causing problems for web hosters and browser users, but seems to
> completely ignore the Internet of things.
>
>
>
> By Internet of things I don't mean the buzzword, but the staggering number
> of devices that use HTTP as a universal substrate for getting data from A to
> B.
>
> No-one knows how many things are out there, but it's likely to be far, far
> higher than the number of browser users and web hosting providers (depending
> on who you ask and what source you use, the figure seems to be around 10-20
> billion devices, so it's also significantly larger than the number of people
> on the planet).
>
>
>
> These devices not only don't need stream multiplexing, flow control,
> prioritisation, unsolicited pushing of messages, Huffman-encoding of
> headers, and full orchestration and five part harmony, but don't have the
> resources to implement any of it.  Alternatively, if they do try and
> implement all of the complexity introduced by HTTP 2, they'll invariably do
> the absolute minimum required to allow whatever browser the developers are
> using for testing to connect, and no more, since the developers' goal is to
> create a functioning Internet-of-things device and not a Google-scale web
> service.
>
>
>
> In fact it's going to be impossible to create something on the level
> proposed by the HTTP 2 draft since we're talking about devices running on
> the likes of Cortex M3s, PICs, MSP430s, and ATmegas, which neither need, nor
> have the resources to implement, most of what's in the HTTP 2 draft.  The
> only thing that these devices need, and can support, is a substrate for
> moving a blob of data from A to B.  While something like this is indeed
> buried in HTTP 2 under layers and layers and layers and layers of
> complexity, screaming to get out, there doesn't seem to be any way to easily
> use it as such.
>
>
>
> So it looks like HTTP 2 really needs (at least) two different profiles, one
> for web hosting/web browser users ("HTTP 2 is web scale!") and one for
> HTTP-as-a-substrate users.  The latter should have (or more accurately
> should *not* have) multiple streams and multiplexing, flow control,
> priorities, reprioritisation and dependencies, mandatory payload
> compression, most types of header compression, and many others.
>
>
>
> Now some of this can be faked (e.g. by setting SETTINGS_HEADER_TABLE_SIZE =
> 0, SETTINGS_ENABLE_PUSH = 0, SETTINGS_MAX_CONCURRENT_STREAMS = 1,
> SETTINGS_INITIAL_WINDOW_SIZE = 2^31-1, SETTINGS_COMPRESS_DATA = 0), but a
> lot of it can't, and in any case it shouldn't be necessary to change a whole
> pile of parameters just to get a basic HTTP service.
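A minimal sketch of the "faked" configuration described above, assuming the settings are held in a plain dictionary. The setting names and values are the ones quoted in the review; the dictionary layout and the helper name settings_mismatches are hypothetical, and nothing here encodes an actual SETTINGS frame.

```python
# Illustrative only: names and values come from the quoted text; the
# dict layout and the helper below are assumptions made for this sketch.
MINIMAL_PROFILE_SETTINGS = {
    "SETTINGS_HEADER_TABLE_SIZE": 0,            # no dynamic header table
    "SETTINGS_ENABLE_PUSH": 0,                  # no server push
    "SETTINGS_MAX_CONCURRENT_STREAMS": 1,       # a single stream
    "SETTINGS_INITIAL_WINDOW_SIZE": 2**31 - 1,  # flow control effectively off
    "SETTINGS_COMPRESS_DATA": 0,                # no payload compression
}


def settings_mismatches(advertised):
    """Return {name: (wanted, got)} for every setting that differs from the profile."""
    return {
        name: (wanted, advertised.get(name))
        for name, wanted in MINIMAL_PROFILE_SETTINGS.items()
        if advertised.get(name) != wanted
    }
```

A peer that advertises exactly these values yields an empty result; anything else is reported setting by setting, which makes the "whole pile of parameters" complaint concrete.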
>
>
>
> What would be needed for the Internet-of-things profile is:
>
>
>
> * A single stream, identifier = 1.  HTTP 2 reserves stream 0 for control
>   information which is why stream 1 is used, but for this profile it's
>   assumed that, functionally, stream 1 == stream 0, so a stream error or
>   explicit close (HEADERS flag END_STREAM or RST_STREAM frame) will close
>   all streams.
>
>
>
> * The only allowed frames are HEADERS and DATA, and specifically a single
>   HEADERS frame followed by one or more DATA frames, with no trailer frames.
>   There are no special-case frames like PING (man, this is reinventing IP
>   over HTTP), PUSH_PROMISE, WINDOW_UPDATE (now it's reinventing TCP over
>   HTTP), PRIORITY (to this end, the PRIORITY flag in a HEADERS frame should
>   always be clear), GOAWAY, CONTINUATION (to this end, the END_HEADERS flag
>   in a HEADERS frame should always be set), ALTSVC, BLOCKED, or KITCHENSINK.
>
>
>
> * No requirement to support Gzip.
>
>
>
> * No header compression apart from the use of indexing (the terminology is
>   confusing here, section 4.3.1 calls the use of an index "indexing",
>   sections 4.3.2 and 4.3.3 call the same use of an index "without indexing",
>   I'm going to assume that if an index field is present then it's called
>   "indexed" even if the text says the index is "without indexing").
>
>
>
>   In other words name:value pairs are always sent in one of three forms:
>
>   - An index into the static table (section 4.2).
>
>   - An index into the static table with attached value (section 4.3.2,
>     literal header without indexing, indexed name).
>
>   - The "new name" format if no appropriate table entry exists (section
>     4.3.2, literal header without indexing, new name).
>
>     (As an aside, what's the difference between "Literal Header Field
>     without Indexing", described as "A literal header field without indexing
>     causes the emission of a header field without altering the header
>     table", and "Literal Header Field never Indexed", described as "A
>     literal header field never indexed causes the emission of a header field
>     without altering the header table"?).
>
>
>
>   Any data is sent as octets rather than Huffman encoding, i.e. the 'H' flag
>   is always 0.
>
>   (Good grief, the spec for compression is almost as long as the spec for
>   the rest of HTTP 2.  Apart from the unworkable complexity on limited
>   devices and the fact that it looks like a -00 draft (confusion over what
>   "indexing" means, duplication of 4.3.2 in 4.3.3), we're going to be
>   patching vulns in implementations of this for the next twenty years).
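The frame rules in the list above can be sketched as a small checker. This is only an illustration under stated assumptions: frames are modelled as named tuples with string type names and flag names, not parsed from the wire, and Frame and profile_violation are hypothetical names that do not come from the draft.

```python
from collections import namedtuple

# Hypothetical in-memory frame model: type name, set of flag names, stream id.
Frame = namedtuple("Frame", ["type", "flags", "stream_id"],
                   defaults=[frozenset(), 1])

ALLOWED_TYPES = {"HEADERS", "DATA"}


def profile_violation(frames):
    """Return a description of the first rule broken, or None if all rules hold."""
    if not frames:
        return "empty frame sequence"
    for i, frame in enumerate(frames):
        if frame.type not in ALLOWED_TYPES:
            return f"frame {i}: {frame.type} is not allowed in this profile"
        if frame.stream_id != 1:
            return f"frame {i}: stream {frame.stream_id} used, only stream 1 is allowed"
    first = frames[0]
    if first.type != "HEADERS":
        return "frame 0: the sequence must start with HEADERS"
    if "END_HEADERS" not in first.flags:
        return "frame 0: END_HEADERS must be set (no CONTINUATION)"
    if "PRIORITY" in first.flags:
        return "frame 0: the PRIORITY flag must be clear"
    if len(frames) < 2:
        return "HEADERS must be followed by at least one DATA frame"
    for i, frame in enumerate(frames[1:], start=1):
        if frame.type != "DATA":
            return f"frame {i}: only DATA may follow the single HEADERS frame"
    return None
```

For example, a HEADERS frame with END_HEADERS set followed by two DATA frames passes, while the same sequence with a PING in the middle is rejected at the PING.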
>
>
>
> This takes the enormous implementation complexity of HTTP 2 and profiles it
> for the Internet of things, allowing for the existing widespread use of
> HTTP-as-a-substrate, which is everything that's needed, and indeed
> possible, for vast numbers of devices.
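As a rough illustration of the three header forms the profile allows, the sketch below picks a form for a given name:value pair. The table is a hypothetical excerpt whose entries and index numbers are placeholders rather than a copy of the draft's static table, choose_form is an invented helper name, and no bytes are encoded.

```python
# Hypothetical excerpt of a static table: index -> (name, value).
# Entries and index numbers are placeholders for illustration only.
STATIC_TABLE = {
    2: (":method", "GET"),
    3: (":method", "POST"),
    4: (":path", "/"),
}


def choose_form(name, value):
    """Pick one of the three forms; values stay raw octets (the 'H' flag would be 0)."""
    name_index = None
    for index, (entry_name, entry_value) in sorted(STATIC_TABLE.items()):
        if entry_name != name:
            continue
        if entry_value == value:
            # Form 1: the whole pair is already in the static table.
            return ("indexed", index)
        if name_index is None:
            name_index = index  # remember the first entry with a matching name
    if name_index is not None:
        # Form 2: name taken from the table, value attached as a literal.
        return ("literal, indexed name", name_index, value)
    # Form 3: neither name nor value is in the table.
    return ("literal, new name", name, value)
```

With this excerpt, (":method", "GET") comes back as an index, (":method", "PUT") as an indexed name with a literal value, and an unknown name such as "x-sensor" as a new-name literal. The header table is never modified, which is the point of the profile.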
>
>
>
> As an aside, how much of what's in the HTTP 2 draft is based on empirical
> data?  I know that the design of HTTP 1.1 was based on a large number of
> conference papers and other publications spread over many years that looked
> at HTTP issues and evaluated performance, but it's not so clear that HTTP 2
> has had the same amount of evaluation.  In particular it's implementing
> stacked TCP, which is never a good idea (see "Why TCP Over TCP Is A Bad
> Idea", http://sites.inka.de/~W1011/devel/tcp-tcp.html, or the discussion of
> the "SSH Channel Handbrake" starting at
> http://www.ietf.org/mail-archive/web/tls/current/msg03363.html), and then
> adding even more complexity like prioritisation and dependencies and
> weighting and reprioritisation.  Are the things in HTTP 2 based on any hard
> data - and by this I mean something more than just 'we measured SPDY on
> Google's servers and it was better' - or are they, if you'll excuse the
> expression, just a bunch of Google engineers hacking around?
>
>
>
Received on Monday, 5 May 2014 15:27:27 UTC