Re: http/2 & hpack protocol review from Yoav Nir on 2014-05-05 (ietf-http-wg@w3.org from April to June 2014)

From: Yoav Nir <ynir.ietf@gmail.com>
Date: Mon, 5 May 2014 17:50:52 +0300
To: K.Morgan@iaea.org
Cc: ietf-http-wg@w3.org, C.Brunhuber@iaea.org
Message-Id: <2C1EA49B-A0DF-41E5-B436-8CC2B00FF6A5@gmail.com>
Hi, Keith.

I agree with the author that the HTTP/2 draft was driven by the needs of browser vendors and large hosting providers. It is not an ideal protocol for passing small blobs of data from A to B, but then neither is HTTP/1.1.

I mostly agree with the list of things that the devices don’t need, except for the unsolicited pushing of messages. That seems useful for many of the use cases I’ve heard about for the IoT, and in fact is one of the features of CoAP.

I don’t believe that HTTP/2 would become an optimal protocol for the needs of the devices even with the profile. At best it will be barely acceptable, much like HTTP/1.  This is by no means unique. Neither version of HTTP is ideal for downloading 1.5GB operating system updates either, but they’re both good enough that we don’t look for more appropriate protocols. 

I don’t know much about the Internet of Things and its requirements. But I don’t think it’s a good idea to have all data transfer on the Internet happen over the same layer-5 protocol. The IETF has made CoAP for the use of such devices. Maybe it is appropriate, maybe it is not. And HTTP is certainly easy to deploy because there’s an HTTP library everywhere. But making HTTP fit everybody’s purpose would make it more complex, not less, because all endpoints would have to support the new profile (otherwise we get interop failures).

I guess what I think is that HTTP/2 should be optimized for “the web”. Other use cases should either settle for the not-quite-optimal HTTP/2 (like the OS update download) or use a more appropriate protocol, which could be HTTP/1.x or something else.

Yoav

On May 5, 2014, at 3:47 PM, K.Morgan@iaea.org wrote:

> Below is a review of HTTP/2 and HPACK that I received on a private mailing list.  With permission of the author, I am forwarding it to the WG.  I am interested to know what you guys think of the author’s suggestions.  -keith
>  
> ...
>  
> HTTP2 and the Internet of Things
>  
> Someone recently pointed me to the HTTP2 specification, specifically draft-ietf-httpbis-http2-12.  My first reaction after a quick scan through it was "let me guess, this was driven entirely by browser vendors and large hosting providers".  HTTP2 adds, and changes, a large number of things that have been causing problems for web hosters and browser users, but seems to completely ignore the Internet of things.
>  
> By Internet of things I don't mean the buzzword, but the staggering number of devices that use HTTP as a universal substrate for getting data from A to B.
> No-one knows how many things are out there, but it's likely to be far, far higher than the number of browser users and web hosting providers (depending on who you ask and what source you use, the figure seems to be around 10-20 billion devices, so it's also significantly larger than the number of people on the planet).
>  
> These devices not only don't need stream multiplexing, flow control, prioritisation, unsolicited pushing of messages, Huffman-encoding of headers, and full orchestration and five part harmony, but don't have the resources to implement any of it.  Alternatively, if they do try and implement all of the complexity introduced by HTTP 2, they'll invariably do the absolute minimum required to allow whatever browser the developers are using for testing to connect, and no more, since the developers' goal is to create a functioning Internet-of-things device and not a Google-scale web service.
>  
> In fact it's going to be impossible to create something on the level proposed by the HTTP 2 draft since we're talking about devices running on the likes of Cortex M3s, PICs, MSP430s, and ATmegas, which neither need, nor have the resources to implement, most of what's in the HTTP 2 draft.  The only thing that these devices need, and can support, is a substrate for moving a blob of data from A to B.  While something like this is indeed buried in HTTP 2 under layers and layers and layers and layers of complexity, screaming to get out, there doesn't seem to be any way to easily use it as such.
>  
> So it looks like HTTP 2 really needs (at least) two different profiles, one for web hosting/web browser users ("HTTP 2 is web scale!") and one for HTTP-as-a-substrate users.  The latter should have (or more accurately should *not* have) multiple streams and multiplexing, flow control, priorities, reprioritisation and dependencies, mandatory payload compression, most types of header compression, and many others.
>  
> Now some of this can be faked (e.g. by setting SETTINGS_HEADER_TABLE_SIZE = 0, SETTINGS_ENABLE_PUSH = 0, SETTINGS_MAX_CONCURRENT_STREAMS = 1, SETTINGS_INITIAL_WINDOW_SIZE = 2^31-1, SETTINGS_COMPRESS_DATA = 0), but a lot of it can't, and in any case it shouldn't be necessary to change a whole pile of parameters just to get a basic HTTP service.
>  
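The "faking" described above amounts to sending one SETTINGS frame at connection start. A minimal sketch of what that frame could look like on the wire follows. It assumes the frame layout and setting identifiers of the HTTP/2 specification as later published (RFC 7540), which may not match the layout in the draft under review; SETTINGS_COMPRESS_DATA is left out because it has no identifier in that published version. The names MINIMAL_SETTINGS and build_settings_frame are hypothetical.

```python
import struct

# Setting identifiers as assigned in RFC 7540 (assumption: the reviewed
# draft may number or size these differently).
SETTINGS_HEADER_TABLE_SIZE = 0x1
SETTINGS_ENABLE_PUSH = 0x2
SETTINGS_MAX_CONCURRENT_STREAMS = 0x3
SETTINGS_INITIAL_WINDOW_SIZE = 0x4

# The values suggested in the review: no dynamic header table, no push,
# one stream, and the largest possible flow-control window.
MINIMAL_SETTINGS = [
    (SETTINGS_HEADER_TABLE_SIZE, 0),
    (SETTINGS_ENABLE_PUSH, 0),
    (SETTINGS_MAX_CONCURRENT_STREAMS, 1),
    (SETTINGS_INITIAL_WINDOW_SIZE, 2**31 - 1),
]

SETTINGS_FRAME_TYPE = 0x4


def build_settings_frame(settings):
    """Serialize (identifier, value) pairs into a SETTINGS frame on stream 0."""
    # Each setting is a 16-bit identifier followed by a 32-bit value.
    payload = b"".join(struct.pack("!HI", ident, value)
                       for ident, value in settings)
    length = len(payload)
    # Frame header: 24-bit length, 8-bit type, 8-bit flags, 32-bit stream id.
    header = struct.pack("!BHBBI", length >> 16, length & 0xFFFF,
                         SETTINGS_FRAME_TYPE, 0x0, 0)
    return header + payload


frame = build_settings_frame(MINIMAL_SETTINGS)
print(frame.hex())
```

Even in this reduced form a device still has to emit and parse a binary control frame before it can move any data, which is the overhead the review objects to.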
> What would be needed for the Internet-of-things profile is:
>  
> * A single stream, identifier = 1.  HTTP 2 reserves stream 0 for control
>   information which is why stream 1 is used, but for this profile it's assumed
>   that, functionally, stream 1 == stream 0, so a stream error or explicit
>   close (HEADERS flag END_STREAM or RST_STREAM frame) will close all streams.
>  
> * The only allowed frames are HEADERS and DATA, and specifically a single
>   HEADERS frame followed by one or more DATA frames, with no trailer frames.
>   There are no special-case frames like PING (man, this is reinventing IP over
>   HTTP), PUSH_PROMISE, WINDOW_UPDATE (now it's reinventing TCP over HTTP),
>   PRIORITY (to this end, the PRIORITY flag in a HEADER should always be
>   clear), GOAWAY, CONTINUATION (to this end, the END_HEADERS flag in a HEADER
>   should always be set), ALTSVC, BLOCKED, or KITCHENSINK.
>  
> * No requirement to support Gzip.
>  
> * No header compression apart from the use of indexing (the terminology is
>   confusing here, section 4.3.1 calls the use of an index "indexing", sections
>   4.3.2 and 4.3.3 call the same use of an index "without indexing", I'm going
>   to assume that if an index field is present then it's called "indexed" even
>   if the text says the index is "without indexing"). 
>   
>   In other words name:value pairs are always sent in one of three forms:
>  
>   - An index into the static table (section 4.2).
>  
>   - An index into the static table with attached value (section 4.3.2, literal
>     header without indexing, indexed name).
>  
>   - The "new name" format if no appropriate table entry exists (section 4.3.2,
>     literal header without indexing, new name).
>  
>     (As an aside, what's the difference between "Literal Header Field without
>     Indexing", described as "A literal header field without indexing causes
>     the emission of a header field without altering the header table", and
>     "Literal Header Field never Indexed", described as "A literal header field
>     never indexed causes the emission of a header field without altering the
>     header table"?).
>  
>   Any data is sent as octets rather than Huffman encoding, i.e. the 'H' flag
>   is always 0.
>  
>   (Good grief, the spec for compression is almost as long as the spec for the
>   rest of HTTP 2.  Apart from the unworkable complexity on limited devices and
>   the fact that it looks like a -00 draft (confusion over what "indexing"
>   means, duplication of 4.3.2 in 4.3.3), we're going to be patching vulns in
>   implementations of this for the next twenty years).
>  
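The three header forms listed above are small enough to sketch in full. The encoder below is an illustration only: it assumes the bit patterns and static table numbering of HPACK as later published (RFC 7541), which may differ from the draft being reviewed, and the function names are hypothetical. Strings are always written as raw octets, so the 'H' flag stays 0.

```python
def encode_int(value, prefix_bits, flags=0):
    """Encode an integer with an N-bit prefix; flags fill the upper bits."""
    limit = (1 << prefix_bits) - 1
    if value < limit:
        return bytes([flags | value])
    out = [flags | limit]
    value -= limit
    while value >= 128:
        out.append((value % 128) + 128)  # continuation bit set
        value //= 128
    out.append(value)
    return bytes(out)


def encode_string(data):
    """Length-prefixed string with the 'H' (Huffman) flag clear."""
    return encode_int(len(data), 7, 0x00) + data


def indexed(index):
    """Form 1: a bare index into the static table."""
    return encode_int(index, 7, 0x80)


def literal_indexed_name(name_index, value):
    """Form 2: literal without indexing, name taken from the static table."""
    return encode_int(name_index, 4, 0x00) + encode_string(value)


def literal_new_name(name, value):
    """Form 3: literal without indexing, name and value both spelled out."""
    return b"\x00" + encode_string(name) + encode_string(value)


# Static table entries 2 (":method: GET") and 4 (":path") are as numbered
# in RFC 7541; "x-reading" is a made-up header name for illustration.
block = (indexed(2)
         + literal_indexed_name(4, b"/sample/path")
         + literal_new_name(b"x-reading", b"21.5"))
print(block.hex())
```

With no dynamic table and no Huffman coding, the encoder keeps no state between header blocks, which is roughly the level of effort the profile asks of a small device.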
> This takes the enormous implementation complexity of HTTP 2 and profiles it for the Internet of things, allowing for the existing widespread use of HTTP-as-a-substrate, which is everything that's needed, and indeed possible, for vast numbers of devices.
>  
> As an aside, how much of what's in the HTTP 2 draft is based on empirical data?  I know that the design of HTTP 1.1 was based on a large number of conference papers and other publications spread over many years that looked at HTTP issues and evaluated performance, but it's not so clear that HTTP 2 has had the same amount of evaluation.  In particular it's implementing stacked TCP, which is never a good idea (see "Why TCP Over TCP Is A Bad Idea", http://sites.inka.de/~W1011/devel/tcp-tcp.html, or the discussion of the "SSH Channel Handbrake" starting at http://www.ietf.org/mail-archive/web/tls/current/msg03363.html), and then adding even more complexity like prioritisation and dependencies and weighting and reprioritisation.  Are the things in HTTP 2 based on any hard data - and by this I mean something more than just 'we measured SPDY on Google's servers and it was better' - or are they, if you'll excuse the expression, just a bunch of Google engineers hacking around?
>  
Received on Monday, 5 May 2014 14:51:25 UTC