Re: http/2 & hpack protocol review from Erik Nygren on 2014-05-05 (ietf-http-wg@w3.org from April to June 2014)

From: Erik Nygren <erik@nygren.org>
Date: Mon, 5 May 2014 11:50:34 -0400
To: Yoav Nir <ynir.ietf@gmail.com>
Cc: K.Morgan@iaea.org, "ietf-http-wg@w3.org Group" <ietf-http-wg@w3.org>, C.Brunhuber@iaea.org
Message-ID: <CAKC-DJjnfreEMzMbeiaL+iDMMQq7BDpk_3YaosAGgJccZOFhjQ@mail.gmail.com>
I do think that the idea of a light-weight HTTP/2 profile may make sense.
There are some aspects of HTTP/2 that potentially make life easier for
embedded devices; the more binary-oriented structure is one example.
Similarly, some elements of the data exchange may actually make life
easier in an IoT world.  Multiple streams, PING, the data framing,
explicit end-of-stream notification, and footers may all be useful in some
telemetry applications.  (While not a low-resource application, I'm aware
of at least one proprietary system that uses SPDY framing for telemetry
where many of those elements came in handy.)

Is anyone trying to implement a light-weight embedded client/server as part
of the interop testing?  It seems like it would be valuable to understand
which elements are hard in a reduced-resource profile (e.g., requiring gzip
seems unfortunate for these cases) and which elements are actually helpful
(many of the other framing elements).  Some of the flow-control even seems
like it could be useful in some limited resource environments.

       Erik



On Mon, May 5, 2014 at 10:50 AM, Yoav Nir <ynir.ietf@gmail.com> wrote:

> Hi, Keith.
>
> I agree with the author that the HTTP/2 draft was driven by the needs of
> browser vendors and large hosting providers. It is not an ideal protocol
> for passing small blobs of data from A to B, but then neither is HTTP/1.1.
>
> I mostly agree with the list of things that the devices don’t need, except
> for the unsolicited pushing of messages. That seems useful for many of the
> use cases I’ve heard about for the IoT, and in fact is one of the features
> of CoAP.
>
> I don’t believe that HTTP/2 would become an optimal protocol for the needs
> of the devices even with the profile. At best it will be barely acceptable,
> much like HTTP/1.  This is by no means unique. Neither version of HTTP is
> ideal for downloading 1.5GB operating system updates either, but they’re
> both good enough that we don’t look for more appropriate protocols.
>
> I don’t know much about the Internet of Things and their requirements. But
> I don’t think it’s a good idea to have all data transfer on the Internet
> happen over the same layer-5 protocol. The IETF has made CoAP for the
> use of such devices. Maybe it is appropriate, maybe it is not. And HTTP is
> certainly easy to deploy because there’s an HTTP library everywhere. But
> making HTTP fit everybody’s purpose would make it more complex, not less,
> because all endpoints would have to support the new profile (otherwise we
> get interop failures).
>
> I guess what I think is that HTTP/2 should be optimized for “the web”.
> Other use cases should either settle for the not-quite-optimal HTTP/2 (like
> the OS update download) or use a more appropriate protocol, which could be
> HTTP/1.x or something else.
>
> Yoav
>
> On May 5, 2014, at 3:47 PM, K.Morgan@iaea.org wrote:
>
> Below is a review of HTTP/2 and HPACK that I received on a private mailing
> list.  With permission of the author, I am forwarding it to the WG.  I am
> interested to know what you guys think of the author’s suggestions.  -keith
>
>
> ...
>
>
> HTTP2 and the Internet of Things
>
>
> Someone recently pointed me to the HTTP2 specification, specifically
> draft-ietf-httpbis-http2-12.  My first reaction after a quick scan through
> it was "let me guess, this was driven entirely by browser vendors and large
> hosting providers".  HTTP2 adds, and changes, a large number of things that
> have been causing problems for web hosters and browser users, but seems to
> completely ignore the Internet of things.
>
>
> By Internet of things I don't mean the buzzword, but the staggering number
> of devices that use HTTP as a universal substrate for getting data from A
> to B.  No-one knows how many things are out there, but it's likely to be
> far, far higher than the number of browser users and web hosting providers
> (depending on who you ask and what source you use, the figure seems to be
> around 10-20 billion devices, so it's also significantly larger than the
> number of people on the planet).
>
>
> These devices not only don't need stream multiplexing, flow control,
> prioritisation, unsolicited pushing of messages, Huffman-encoding of
> headers, and full orchestration and five part harmony, but don't have the
> resources to implement any of it.  Alternatively, if they do try and
> implement all of the complexity introduced by HTTP 2, they'll invariably do
> the absolute minimum required to allow whatever browser the developers are
> using for testing to connect, and no more, since the developers' goal is to
> create a functioning Internet-of-things device and not a Google-scale web
> service.
>
>
> In fact it's going to be impossible to create something on the level
> proposed by the HTTP 2 draft since we're talking about devices running on
> the likes of Cortex M3s, PICs, MSP430s, and ATmegas, which neither need,
> nor have the resources to implement, most of what's in the HTTP 2 draft.
> The only thing that these devices need, and can support, is a substrate for
> moving a blob of data from A to B.  While something like this is indeed
> buried in HTTP 2 under layers and layers and layers and layers of
> complexity, screaming to get out, there doesn't seem to be any way to
> easily use it as such.
>
>
> So it looks like HTTP 2 really needs (at least) two different profiles,
> one for web hosting/web browser users ("HTTP 2 is web scale!") and one for
> HTTP-as-a-substrate users.  The latter should have (or more accurately
> should *not* have) multiple streams and multiplexing, flow control,
> priorities, reprioritisation and dependencies, mandatory payload
> compression, most types of header compression, and many others.
>
>
> Now some of this can be faked (e.g. by setting SETTINGS_HEADER_TABLE_SIZE
> = 0, SETTINGS_ENABLE_PUSH = 0, SETTINGS_MAX_CONCURRENT_STREAMS = 1,
> SETTINGS_INITIAL_WINDOW_SIZE = 2^31-1, SETTINGS_COMPRESS_DATA = 0), but a
> lot of it can't, and in any case it shouldn't be necessary to change a
> whole pile of parameters just to get a basic HTTP service.
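
As a rough illustration of the parameter list above, here is a minimal
Python sketch that serializes those settings into one SETTINGS frame.  It
assumes the wire layout of the later-published RFC 7540 (9-octet frame
header, 16-bit identifier plus 32-bit value per setting), which is not
necessarily the draft-12 layout.  SETTINGS_COMPRESS_DATA is left out
because RFC 7540 assigns it no identifier.  The names build_settings_frame
and MINIMAL_PROFILE are invented for this example.

```python
import struct

# Setting identifiers as assigned in RFC 7540, section 6.5.2 (assumption:
# the draft under review may number or size these differently).
SETTINGS_HEADER_TABLE_SIZE = 0x1
SETTINGS_ENABLE_PUSH = 0x2
SETTINGS_MAX_CONCURRENT_STREAMS = 0x3
SETTINGS_INITIAL_WINDOW_SIZE = 0x4

FRAME_TYPE_SETTINGS = 0x4


def build_settings_frame(settings):
    """Serialize (identifier, value) pairs into a single SETTINGS frame."""
    payload = b"".join(
        struct.pack("!HI", ident, value) for ident, value in settings
    )
    # Frame header: 24-bit length, 8-bit type, 8-bit flags, 32-bit field
    # holding the reserved bit and the stream identifier (0 for SETTINGS).
    length = struct.pack("!I", len(payload))[1:]
    header = length + struct.pack("!BBI", FRAME_TYPE_SETTINGS, 0, 0)
    return header + payload


# Hypothetical "minimal" parameter set taken from the list in the text.
MINIMAL_PROFILE = [
    (SETTINGS_HEADER_TABLE_SIZE, 0),
    (SETTINGS_ENABLE_PUSH, 0),
    (SETTINGS_MAX_CONCURRENT_STREAMS, 1),
    (SETTINGS_INITIAL_WINDOW_SIZE, 2**31 - 1),
]

frame = build_settings_frame(MINIMAL_PROFILE)
```

Even in this small form the endpoint still has to emit the frame and
process the peer's acknowledgement, which is the overhead the text objects
to.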
>
>
> What would be needed for the Internet-of-things profile is:
>
>
> * A single stream, identifier = 1.  HTTP 2 reserves stream 0 for control
>   information which is why stream 1 is used, but for this profile it's
>   assumed that, functionally, stream 1 == stream 0, so a stream error or
>   explicit close (HEADERS flag END_STREAM or RST_STREAM frame) will close
>   all streams.
>
>
> * The only allowed frames are HEADERS and DATA, and specifically a single
>   HEADERS frame followed by one or more DATA frames, with no trailer
>   frames.  There are no special-case frames like PING (man, this is
>   reinventing IP over HTTP), PUSH_PROMISE, WINDOW_UPDATE (now it's
>   reinventing TCP over HTTP), PRIORITY (to this end, the PRIORITY flag in
>   a HEADERS frame should always be clear), GOAWAY, CONTINUATION (to this
>   end, the END_HEADERS flag in a HEADERS frame should always be set),
>   ALTSVC, BLOCKED, or KITCHENSINK.
>
>
> * No requirement to support Gzip.
>
>
> * No header compression apart from the use of indexing (the terminology is
>   confusing here, section 4.3.1 calls the use of an index "indexing",
>   sections 4.3.2 and 4.3.3 call the same use of an index "without
>   indexing", I'm going to assume that if an index field is present then
>   it's called "indexed" even if the text says the index is "without
>   indexing").
>
>
>   In other words name:value pairs are always sent in one of three forms:
>
>
>   - An index into the static table (section 4.2).
>
>
>   - An index into the static table with attached value (section 4.3.2,
>     literal header without indexing, indexed name).
>
>
>   - The "new name" format if no appropriate table entry exists (section
>     4.3.2, literal header without indexing, new name).
>
>
>     (As an aside, what's the difference between "Literal Header Field
>     without Indexing", described as "A literal header field without
>     indexing causes the emission of a header field without altering the
>     header table", and "Literal Header Field never Indexed", described as
>     "A literal header field never indexed causes the emission of a header
>     field without altering the header table"?).
>
>
>   Any data is sent as octets rather than Huffman encoding, i.e. the 'H'
>   flag is always 0.
>
>
>   (Good grief, the spec for compression is almost as long as the spec for
>   the rest of HTTP 2.  Apart from the unworkable complexity on limited
>   devices and the fact that it looks like a -00 draft (confusion over what
>   "indexing" means, duplication of 4.3.2 in 4.3.3), we're going to be
>   patching vulns in implementations of this for the next twenty years).
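
To make the three header forms listed above concrete, here is a small
Python sketch of an encoder that uses only those forms, with the 'H' flag
always 0 and no header table updates.  It follows the bit patterns and
static table of the published RFC 7541, which may differ in detail from
the HPACK draft being reviewed.  All function names are invented for this
example.

```python
def encode_integer(value, prefix_bits, pattern):
    """Prefix-coded integer (RFC 7541, section 5.1).

    `pattern` carries the representation bits above the prefix.
    """
    limit = (1 << prefix_bits) - 1
    if value < limit:
        return bytes([pattern | value])
    out = [pattern | limit]
    value -= limit
    while value >= 128:
        out.append((value % 128) + 128)
        value //= 128
    out.append(value)
    return bytes(out)


def encode_string(text):
    """String literal sent as raw octets: H flag 0, 7-bit length prefix."""
    data = text.encode("ascii")
    return encode_integer(len(data), 7, 0x00) + data


def indexed(index):
    """Form 1: an index into the static table (pattern 1xxxxxxx)."""
    return encode_integer(index, 7, 0x80)


def literal_indexed_name(index, value):
    """Form 2: literal without indexing, name taken from the table."""
    return encode_integer(index, 4, 0x00) + encode_string(value)


def literal_new_name(name, value):
    """Form 3: literal without indexing, name and value both spelled out."""
    return b"\x00" + encode_string(name) + encode_string(value)


# Static table entries per RFC 7541 Appendix A: 2 is ":method: GET",
# 4 is ":path".  The "x-unit" header is a made-up example field.
block = (
    indexed(2)
    + literal_indexed_name(4, "/reading")
    + literal_new_name("x-unit", "kPa")
)
```

A decoder restricted to the same three forms needs the static table and
the integer routine, but no dynamic table and no Huffman tree.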
>
>
> This takes the enormous implementation complexity of HTTP 2 and profiles
> it for the Internet of things, allowing for the existing widespread use of
> HTTP-as-a-substrate, which is everything that's needed, and indeed
> possible, for vast numbers of devices.
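
One way to picture the frame rules of the proposed profile is as a check
over a sequence of frame headers.  The sketch below is only an
illustration: the frame type and flag values are those of the published
RFC 7540 rather than draft-12, frames are represented as hypothetical
(type, flags, stream_id) tuples, and check_profile is an invented name.
It does not check END_STREAM, since the proposal does not say which frame
must carry it.

```python
# Frame types and HEADERS flags as defined in RFC 7540 (assumption: the
# draft under review may use different values).
FRAME_DATA = 0x0
FRAME_HEADERS = 0x1
FLAG_END_HEADERS = 0x4
FLAG_PRIORITY = 0x20


def check_profile(frames):
    """Return None if the frames fit the profile, else a reason string.

    `frames` is a list of (frame_type, flags, stream_id) tuples.
    """
    if len(frames) < 2:
        return "need one HEADERS frame and at least one DATA frame"
    for _, _, stream_id in frames:
        if stream_id != 1:
            return "only stream 1 is allowed"
    first_type, first_flags, _ = frames[0]
    if first_type != FRAME_HEADERS:
        return "first frame must be HEADERS"
    if not first_flags & FLAG_END_HEADERS:
        return "END_HEADERS must be set (no CONTINUATION)"
    if first_flags & FLAG_PRIORITY:
        return "PRIORITY flag must be clear"
    for frame_type, _, _ in frames[1:]:
        if frame_type != FRAME_DATA:
            return "only DATA frames may follow HEADERS"
    return None


# A conforming exchange: HEADERS with END_HEADERS, then two DATA frames.
ok = check_profile([
    (FRAME_HEADERS, FLAG_END_HEADERS, 1),
    (FRAME_DATA, 0x0, 1),
    (FRAME_DATA, 0x1, 1),
])
```

The whole receive path for such a profile reduces to a two-state loop
(expect HEADERS, then expect DATA), which is roughly the level of
complexity the proposal is asking for.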
>
>
> As an aside, how much of what's in the HTTP 2 draft is based on empirical
> data?  I know that the design of HTTP 1.1 was based on a large number of
> conference papers and other publications spread over many years that looked
> at HTTP issues and evaluated performance, but it's not so clear that HTTP 2
> has had the same amount of evaluation.  In particular it's implementing
> stacked TCP, which is never a good idea (see "Why TCP Over TCP Is A Bad
> Idea", http://sites.inka.de/~W1011/devel/tcp-tcp.html, or the discussion
> of the "SSH Channel Handbrake" starting at
> http://www.ietf.org/mail-archive/web/tls/current/msg03363.html), and then
> adding even more complexity like prioritisation and dependencies and
> weighting and reprioritisation.  Are the things in HTTP 2 based on any hard
> data - and by this I mean something more than just 'we measured SPDY on
> Google's servers and it was better' - or are they, if you'll excuse the
> expression, just a bunch of Google engineers hacking around?
>
Received on Monday, 5 May 2014 15:51:08 UTC