HTTP2 Expression of Interest

HTTP2 Expression of Interest - Rack

Good day,

This is the Rack team's HTTP2 Expression of Interest.

1. Introduction

Rack provides a minimal interface between web servers and web frameworks and
applications.

Rack is a Ruby-based library for HTTP "glue", similar in purpose to WSGI.
Although it is a Ruby library, Rack has had a significant influence on many
frameworks in other languages, with ports and/or strongly related APIs existing
in every major language and many minor languages in use today.

Rack itself is not a pure expression of HTTP, being strongly influenced by CGI
to date. Rack users have expressed strong interest in areas that are presently
difficult to implement in HTTP 1.1, and we hope that we can present many of
these opinions and challenges on behalf of our users.
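
For readers unfamiliar with the interface, the following is a minimal sketch
of a Rack-style application in plain Ruby. The environment Hash is built by
hand here purely for illustration; a real server supplies many more keys. The
CGI influence is visible in key names such as REQUEST_METHOD and PATH_INFO.

```ruby
# A minimal Rack-style application: any object that responds to #call,
# takes an environment Hash, and returns [status, headers, body].
app = lambda do |env|
  # REQUEST_METHOD and PATH_INFO are CGI-style keys.
  text = "#{env['REQUEST_METHOD']} #{env['PATH_INFO']}\n"
  headers = {
    'Content-Type'   => 'text/plain',
    'Content-Length' => text.bytesize.to_s
  }
  [200, headers, [text]]
end

# Hand-built environment, for illustration only.
env = { 'REQUEST_METHOD' => 'GET', 'PATH_INFO' => '/hello' }
status, headers, body = app.call(env)

puts status
body.each { |chunk| print chunk }
```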

Rack does not currently deploy any of the related protocols directly, although
there are projects that provide Rack-compatible interfaces. There are
significant numbers of Rack deployments globally. Rack is very closely related
to HTTP, and with a suitable specification, should come to embody HTTP 2.0
semantics.

2. Criteria for HTTP/2.0

2.1 Application Transport Performance

While there is much focus on bandwidth-specific performance optimizations, it is
an equally important requirement that server-side performance be optimized. At
present, validating HTTP parsers, meaning those that parse and validate the full
range of data that may be transported over HTTP, are extremely expensive. In
some cases, such as the use of Multipart Form Data, this is in no small part due
to the dynamically framed nature of that sub-protocol. It should further be
noted that many common deployments in fact parse HTTP in three or more
locations: load balancers, front-end servers and back-end servers. The parsing
efficiency at each location can likely be optimized without drastic impact on
other bandwidth-oriented optimizations. It is our desire that any future HTTP
standard eradicate the need for such sub-protocols where they are in extremely
common use. It is further desired that the replacements be strictly framed,
allowing for deterministic parser performance and static optimizations.
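
The difference between strict and dynamic framing can be sketched roughly as
follows. The frame layout used here (a 4-byte big-endian length followed by
the payload) is an assumption made for illustration and is not taken from any
proposal, and the boundary splitter is a heavily simplified stand-in for real
multipart parsing. All method names are hypothetical.

```ruby
require 'stringio'

# Hypothetical frame layout: 4-byte big-endian length, then that many bytes.
def encode_frame(payload)
  [payload.bytesize].pack('N') + payload.b
end

# Strictly framed: the reader knows how many bytes to take before it looks
# at any of them, so payload content is never inspected.
def read_frames(io)
  frames = []
  while (header = io.read(4))
    raise EOFError, 'truncated frame header' if header.bytesize < 4
    length = header.unpack1('N')
    payload = io.read(length) || ''.b
    raise EOFError, 'truncated frame payload' if payload.bytesize < length
    frames << payload
  end
  frames
end

# Dynamically framed (multipart style, greatly simplified): the whole body
# must be scanned for the delimiter before part boundaries are known.
def split_on_boundary(data, boundary)
  parts = data.split("--#{boundary}").map(&:strip)
  parts.reject { |part| part.empty? || part == '--' }
end

stream = StringIO.new(encode_frame('first') +
                      encode_frame("second\r\n--looks-like-a-boundary"))
frames = read_frames(stream)
puts frames.length
```

Note that the second payload contains text resembling a delimiter, which the
length-prefixed reader passes through untouched because it never examines the
payload bytes.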

2.2 Application Semantics

In particular areas, specifically pipelining and idempotency, HTTP 1.1 leaves
guesswork to the implementors. We believe that these areas of the
specification should be relaxed and/or removed in order to clarify specific
intent and viable use cases. Such clarification should aim to increase uptake,
and increase the likelihood that any server or client implementation of the
protocol is indeed a full implementation of the protocol. The protocol should
include as few SHOULD or MAY keywords as possible, and further, should attempt
to include those keywords on minimal elements of the protocol. Usage of SHOULD
or MAY on large components of the protocol (such as Content-Length, Connection,
Encoding and related concepts) leads to compatibility problems. We believe that a
sufficiently lightweight chunked semantic for data bodies would be appropriate
for all modern use cases.

2.3 Encodings

HTTP 1.1 does not sufficiently specify character encodings throughout the
protocol, and this becomes problematic for application authors. The problems
become more invasive still when they spread to other sub-components,
specifically Multipart. In these areas we have observed significant variance
among User Agents and implementors that causes downstream problems for
applications.
Declaring both encodings and escape semantics for each segment of the protocol
will greatly simplify these problems.

2.4 Files

We strongly recommend that Files, both in downloads and uploads, be handled
natively by the protocol. If the protocol is multiplexed then multi-file uploads
may be best handled by multiple requests with related identifiers. The current
Multipart-based implementations are, as already described, problematic for a
great many reasons.

2.5 Trailing Headers

HTTP 1.1 provided support for Trailing Headers and, while they may be useful for
certain data (e.g. response metrics), they are barely supported. If any such
provision is made by the new protocol, we would encourage that support for it
be a MUST. In general, this requirement is easily handled by any protocol that
has a response completion message of some form; the lack of one has led to
complexities with both Pipelining and Trailers in client and server
implementations of prior HTTP versions.
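
As a rough illustration, the sketch below encodes a body using the HTTP 1.1
chunked transfer coding and appends a trailer after the terminating
zero-length chunk. That final chunk and blank line act as the kind of
completion message described above. The method name and the X-Response-Time
field are hypothetical, chosen only for the example.

```ruby
CRLF = "\r\n"

# Encode body chunks with HTTP 1.1 chunked transfer coding, writing any
# trailer fields after the terminating zero-length chunk.
def chunked_with_trailers(chunks, trailers = {})
  out = +''
  chunks.each do |chunk|
    next if chunk.empty? # a zero-size chunk would end the body early
    out << chunk.bytesize.to_s(16) << CRLF << chunk << CRLF
  end
  out << '0' << CRLF # last-chunk marker
  trailers.each { |name, value| out << "#{name}: #{value}" << CRLF }
  out << CRLF # blank line: the message is complete
end

# Hypothetical trailer carrying a response metric.
wire = chunked_with_trailers(['Hello, ', 'world'],
                             { 'X-Response-Time' => '12ms' })
print wire
```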

2.6 Security and Resilience

HTTP is presently a very simple protocol. Despite the simplicity of the
protocol, it has proven to be only somewhat resilient to certain attacks.
Specifically, DDoS attacks can be very difficult to combat for many of our users.
We are a little concerned that some of the authors of the existing proposals
have a disproportionate volume of resources to handle such problems, and as such
are less concerned with complex protocol semantics. In contrast, many of our
users will have significant struggles with the plethora of potential new
conditions under which attacks might operate. Under these proposals, all areas
of the transport become much more complex, from the raw wire data right through
to the application level memory state, which must also be both more complex and
more sizable.

2.7 Authentication

Improved session and authentication management is important; however, this may
or may not be easily solved within the realms of HTTP. It is our observation
that there is a direct consumer conflict between Authentication and Anonymity,
and that until this social issue is better understood, solutions may struggle.
There has been some discussion around the concept of specifically splitting
these two use cases at the protocol level: one sub-protocol for authenticated
streams, and one sub-protocol for anonymous streams. The interactions between
such streams, particularly at the hyperlink level, would still require some
careful thought and guidance in order to be effective. It should be noted that
this approach may have UX advantages on both the consumer side, and for the
application author.

2.8 Usable Defaults

Several of the specifications that have been proposed include large volumes of
work focused on improving the bandwidth performance of the Header section. It is
our opinion that a selection of common defaults (for character sets, content
encodings, accepted encodings, and so on) can help significantly with this
problem. It might be well worth considering SHOULD clauses that define
recommended values for these fields and SHOULD NOT clauses recommending their
omission when those defaults are acceptable.
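
A minimal sketch of the idea follows. The table of default values is invented
for illustration only (no proposal or specification defines it), and the
method names are hypothetical. The sender omits any field that matches the
agreed default, and the receiver restores it.

```ruby
# Hypothetical defaults, chosen only for illustration.
DEFAULTS = {
  'accept-charset'  => 'utf-8',
  'accept-encoding' => 'gzip'
}.freeze

# Sender side: drop any field whose value equals the agreed default.
def strip_defaults(headers)
  headers.reject { |name, value| DEFAULTS[name] == value }
end

# Receiver side: fill in the default for any field that was omitted.
def apply_defaults(headers)
  DEFAULTS.merge(headers)
end

request = {
  'host'            => 'example.org',
  'accept-charset'  => 'utf-8',   # matches the default, so it is omitted
  'accept-encoding' => 'deflate'  # differs from the default, so it is sent
}

sent = strip_defaults(request)
received = apply_defaults(sent)
puts sent.keys.join(', ')
```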

2.9 Debugging, Learning and the Power of ASCII

Many of the "original protocols" (or at least, those early enough to be part of
the internet explosion) are ASCII protocols. This is not because they are
efficient or inefficient; it is because they are simple. In the general course
of debugging, it is not uncommon for consumers of a protocol to write the
protocol by hand at some point. This will simply no longer be possible for a
binary protocol. While that alone may not be a reason for failure, it may be
worth considering that one should be able to compose a protocol message from a
simple stream of commands. If the protocol is so complex as to require a
complex state machine at both ends, it is likely too complex for most people to
reason about. This would be damaging in the long run, and potentially even for
immediate term uptake, particularly if there are early security concerns.

2.10 Server Push

There is already a great deal of complexity in handling multiplexed IO at the
very high level at which HTTP 1.x operates. Adding push capabilities to the
native request/response protocol further muddies these waters, and the
concepts of prioritization are almost insurmountable for many general purpose
implementations. In our experience it is best to hand off these more complex
requirements to dedicated protocols that are better suited to the task. These
protocols may be handled using dedicated code that can be independently
verified and controlled.

3. HTTP/2.0 Proposals

The common focus on bandwidth in these proposals, rather than on the application
level, is somewhat alarming. Bandwidth and latency are improving
significantly for all Internet users (with the exception of Buffer Bloat
problems), and as such it seems that HTTP's weaknesses are not really in this
area. The lack of a progressive proposal that merely aims to address some of the
core application semantic problems in HTTP 1.1 makes the choice of a preferred
baseline difficult. Indeed, it is our preference to use HTTP 1.1 as a
baseline for HTTP 2.0, and to address semantic and framing problems within the
scope of the simple ASCII protocol, rather than to add significantly to runtime
state complexity.

3.1 draft-mbelshe-httpbis-spdy-00

SPDY is an interesting protocol from the point of view of its potential
capabilities as a generic application data transport protocol. It certainly
addresses many of the concerns of very large deployments. Our primary concern
with SPDY is that it is much too complex, which poses many risks:
 * Security
 * Scalability for small installations
 * Resilience to poor clients
 * Resilience to poor servers
 * Complexity for debugging

The forced lower casing of header segments is advisable from an application
semantic standpoint, as it prevents the need for continual coercive construction
of reply headers, and case agnostic lookup of request headers.
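
The cost being avoided can be sketched as follows, using plain Ruby Hashes and
hypothetical method names. With mixed-case names every lookup needs a
case-insensitive comparison; with names forced to lower case a direct Hash
lookup is enough.

```ruby
# With mixed-case names, each lookup compares case-insensitively, which
# means a linear scan (or a wrapper that normalises names on every access).
def lookup_any_case(headers, name)
  pair = headers.find { |key, _value| key.casecmp?(name) }
  pair && pair.last
end

# With names forced to lower case on the wire, a plain Hash lookup suffices.
def lookup_lowercase(headers, name)
  headers[name]
end

mixed = { 'Content-Type' => 'text/html' }
lower = { 'content-type' => 'text/html' }

puts lookup_any_case(mixed, 'content-type')
puts lookup_lowercase(lower, 'content-type')
```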

Due to the above complexities, it is hard to recommend SPDY as a baseline. It
may be well suited to deployments with large expert operational staff; however,
it is correspondingly ill suited to the small, non-expert staff found in
rest-of-the-world deployment scenarios.

3.2 draft-montenegro-httpbis-speed-mobility-02

There appears to be very little difference in complexity between SPDY and
Speed+Mobility. While initially charming, the concept of reusing the WebSocket
protocol seems to merely remove many of the advantages that SPDY offers,
specifically at connection setup time. While there is still significant
advantage for web applications, there is questionable advantage for web sites.
When the target is for a major change in the protocol, the wrapping within prior
versions seems to be a large amount of overhead.

3.3 draft-tarreau-httpbis-network-friendly-00

Marginally preferred as a starting point among the three available, this
protocol proposal appears to have fewer controlling elements. This simplicity is
the primary reason for the preference. Entity frames of this nature may be able
to be adapted to also handle multiple files with trailers, in a simple linear
state machine, requiring fewer per-session resources than SPDY. The frame header
indicating trailers is potentially problematic for some applications. If the
protocol had a response termination concept, such a concept would potentially
allow middleware to add trailers without application knowledge. Grouped header
fields are an interesting concept, but may be subject to many subtle security
complexities. There is still a complexity with request pipelining for the
purposes of multiplexed file uploads, and as mentioned above, it is preferred to
avoid multipart to work around such restrictions.

As with Speed+Mobility, the HTTP/1.1 upgrade path may be useful to some clients,
but ideally need not be preferred. It is possible that client applications might
encode http2 as a concept that could avoid this requirement, and a protocol
should take advantage of this to avoid extending the lifetime of HTTP 1.1
parsers into the next few decades.

4. Apology

Apologies for the late entry and for the consequently somewhat disconnected,
direct and less positive summary reviews. We are very interested to see what
comes out of the working group, and will be watching (and implementing) with
keen eyes.



Thank you for your time reading,

James Tucker

Received on Monday, 16 July 2012 04:50:57 UTC