Re: HTTP/2 Expression of luke-warm interest: Varnish

On Mon, Jul 16, 2012 at 2:18 PM, Poul-Henning Kamp <phk@phk.freebsd.dk> wrote:
> In message <20120716204650.GI27010@1wt.eu>, Willy Tarreau writes:
>
>>I also remember that in private with a few of us you proposed a draft
>>that you never published. There were very valid concerns here. Why do
>>you not share it so that good ideas can be picked from it ?
>
> The main reason it never made it to an ID is that there quite
> clearly is no room in the timeplan for the kind of fundamental
> contemplation I wanted to inspire, so it seemed a waste of both
> my own and everybody else's time.
>
> There's nothing secret about it, and I don't think there is anything
> in it which hasn't already been mentioned in emails too:
>
>    http://phk.freebsd.dk/misc/draft-kamp-httpbis-http-20-architecture-01.txt

Thanks for sharing this.  Here are some thoughts on it, from my
perspective as a person who develops load balancers for a living:

My key takeaway from your draft is that you're advocating for HTTP/2.0
to have a wire format that can be parsed with efficiency much closer
to TCP (fixed-sized fields, binary values, both the code and the data
can fit in a small number of cache lines) than to HTTP/1.1
(variable-length fields with nontrivial delimiter semantics, ASCII
values, not-at-all-compact data, parsing code ends up being large and
full of conditional branches).
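To make the contrast concrete, here is a rough sketch in C of the two
parsing styles.  The binary layout it assumes (a 32-bit length followed
by 16-bit offset/length pairs at fixed positions) is invented purely for
illustration and is not taken from the draft:

/*
 * Back-of-the-envelope comparison of the two parsing styles.
 * The binary field positions below are hypothetical.
 */
#include <stdint.h>
#include <string.h>

/* Fixed-offset, binary: each field is a load from a known position. */
static uint16_t
be16(const uint8_t *p)
{
    return ((uint16_t)p[0] << 8 | p[1]);
}

static int
parse_binary_preamble(const uint8_t *buf, size_t len,
    uint16_t *path_off, uint16_t *path_len)
{
    if (len < 8)                 /* one bounds check for the whole preamble */
        return (-1);
    *path_off = be16(buf + 4);   /* hypothetical field positions */
    *path_len = be16(buf + 6);
    return (0);
}

/* HTTP/1.1 request line: scan the bytes for delimiters. */
static int
parse_text_request_line(const char *buf, size_t len,
    const char **path, size_t *path_len)
{
    const char *sp1, *sp2;

    sp1 = memchr(buf, ' ', len);                        /* end of method */
    if (sp1 == NULL)
        return (-1);
    sp2 = memchr(sp1 + 1, ' ', len - (sp1 + 1 - buf));  /* end of target */
    if (sp2 == NULL)
        return (-1);
    *path = sp1 + 1;
    *path_len = (size_t)(sp2 - (sp1 + 1));
    return (0);   /* and the header fields are still left to parse */
}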

On one hand, that concept has a definite appeal.  In every web server,
proxy, and load balancer on which I've ever worked, HTTP parsing has
been embarrassingly expensive.  On the other hand, that embarrassingly
expensive parsing cost still tends to be only a single-digit
percentage of total CPU usage in real-world proxies.  So is it
worthwhile to reduce that parsing cost?  Well, it depends on the
tradeoffs we have to make in the process.

Section 9.1 defines a very small set of message components that are
available to the "HTTP router" in an efficient representation.  An
HTTP router will have fast access to the requested path, the requested
hostname, and a session ID, plus everything in the L3 and L4 headers.
That's a proper subset of the things a load balancer can use to choose
a destination.  It's an adequate subset for some sites, but not for
others.  In my own experience working at websites ranging from tiny to
very large, the ones with very high traffic also had complicated load
balancing rules - e.g., routing traffic based on cookies that
persisted across sessions, routing based on query parameters, and even
routing based on custom headers inserted by a previous hop in a
multi-tier hierarchy.  In the terminology of section 9.2, the larger
sites needed "regular HTTP proxies."  In my terminology, they needed
load balancers.  In marketing terminology, they needed Application
Delivery Controllers.
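Here is a sketch in C of the kind of routing rule I mean.  The field
names, the cookie, the query parameter, and the header are all invented
for illustration; the point is only that none of these inputs are in the
section 9.1 set:

/*
 * Hypothetical L7 routing decision of the kind large sites use.
 * Backend names and match strings are made up.
 */
#include <string.h>

struct request_view {
    const char *path;           /* available per section 9.1           */
    const char *host;           /* available per section 9.1           */
    const char *cookie_header;  /* full Cookie: value, or NULL         */
    const char *query;          /* raw query string, or NULL           */
    const char *x_tier;         /* X-Tier: set by a previous hop       */
};

static const char *
choose_backend(const struct request_view *r)
{
    /* Sticky routing on a long-lived cookie, not a per-session ID. */
    if (r->cookie_header != NULL &&
        strstr(r->cookie_header, "user_shard=eu") != NULL)
        return ("pool-eu");

    /* Route A/B-test traffic by query parameter. */
    if (r->query != NULL && strstr(r->query, "experiment=beta") != NULL)
        return ("pool-beta");

    /* Honor a routing hint inserted by an earlier tier. */
    if (r->x_tier != NULL && strcmp(r->x_tier, "static") == 0)
        return ("pool-static");

    return ("pool-default");
}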

But that's just my experience.  In general, the concept of
representing a few commonly-referenced message components in a
lightweight, cheap-to-parse format might be useful to people, and it's
something that could be added into a design like SPDY or HTTP
Speed+Mobility.
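For instance, one way it could be grafted onto a SPDY-style design
(sketched below with a frame layout I invented; it is not part of SPDY,
HTTP Speed+Mobility, or the draft) is a small fixed-size routing block
in front of the compressed header block, so an intermediary can pick a
backend without decompressing anything:

/*
 * Hypothetical routing block prepended to an otherwise opaque
 * (compressed) header block.  A router reads only this block and
 * forwards the rest of the frame untouched.
 */
#include <stdint.h>
#include <string.h>

struct routing_block {            /* hypothetical fixed-size layout */
    uint64_t session_id;
    uint16_t host_off, host_len;  /* point into a small cleartext region */
    uint16_t path_off, path_len;
};

static int
read_routing_block(const uint8_t *frame, size_t len, struct routing_block *rb)
{
    if (len < sizeof *rb)
        return (-1);
    memcpy(rb, frame, sizeof *rb);
    return (0);
}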

I think it would be illuminating to hear from some people who have
sites with 1) very high traffic (big enough to care about high-speed
request parsing and routing), 2) complicated enough load balancing
needs that they're doing L7 rather than L4 load balancing, and 3)
simple enough load balancing needs that the fields outlined in section
9.1 would suffice.  Those folks may have some good empirical data to
help show whether there's a general need for an "HTTP-router-friendly"
optimization in HTTP/2.0.

-Brian
