
Re: HTTP/2 Expression of luke-warm interest: Varnish

From: Roberto Peon <grmocg@gmail.com>
Date: Tue, 17 Jul 2012 00:50:36 -0700
Message-ID: <CAP+FsNetP9uhjrNXwXkJgZNGXMbbj0U8WfJfbyHK6jBF0TnKrw@mail.gmail.com>
To: Poul-Henning Kamp <phk@phk.freebsd.dk>
Cc: HTTP Working Group <ietf-http-wg@w3.org>, Brian Pane <brianp@brianp.net>
On Jul 17, 2012 12:10 AM, "Poul-Henning Kamp" <phk@phk.freebsd.dk> wrote:
>
> In message <CAAbTgTs78uxYKuc0LFDLG1M+yq4xS2eL1UAL13dtO9g3rHsq2A@mail.gmail.com>,
> Brian Pane writes:
>
> >My key takeaway from your draft is that you're advocating for HTTP/2.0
> >to have a wire format that can be parsed with efficiency much closer
> >to TCP [...]
>
> Yes, at least for the routing envelope.
>
> The metadata and content should probably be squeezed as hard as we
> reasonably can.
>
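[A minimal sketch of what "parseable like TCP" could mean for the routing envelope: a few fixed-offset binary fields the router can read without scanning text. The field layout below is invented purely for illustration; it is not the draft's actual wire format.]

```python
# Hypothetical fixed-layout routing envelope, TCP-header style:
# the router unpacks fixed-offset fields instead of parsing text.
import struct

# Invented layout: 4-byte stream id, 2-byte flags, 2-byte host-table index.
ENVELOPE = struct.Struct("!IHH")

def parse_envelope(buf: bytes) -> dict:
    """Read the fixed-size envelope from the front of a frame buffer."""
    stream_id, flags, host_index = ENVELOPE.unpack_from(buf)
    return {"stream_id": stream_id, "flags": flags, "host_index": host_index}
```
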
> >Section 9.1 defines a very small set of message components that are
> >available to the "HTTP router" in an efficient representation.
>
> This is one of the places where I've changed my mind a fair bit
> since I wrote this, in particular with respect to URI, and cookies,
> but the general idea still holds.
>
> >e.g., routing traffic based on cookies that
> >persisted across sessions, routing based on query parameters, and even
> >routing based on custom headers inserted by a previous hop in a
> >multi-tier hierarchy.
>
> (Cookies are gone in my vision of HTTP/2.0 by now; instead you'll
> have a client-created, client-supplied session identifier.)
>
> Yes, there are times and places where you need more detailed
> routing, and if so, you'll have to pay the cost of taking it
> all apart.
>
> But a very large fraction of all HTTP routing is done on the "Host:"
> header alone, and if you add URI and session-id to that, I wouldn't be
> surprised if you have covered 90% of all HTTP routing based on the
> request.  (Obviously a lot of routing happens on the basis of server
> availability, but that is unaffected by the protocol.)
>
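[As an illustration of that ordering-sensitive Host-plus-URI routing, here is a deliberately naive sketch; the rule format and names are hypothetical, not from Varnish or any draft.]

```python
# Illustrative ordered rule table: first (host, path-prefix) match wins.
# Names are hypothetical; real proxies use far richer rule languages.

def make_router(rules):
    """rules: list of (host, path_prefix, backend), checked in order."""
    def lookup(host, path):
        for rule_host, prefix, backend in rules:
            if host == rule_host and path.startswith(prefix):
                return backend
        return None  # no rule matched
    return lookup
```

Because the rules are checked in order, swapping the "/api/" rule below its catch-all sibling silently changes behavior, which is exactly why sites are reluctant to touch these tables.
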
> There is a significant gain to be had for all routing, including the
> more complex kind you mention, if the multiplex-ID/stream-ID can
> double as a flow-label for routing, so that the router only has to
> examine the first request on each stream and let all subsequent
> requests follow the same decision.
>
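[The flow-label idea sketched in code: the router pays the full routing cost only for the first request on a stream, then pins that decision for every later request carrying the same stream id. All names here are hypothetical.]

```python
# Sketch of stream-ID-as-flow-label routing: the expensive routing
# function runs once per stream; later requests on that stream reuse
# the pinned backend. Hypothetical, not any real router's API.

class FlowRouter:
    def __init__(self, route_fn):
        self.route_fn = route_fn   # full (expensive) routing decision
        self.pinned = {}           # stream_id -> backend
        self.full_parses = 0       # how often we paid the full cost

    def route(self, stream_id, envelope):
        backend = self.pinned.get(stream_id)
        if backend is None:
            self.full_parses += 1
            backend = self.route_fn(envelope)
            self.pinned[stream_id] = backend
        return backend
```
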
> One further optimization from there could be to send the envelope
> only when you open a new stream/mux-channel, but I'm not sure
> whether that would be a net pessimization or optimization.
>
> >I think it would be illuminating to hear from some people who have
> >sites with 1) very high traffic (big enough to care about high-speed
> >request parsing and routing),

I have experience with such. Mapping URL space and determining routing is a
huge pain in the posterior, but the problem is more social than
technological. :/

Parsing the same crud over and over again (and, more specifically, doing
the memory management associated with it) is the most bothersome component
at this layer. Being informed only when things change is probably a better
idea, and one which will be fun and interesting to pursue.
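[One way to picture "only do work when things change": cache the parsed form of a header block keyed on its raw bytes, so a repeated identical block skips both the parsing and the per-request allocations. This is a simplified illustration, not a proposal for the wire format.]

```python
# Sketch: memoize header parsing on the raw bytes, so identical header
# blocks are parsed (and allocated) once. Hypothetical and simplified.

def make_header_cache():
    cache = {}               # raw bytes -> parsed (immutable) form
    stats = {"parses": 0}

    def parse(raw: bytes):
        parsed = cache.get(raw)
        if parsed is None:
            stats["parses"] += 1
            parsed = tuple(
                tuple(line.split(b": ", 1))
                for line in raw.split(b"\r\n") if line
            )
            cache[raw] = parsed
        return parsed, stats["parses"]
    return parse
```
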

>
> I'd much rather ask a VHDL/Verilog wizard: "How would you
> route HTTP requests at 1Tbit/s in an FPGA ?"

The base problem isn't the technology: I bet that we could get something
in software going at line rate, limited mainly by bus speeds, if the
configuration space for the mapping (call it route discovery) were simple
enough.

The holdup is that users have bookmarks, external links, etc., and so
sites are reasonably reluctant to change their (unfortunately complex and
potentially order-dependent) rule mappings when doing so might lose them
traffic.

Even ignoring bookmarks and other links, it is still problematic to get a
site to change its mappings, because it will have to support both the
HTTP/2 stuff and the HTTP/1 stuff. Sites are not likely to do it; it would
take a massive effort to make such a thing happen.

It gets worse when one considers pieces of hardware which are not
upgradable and have URL space hardwired into their firmware. One must wait
until most such devices and appliances have reached their end-of-life, and
pray that none have been replaced by similar designs with a different
fixed URL space.

So, the problem is that we're stuck with this for the foreseeable future, I
believe.

We could define a new approach that was useful separately by encoding
some other, non-human-readable information into the URL or equivalent,
but... we could already do that today and don't.

If we wish to improve this part of the problem, the question is why we
don't bother to use mechanisms that are already enabled by today's
tech/feature set.
Any ideas as to how to reprogram site designers/authors? :)

>
> The routable envelope is meant to be a hedge for the future: We
> might be able to route 10 or even 40 Gbit/s on today's silicon, but
> if we want to do 1 Tbit/s on tomorrow's silicon, I am sure we need
> to make the task easier, because the silicon isn't getting faster,
> it's just getting wider.
>
> But as I hope is clear:  This was intended as food for long-range
> thought, not as a concrete proposal.
>
> --
> Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
> phk@FreeBSD.ORG         | TCP/IP since RFC 956
> FreeBSD committer       | BSD since 4.3-tahoe
> Never attribute to malice what can adequately be explained by incompetence.
>
Received on Tuesday, 17 July 2012 07:51:10 GMT
