Re: Rechartering HTTPbis from Patrick McManus on 2012-01-26 (ietf-http-wg@w3.org from January to March 2012)

From: Patrick McManus <pmcmanus@mozilla.com>
Date: Thu, 26 Jan 2012 09:04:48 -0500
To: Willy Tarreau <w@1wt.eu>
Cc: ietf-http-wg@w3.org
Message-ID: <1327586688.2052.386.camel@ds9>
[this is a resend because of an address-must-be-subscribed-to-post filter. Apologies if it hits the list twice]

Lots of great ideas on the list - good to see!

On Wed, 2012-01-25 at 14:58 +0100, Willy Tarreau wrote:
> On Wed, Jan 25, 2012 at 08:54:20AM -0500, Patrick McManus wrote:
> > On Wed, 2012-01-25 at 09:12 +0000, Poul-Henning Kamp wrote:
> > > , for instance
> > >      by putting a blank line between them.
> > 
> > and another smuggling attack is born :)
> 
> Not necessarily if the places are mandatory. You must always have exactly
> 2 blank lines in a request and the problem is over.
> 

Of course, depending on what was added, you may receive that injected
data and interpret it as a correct request with exactly 2 blank lines
followed by a bunch of garbage (or perhaps even a second valid request
if the smuggle was particularly good). Things separated by newlines (or
CRLF, or maybe LFCR) are just not a robust enough design. (and of course
implementations should not allow injections of sentinels - but it
happens all the time which is reason enough to not base your framing
around them).
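
To make the failure mode concrete, here is a minimal sketch in Python.
The parser and request builder are hypothetical (they are not taken from
any real implementation); the only assumption is that a message ends at
a blank line and that one field value reaches the wire unchecked.

```python
def split_messages(stream: bytes) -> list[bytes]:
    """Hypothetical sentinel-framed parser: a blank line ends each message."""
    return [part for part in stream.split(b"\r\n\r\n") if part]


def build_request(path: str, user_agent: str) -> bytes:
    """Hypothetical sender that forgets to reject CR/LF in a header value."""
    return (
        b"GET " + path.encode() + b" HTTP/1.1\r\n"
        + b"User-Agent: " + user_agent.encode() + b"\r\n\r\n"
    )


# An attacker-controlled value that carries its own blank line.
injected = "x\r\n\r\nGET /admin HTTP/1.1\r\nHost: internal"
wire = build_request("/", injected)
messages = split_messages(wire)

# The sender built one request; the receiver sees two well-formed ones.
for message in messages:
    print(message.split(b"\r\n")[0])
```

Nothing on the receiving side can tell the second request from a
legitimate pipelined one, because the delimiter and the data share the
same alphabet.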

And of course users of text-based protocols will insist on 'be liberal
in what you receive' and the problem is back in full. Heck - see
https://bugzilla.mozilla.org/show_bug.cgi?id=719256 .. in that bug I'm
asked to be liberal in receiving a *negative content-length*.

The question users ask is not "is that disallowed by the spec", it is
"is there a way to possibly squint hard enough and see the data I want
to see - if so, do it." We all know that.

The binary length delimiters used in SPDY (for header names, values, and
chunk lengths) are great because they are unambiguous as well as MUCH
cheaper to parse. I'm obviously working on clients now, but when I was
doing high volume intermediaries I spent way too many CPU cycles
scanning the input stream looking for various forms of newlines, colons,
quotation marks, and parsing request headers - I love what a quick
binary length index does to improve that problem, but the real advantage
is that the framing is much more resilient.
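
A rough sketch of what length-prefixed header framing looks like, in
Python. This is only loosely modeled on the idea described above: the
32-bit big-endian count and length fields, and the function names, are
assumptions for illustration and not a statement of the SPDY wire
format.

```python
import struct


def encode_headers(headers: list[tuple[bytes, bytes]]) -> bytes:
    """Encode pairs as: count, then (length, bytes) for each name and value."""
    out = [struct.pack("!I", len(headers))]
    for name, value in headers:
        for field in (name, value):
            out.append(struct.pack("!I", len(field)))
            out.append(field)
    return b"".join(out)


def decode_headers(buf: bytes) -> list[tuple[bytes, bytes]]:
    """Decode by jumping from length to length; no byte is ever scanned."""
    (count,) = struct.unpack_from("!I", buf, 0)
    offset = 4
    headers = []
    for _ in range(count):
        pair = []
        for _ in range(2):
            (length,) = struct.unpack_from("!I", buf, offset)
            offset += 4
            if offset + length > len(buf):
                raise ValueError("field length runs past end of buffer")
            pair.append(buf[offset:offset + length])
            offset += length
        headers.append((pair[0], pair[1]))
    return headers


# A value containing CRLF CRLF is just opaque bytes to this framing.
original = [(b"user-agent", b"x\r\n\r\nGET /admin HTTP/1.1"), (b"host", b"a")]
assert decode_headers(encode_headers(original)) == original
```

The decoder never searches for a delimiter, so injected newlines in a
value cannot move a message boundary, and the cost of finding each field
does not depend on the field's contents.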

I'm unmoved by the inevitable arguments that it's too hard to manage a
binary protocol in <language of your choice>. We already have serious
counter examples in node.js, python, and ruby.

Here are a couple of other fun classes of bugs that we can avoid
repeating by being less expressive on such basic items as delimiters:

parser doesn't consider the infinite range of length values expressed as
text (I've seen this one a half dozen times over the years) --
http://services.netscreen.com/documentation/signatures/HTTP%3AOVERFLOW%3ACHUNK-LEN-OFLOW.html
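
For contrast, this is roughly what a text length parser has to do to be
safe. It is a hedged sketch in Python: the function name and the
2**31 - 1 cap are assumptions, and chunk extensions are deliberately not
handled.

```python
MAX_CHUNK = 2**31 - 1  # assumed cap; a real limit is a policy decision
HEX_DIGITS = frozenset(b"0123456789abcdefABCDEF")


def parse_chunk_size(field: bytes, max_size: int = MAX_CHUNK) -> int:
    """Parse a hex chunk size strictly, rejecting anything ambiguous."""
    # int() alone would accept "-1", "0x1f", "1_0" and padded whitespace.
    if not field or any(byte not in HEX_DIGITS for byte in field):
        raise ValueError("chunk size must be one or more hex digits")
    # Bound the digit count first so very long inputs are rejected cheaply.
    if len(field.lstrip(b"0")) > len(f"{max_size:x}"):
        raise ValueError("chunk size has too many digits")
    size = int(field, 16)
    if size > max_size:
        raise ValueError("chunk size exceeds configured maximum")
    return size


for candidate in (b"1a", b"-1", b"FFFFFFFFFFFFFFFF", b"0x10"):
    try:
        print(candidate, "->", parse_chunk_size(candidate))
    except ValueError as error:
        print(candidate, "rejected:", error)
```

A fixed-width binary length has none of these cases: it has no sign
syntax, no alternate spellings, and its range is bounded by its width.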

complicated chunk-length definition with extension lists leads to bugs
in parsers -- 
http://lists.ximian.com/pipermail/mono-bugs/2010-April/099471.html

But seriously, this is just a minor sideshow. SPDY's header handling and
message delimiting have simply proven themselves to me in practice -
cheap and reliable to work with, and compression rates >= 90% using
small memory-friendly windows. It even has a mechanism for
connection-wide headers (the settings frame). Compared to HTTP/1 it is a
real pleasure. Those bits are not the interesting part of the work in
front of us imo.

The interesting parts to me are transaction (re-)prioritization, push
semantics (and caching related issues), flow control, good rules around
IP pooling especially involving credentials, caching, bandwidth
utilization and congestion responsiveness (especially against real time
flows), and an upgrade/alternate-protocol mechanism from plaintext HTTP.

-Patrick
Received on Thursday, 26 January 2012 14:05:36 UTC