Re: #250 / #251 (connect bodies) from Willy Tarreau on 2010-10-28 (ietf-http-wg@w3.org from October to December 2010)

From: Willy Tarreau <w@1wt.eu>
Date: Thu, 28 Oct 2010 09:23:07 +0200
To: Mark Nottingham <mnot@mnot.net>
Cc: Adam Barth <w3c@adambarth.com>, Julian Reschke <julian.reschke@gmx.de>, Adrien de Croy <adrien@qbik.com>, HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <20101028072307.GC20369@1wt.eu>
On Thu, Oct 28, 2010 at 06:06:26PM +1100, Mark Nottingham wrote:
> 
> On 28/10/2010, at 5:33 PM, Adam Barth wrote:
> >   REQ. 7:  The WebSocket protocol MUST allow HTTP and WebSocket
> >      connections to be served from the same port.
> > 
> > http://tools.ietf.org/html/draft-ietf-hybi-websocket-requirements
> 
> Presence in a requirements document doesn't guarantee that any given feature will get through. Considering the hoops you're having to jump through to meet this requirement (does the reqs document have consensus, by the way?), it's a wonder it's still considered relevant.
> 
> In other words, who actually *wants* this?

I'd pose the question the other way around: how could we replace existing
long-polling mechanisms, which already work over HTTP and benefit from all
the features that have been built around HTTP, with something on a different
port? Virtual hosting, the ability to be proxied any number of times,
including through URL filters (think about schools), session state
management and server stickiness are all essential to web applications
nowadays. If we use a distinct port, we lose most of these existing
features, which will considerably lower adoption.

> > From CONNECT, specifically, we'd like to get WebSocket-oblivious
> > intermediaries to ignore the rest of the bytes on the socket.  CONNECT
> > lets them know that the rest of the bytes coming from the client
> > aren't going to be HTTP, so they shouldn't try to understand the
> > semantics of those bytes.
> 
> Explicitly configured forward proxies -- as opposed to "reverse" proxies, interception (a.k.a. "transparent") proxies, L7 load balancers, firewalls and the like -- will indeed do this. Whether any of the rest will is anyone's guess, and depends upon how they're coded, whether they're ever used as a forward proxy, and how generous / clued-up the administrator is. 

Pure reverse proxies will probably not work, just as they currently don't
work with 101 either. Firewalls and load balancers do support explicit
proxies (those were the first users of load balancers) and as such are
well aware of the methods used in such environments. There's still the
common issue of Proxy-Connection vs Connection, but there's already a bug
open on the subject.

> > If everyone perfectly implemented HTTP-as-specified, we could use
> > Upgrade for that purpose because that's what Upgrade is supposed to
> > mean also.  Unfortunately, Upgrade is extremely rare in
> > HTTP-as-she-be-spoke, which means many of these intermediaries think
> > that the subsequent bytes sent by the client have HTTP semantics.
> 
> How does that follow?
> 
> Upgrade is an optional mechanism. The client requests upgrade, and the server decides whether or not it's upgrading the protocol; if it does, it uses a 101 Switching Protocols response.
> 
> That, however, is predicated upon the presence of the Upgrade header in the request. Upgrade is hop-by-hop, which means that it'll be removed by well-behaved intermediaries, unless they understand the protocol being upgraded to. 

There are more stupid implementations, and one of them was mine. The first
incarnations of haproxy did not interpret status codes and just passed them
on to the other side. However, I'd say that it did not support keep-alive
either. Recently I fixed it to support 1xx responses and was then contacted
because this broke websocket (I had treated 101 as equivalent to 100). This
last specific case is what saves us after all.

After saying that, I realize that maybe we're looking at the issue the wrong
way:
  - if an implementation does not correctly handle 1xx, it cannot support
    keep-alive, so I don't see how we could have a security issue

  - if an implementation treats 101 like 100, it will break at the end of
    the handshake, since what follows won't look like an HTTP status anymore.

  - if it supports 101, then fine.
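
To make the last two cases concrete, here is a minimal sketch in Python of
how an intermediary might decide what to do after seeing a 1xx status line.
It is only an illustration under my own assumptions: the names
parse_status and relay_decision, and the treats_101_as_100 flag, are
hypothetical and do not describe haproxy or any other real product.

```python
import re

# Matches an HTTP/1.x status line once the trailing CRLF has been stripped.
STATUS_LINE = re.compile(rb"HTTP/1\.[01] (\d{3})(?: .*)?")


def parse_status(line: bytes):
    """Return the status code if `line` is an HTTP/1.x status line, else None."""
    m = STATUS_LINE.fullmatch(line.rstrip(b"\r\n"))
    return int(m.group(1)) if m else None


def relay_decision(status: int, next_line: bytes, treats_101_as_100: bool) -> str:
    """Decide what a hypothetical intermediary does after a response status.

    `next_line` is the first line the server sends after the response head.
    """
    if status == 101 and not treats_101_as_100:
        # Correct 101 handling: stop parsing HTTP and relay raw bytes.
        return "tunnel"
    if 100 <= status < 200:
        # Interim response: a final HTTP status line is expected next.
        # An implementation that treats 101 like 100 ends up here and fails
        # as soon as the post-handshake bytes are not an HTTP status line.
        return "continue" if parse_status(next_line) is not None else "error"
    return "final"


if __name__ == "__main__":
    frame = b"\x00hello\xff"  # arbitrary non-HTTP bytes after the handshake
    print(relay_decision(101, frame, treats_101_as_100=False))
    print(relay_decision(101, frame, treats_101_as_100=True))
    print(relay_decision(100, b"HTTP/1.1 200 OK\r\n", treats_101_as_100=True))
```

In this sketch the "101 as 100" case fails closed: the connection breaks
instead of the later bytes being interpreted as further HTTP messages.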

> For the remainder (i.e., the not-well-behaved intermediaries), in my testing setting the appropriate Connection: header catches pretty much all of them.
> 
> E.g.,
> 
> POST /foo HTTP/1.1
> Host: example.com
> Upgrade: WebSockets/5.0
> Connection: Upgrade
> 
> will catch almost every deployment (certainly more than could be characterised as "extremely rare").  The ones that aren't can be worked around, because they're only deployed server-side (e.g., Pound, nginx).

That was my observation too.
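
As an illustration of the hop-by-hop point, here is a minimal Python sketch
of what a well-behaved intermediary that does not understand the upgraded
protocol would do with the request quoted above. The function name
strip_hop_by_hop and the header representation are my own assumptions, and
the fixed set is roughly the hop-by-hop list from RFC 2616 section 13.5.1;
it is not taken from any particular implementation.

```python
# Headers that are hop-by-hop by definition (roughly RFC 2616 section 13.5.1).
HOP_BY_HOP = {
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailer", "trailers", "transfer-encoding", "upgrade",
}


def strip_hop_by_hop(headers):
    """Return the headers a hypothetical intermediary would forward.

    `headers` is a list of (name, value) pairs. Anything named in a
    Connection header is treated as hop-by-hop as well.
    """
    listed = set()
    for name, value in headers:
        if name.lower() == "connection":
            for token in value.split(","):
                token = token.strip().lower()
                if token:
                    listed.add(token)
    drop = HOP_BY_HOP | listed
    return [(n, v) for n, v in headers if n.lower() not in drop]


if __name__ == "__main__":
    request = [
        ("Host", "example.com"),
        ("Upgrade", "WebSockets/5.0"),
        ("Connection", "Upgrade"),
    ]
    # Only the Host header survives, so the server never sees the upgrade
    # request and cannot answer with 101.
    print(strip_hop_by_hop(request))
```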

> >  Of
> > course, in WebSockets, the attacker is given great latitude to select
> > bytes-on-the-wire after the handshake completes, which means the
> > attacker's script running in a browser can craft bytes that look a lot
> > like HTTP but avoid the security invariants the browser usually
> > imposes on script code running on behalf of a web site.
> 
> You're missing the point here -- a box that isn't expecting and explicitly handling CONNECT will also think that HTTP semantics apply beyond the end of the message. 

But do we have any chance of spotting such an implementation? That was one
of my concerns in the early days when I too proposed CONNECT, but if the
implementation is server-side only, it generally does not support CONNECT,
which is fine. And if it supports relaying to proxies, it will support
CONNECT with correct semantics.

Regards,
Willy
Received on Thursday, 28 October 2010 07:23:45 UTC