Re: websocket HTTP response parsing

On Mon, 7 Jul 2008, Julian Reschke wrote:
> > 
> > We need a handshake that can guarantee that we've contacted a 
> > WebSocket server and not some other server that's being tricked into 
> > sending data that looks like the handshake. For that purpose, we use 
> > the headers that come before the server has to echo anything, and we 
> > make that as long as possible (and as unique as possible) by moving 
> > both those headers there.
> 
> I don't understand this point. Could you (or somebody else) please 
> elaborate?

Which part don't you understand, the need for a handshake, the requirement 
that the handshake not be one that an existing server can be tricked into 
sending, or the description of the currently defined handshake?


> > > > > Also, did you consider the impact of intermediates in the 
> > > > > request path?
> > > >
> > > > You mean, like proxies? Sure, the spec defines how to handle 
> > > > those.
> > >
> > > What if they somehow modify the reason phrase? It would be outside 
> > > the control of the server, and communication would break.
> > 
> > Any proxy that did that would be deeply violating HTTP rules and would 
> > also be breaking TLS tunnels.
> 
> Which HTTP rule would be violated by a proxy rewriting the reason 
> phrase?

Wouldn't it be a violation of CONNECT semantics?

It seems to me that if the way of handling proxies defined in HTML5 for 
Web Socket doesn't work, then there's no way TLS could work either. Is 
that wrong? How would TLS work if the tunnel wasn't opaque?
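
To make the tunnel argument concrete, here is a rough sketch of how a client might open a CONNECT tunnel through a proxy. The function names are hypothetical and this is not text from the spec; it only assumes ordinary HTTP CONNECT behaviour, where any 2xx response means the tunnel is up and the proxy is expected to relay bytes untouched from then on (which is also what TLS depends on).

```python
def build_connect_request(host: str, port: int) -> bytes:
    """Build the CONNECT request a client sends to the proxy (hypothetical helper)."""
    authority = f"{host}:{port}"
    return f"CONNECT {authority} HTTP/1.1\r\nHost: {authority}\r\n\r\n".encode("ascii")


def tunnel_established(proxy_response: bytes) -> bool:
    """Return True if the proxy answered CONNECT with a 2xx status code.

    Only the status code is inspected here; the proxy's own reason phrase
    is ignored. After a 2xx, everything on the connection is end-to-end.
    """
    status_line = proxy_response.split(b"\r\n", 1)[0]
    parts = status_line.split(b" ", 2)
    if len(parts) < 2 or not parts[0].startswith(b"HTTP/"):
        return False
    code = parts[1]
    return len(code) == 3 and code.isdigit() and code.startswith(b"2")
```

Once tunnel_established() returns True, the handshake bytes from the far end should arrive exactly as the server sent them, reason phrase included.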


> Let's just agree that we disagree on this.

I'm having this discussion because I assume there's a problem in the spec 
that needs fixing. Please, don't raise points that don't need fixing; 
there are literally thousands of e-mails still to reply to and we'll never 
resolve everything if people are raising things that don't need changing!


> > It seems better to use whatever HTTP uses in its descriptions than go 
> > ahead and make up our own stuff for no reason. If we had a good reason 
> > to use something else then sure, but here we really don't. Would you 
> > rather have the status text be something else, like "101 WebSocket 
> > Handshake"?
> 
> Actually yes -- because it would ensure that somebody deploying this 
> actually has control over the reason phrase, which may not always be 
> true.

Ok. Fixed.


> I'd prefer though to use HTTP semantics (making the text of the reason 
> phrase irrelevant), or to use an empty reason phrase.

Making it irrelevant would make it easier to smuggle this over another 
protocol, which is what we're trying to avoid. It would also increase the 
odds of bugs in clients.

Using an empty reason phrase just seems ugly.
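
As an illustration of the difference being argued over, here is a minimal sketch contrasting a byte-for-byte check of the status line with an HTTP-style check that ignores the reason phrase. The constant below is only a placeholder; the real bytes are whatever the spec mandates, and the function names are made up for this example.

```python
# Placeholder value for illustration only; not necessarily the spec's exact text.
EXPECTED_STATUS_LINE = b"HTTP/1.1 101 Web Socket Protocol Handshake\r\n"


def strict_check(received: bytes) -> bool:
    """Accept only if the response starts with the exact expected bytes."""
    return received.startswith(EXPECTED_STATUS_LINE)


def lenient_check(received: bytes) -> bool:
    """HTTP-style check: only the status code matters, the reason phrase is ignored."""
    status_line = received.split(b"\r\n", 1)[0]
    parts = status_line.split(b" ", 2)
    return len(parts) >= 2 and parts[0].startswith(b"HTTP/") and parts[1] == b"101"
```

With the lenient check, any response carrying a 101 status code passes, whatever text follows it; with the strict check, a server has to produce the whole line exactly, so there are more bytes an unrelated server would need to be tricked into sending.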


> > > > Why is using the HTTP spec's recommended text a problem?
> > > The HTTP spec doesn't recommend any specific text. The reason phrase 
> > > is for human consumption only. For instance, there are servers known 
> > > to vary it by Accept-Language or locale.
> > 
> > The section defining code 101 is "10.1.2 101 Switching Protocols". So 
> > that's what we use here. Seems reasonable to me.
> 
> It's one reasonable string, but again (and again...): the HTTP spec does 
> not give it any semantics, and servers may vary in what they send, for 
> instance because of the locale of the server, or Accept-Language.

Well, now they have a particular handshake string they have to send for 
this case.


> > > There's simply no point in putting it in, so why do it?
> > 
> > This seems like an unbelievably minor point of basically no importance 
> > whatsoever. Am I missing some key way in which this actually matters? 
> > If not, then it seems pointless to be arguing over it.
> 
> In addition to the reasons I gave (and that you don't seem to care 
> about), it's a strange way to design a protocol by making a plain 
> English text fragment part of the exchange.

As opposed to the rest of the English stuff like "HTTP" and "Web" and 
"Socket" and "Connection" and "Upgrade" and so on?


> - the reason phrase has no semantics in HTTP, so it's a bad idea to make 
> something look like HTTP but then make the exact byte sequence in the 
> reason phrase important

Why? You keep saying it's a bad idea, but it seems like a good thing to 
me. It makes the handshake more resilient. Why does it cause a problem?


> - servers/libraries may not allow sending arbitrary reason phrases

Servers/libraries will have to be upgraded to handle WebSocket anyway, so 
this isn't a big deal.


> - servers/libraries may vary reason phrase based on locale information, 
> so you may be testing something in the US, and it works, and later it 
> will fail when the software is deployed somewhere else

The failure mode is fatal here, so that's fine; it would be caught early. 
Indeed, now that we have a hard-coded unique string, implementors would 
have to go out of their way to translate it, so this seems unlikely.


> - intermediates could rewrite the reason phrase

Given the semantics described in the spec, I don't see how this is 
possible.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Monday, 7 July 2008 09:25:52 UTC