Re: Feedback on WebSocket API, Editor's Draft 13 November 2009.

On Tue, 8 Dec 2009, Sebastian Andersson wrote:
> On Sat, Dec 5, 2009 at 09:53, Ian Hickson <ian@hixie.ch> wrote:
> > On Fri, 4 Dec 2009, Sebastian Andersson wrote:
> >> How would the policy file and the service not be under the same control?
> >
> > In a shared hosting environment, typically port 80 is a virtual hosted 
> > HTTP server, and the ports above 1024 are under the control of the 
> > first user to bind to it. So ports 1024 and 1025 can be under the 
> > control of different users. If the policy file is served from port 
> > 1024, it can allow script on port 80 from one virtual host to access 
> > the service on port 1025 intended for scripts of pages of another 
> > virtual host on port 80.
> 
> Just like the firewall would have to be opened up for the new service's 
> port, so would the policy file have to be updated. That is a job of the 
> service provider, not of any of the users and as I already wrote, that 
> is an administrative issue, not a technical one.

Right now, today, if I were to expose a WebSocket service on my Dreamhost 
shared server, I could do so without any trouble. If we used a policy-file 
scheme such as the one described above, anyone else hosting a site on the 
same server could grant scripts on their own pages access to my service. 
Whether this is a technical or administrative issue, it's an issue we have 
to handle today.
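
To make the shared-hosting case concrete, here is a minimal sketch of it. 
Everything in it is hypothetical: the user names, the origin, and the 
policy format are invented for illustration and are not taken from Flash 
or from any proposal in this thread.

```python
# Hypothetical model of a per-host policy file on a shared server.
# User names, origins, and the policy format are illustrative assumptions.

# Which user bound which port on the shared machine.
port_owner = {80: "hosting-provider", 1024: "mallory", 1025: "alice"}

# Policy served from port 1024, so written by whoever bound that port.
# It maps a target port to the origins allowed to connect to it.
policy_from_port_1024 = {1025: {"http://mallory.example"}}


def ua_allows(origin: str, target_port: int, policy: dict) -> bool:
    """Return True if a policy-file-driven UA would permit the connection."""
    return origin in policy.get(target_port, set())


def owner_consented(target_port: int, policy_port: int) -> bool:
    """The service owner only consented if they also control the policy."""
    return port_owner[target_port] == port_owner[policy_port]


allowed = ua_allows("http://mallory.example", 1025, policy_from_port_1024)
consented = owner_consented(1025, 1024)
print(allowed, consented)  # True False
```

The UA grants access to the service on port 1025 even though the user who 
runs that service never agreed to it.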


> >> > If we assume a round-trip time of 120ms, that means that opening a 
> >> > WebSocket connection takes 360ms rather than the 120ms it takes 
> >> > with the security model in the spec today. That's a lot of extra 
> >> > latency.
> >>
> >> One could also do both connections at the same time, but not open the 
> >> socket to the application until the policy file has been read. Or 
> >> simply use a cache.
> >
> > That seems more complex than necessary.
> 
> It would probably be among the simplest of the code in a browser that is 
> able to render html5 and run javascript.

Sure, but that doesn't mean it's not more complex than necessary.


> >> >> I don't know if the current player caches the result, but that 
> >> >> could be added.
> >> >
> >> > Then you increase the race condition I mentioned from being merely 
> >> > a few milliseconds to being whatever the time-to-live of the policy 
> >> > file is.
> >>
> >> Since I fail to see the attack scenario, I fail to see the race 
> >> condition.
> >
> > The race condition is a separate issue. The race condition is what 
> > happens when the policy changes between the time that the policy file 
> > is read and the connection is established. Consider for instance the 
> > window between an error being found in a policy file and the policy 
> > being fixed. If the policy has a 24 hour time-to-live, then the site 
> > is automatically vulnerable for up to 24 hours for some users.
> 
> I still don't see an attack scenario being described here. Yes, 
> administrative changes will take some time to propagate when caches are 
> used if the object has been cached. Just like DNS, some CDNs, reverse 
> proxies etc., that is hardly something new for an administrator.
> 
> The policy file is only one of many access control mechanisms and even 
> if it is incorrectly written, it would still take special malicious code 
> to create a vulnerability, the firewall would have to allow access to 
> the port

There's no firewall in a shared hosting environment.


> and the service's access control mechanisms would have to allow the 
> connection and the browser would somehow get to run the malicious code 
> from an origin that was listed in the policy file.

If we weren't worried about malicious code from an origin that was 
(mistakenly) listed in the policy file, then we wouldn't have a problem. 


> Of course there is an opportunity, but is it a big risk?

IMHO, yes. I understand that security is a tradeoff between risk and 
reward, but fundamentally, it seems to me that it is far off the 
"risk/reward" scale to design a system in such a way that a security 
problem cannot be fixed when it is discovered. Having the user agent cache 
the security policy and act on that cached version even after a security 
problem has been found prevents security problems from being fixed in a 
timely manner.
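
As a rough sketch of the size of that window, assuming the simplest 
possible cache model (one fetch, a fixed time-to-live, no revalidation), 
which is my assumption and not how any particular player is known to 
behave:

```python
DAY = 24 * 60 * 60  # seconds


def exposure_seconds(fetched_at: float, ttl: float, fixed_at: float) -> float:
    """Seconds for which a UA keeps acting on the old policy after the fix.

    fetched_at: when the UA cached the (broken) policy
    ttl:        time-to-live of the cached copy
    fixed_at:   when the administrator corrected the policy on the server
    """
    expires_at = fetched_at + ttl
    return max(0.0, expires_at - fixed_at)


# A user who cached the broken policy one second before the fix went live
# keeps using it for almost the whole time-to-live.
worst_case = exposure_seconds(fetched_at=3599, ttl=DAY, fixed_at=3600)

# A user whose cached copy had already expired is not exposed at all.
no_exposure = exposure_seconds(fetched_at=0, ttl=DAY, fixed_at=2 * DAY)
```

With a 24 hour time-to-live the worst case is one second short of 24 
hours, and nothing the administrator does on the server can shorten it.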


> >> >>> What's wrong with the way WebSocket does it?
> >> >>
> >> >> Many custom protocols are based on a length field followed by a 
> >> >> protocol frame. With WebSocket it is possible to connect to such a 
> >> >> service and be connected for a while until a timeout happens and 
> >> >> thus a DoS attack against such a service would be trivial to 
> >> >> write.
> >> >
> >> > Interesting. Do you have any examples of such services that I could 
> >> > study? If we could mitigate that, that would be very useful. Are 
> >> > these services that send no traffic until they've received the 
> >> > first packet?
> >>
> >> MS-RPC and CIFS are such I believe.
> >
> > Interesting. I shall study these, thanks.

I spoke to one of the Samba developers about this issue. He pointed out 
that such attacks are already possible today without Web Sockets: any 
page, even an unscripted one, can include an <img> element that points to 
a CIFS or MS-RPC port. That results in a similar HTTP request being sent, 
and the server not returning any data until the "fake" CIFS or MS-RPC 
packet terminates (which it never will, as you point out). He said that in 
his opinion, this was a problem that CIFS and MS-RPC implementors would 
simply have to deal with themselves, regardless of Web Sockets.

His analysis seems correct. I have therefore not changed the protocol.
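
For illustration, here is a sketch of why a length-prefixed service stalls 
on such a request. It assumes a hypothetical service whose frames start 
with a four-byte big-endian length; that is a simplification and not the 
actual CIFS or MS-RPC wire format. The request text is likewise made up.

```python
import struct


def declared_frame_length(first_bytes: bytes) -> int:
    """Interpret the first four bytes as a big-endian unsigned length."""
    (length,) = struct.unpack(">I", first_bytes[:4])
    return length


# Any HTTP-style request, whether from <img> or a Web Socket handshake,
# starts with the ASCII bytes "GET ".
request = b"GET /demo HTTP/1.1\r\nHost: example.com\r\n\r\n"

length = declared_frame_length(request)
# Bytes the service still expects after the whole request has arrived.
remaining = length - (len(request) - 4)

print(length)  # 1195725856
```

The service reads "GET " as a frame of over a gigabyte and waits for the 
rest of it, which never arrives, so the connection is held open until the 
service's own timeout fires.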


> >> At least one can send the example header from the RFC to them without 
> >> being disconnected nor getting an answer.
> >
> > We may be able to do something about MS-RPC and CIFS specifically, but 
> > it's not clear that it's possible to have a general solution for the 
> > problem of servers that don't respond immediately, since in general 
> > they're indistinguishable from a slow network.
> 
> The general solution is to have an opt-in system like flash's policy 
> files.

This is not a possible solution if, as you suggest above, we send both 
requests simultaneously. It's also not possible in general: since HTTP 
can also be used for this attack, one can simply tell the UA that the 
policy file is on the victim port, and have _that_ stage DoS the server 
instead of the actual Web Socket connection.


> But let's take a realistic scenario. Let's assume we want to build an 
> IRC client as a web application.
> 
> With TCP and flash-policy files, the web client can connect to the IRC 
> server, just like all other clients. The channel operators can see the 
> IP numbers of the connected clients and kick/ban someone's IP number if 
> they are there to cause grief (which I've heard happens quite often). 
> These functions exist today and work quite well.
> 
> With the WebSocket protocol, there are two ways to implement such a 
> system. A proxy could be written. The channel operators would all see 
> the same IP number (the proxy's) and can't tell which client is causing 
> grief if he reconnects. There would have to be extra logic to the proxy 
> to allow it to report the real IP number to the IRC network. There would 
> have to be an access filter in the proxy that channel operators can 
> control. Sounds like a complex solution and one that would be a bit 
> expensive. I'm quite sure there will be plenty of implementations where 
> the IP number will not be passed on to the original service, leading to 
> higher administrative costs (two logs to read, two access lists to keep 
> in sync, two ports in the firewall to manage in the same way).
> 
> Another way would be to extend the IRC server with a second port that
> understands the WebSocket protocol. That is, given that the source is
> available (that is quite likely given IRC servers, but quite often
> unavailable for other services). Doesn't sound too cheap either.

Yes, that is true.

This is also not what Web Sockets is designed to address, so it's not 
surprising that it can't handle this case. It also isn't a good protocol 
for implementing a generic document retrieval system, or peer-to-peer data 
transfer, or doing video multicast. These are not failings per se; they 
just aren't use cases that Web Sockets was designed for.


> I expect the same two scenarios would come up in most cases where you 
> want to use a web client against an existing tcp service.

Absolutely. In fact, Web Sockets was specifically designed to make it 
impossible to connect to an existing TCP service, precisely to avoid 
situations such as poor policy file configurations allowing an SMTP 
server to be hijacked for sending spam.
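
A rough sketch of the idea follows. The check below is deliberately 
simplified and is not the handshake algorithm from the draft, whose rules 
are stricter; the function name and both sample replies are invented for 
illustration.

```python
def looks_like_handshake(response: bytes) -> bool:
    """Simplified check: status 101 plus an "Upgrade: WebSocket" header."""
    head, _, _ = response.partition(b"\r\n\r\n")
    lines = head.split(b"\r\n")
    status = lines[0].split(b" ")
    if len(status) < 2 or status[1] != b"101":
        return False
    headers = {}
    for line in lines[1:]:
        name, sep, value = line.partition(b":")
        if sep:
            headers[name.strip().lower()] = value.strip().lower()
    return headers.get(b"upgrade") == b"websocket"


# What an unmodified SMTP server might say to a connecting client.
smtp_reply = b"220 mail.example.com ESMTP\r\n"

# What a server that opted in to the protocol might say.
ws_reply = (
    b"HTTP/1.1 101 Web Socket Protocol Handshake\r\n"
    b"Upgrade: WebSocket\r\n"
    b"Connection: Upgrade\r\n"
    b"\r\n"
)

smtp_ok = looks_like_handshake(smtp_reply)
ws_ok = looks_like_handshake(ws_reply)
print(smtp_ok, ws_ok)  # False True
```

Because the UA drops the connection unless the server answers the 
handshake correctly, script never gets to exchange data with a service 
that was not written to speak the protocol.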


> The WebSocket protocol services would also be vulnerable to malicious 
> code and instead of just having one implementation in each web browser 
> that matches the origin with the port to see if a connection is allowed, 
> that code would have to be added to each WebSocket protocol service and 
> perhaps by less skilled programmers than the browser developers.

I don't understand the attack scenario here. Can you elaborate?


> The WebSocket protocol has a higher cost and it seems to me that the 
> risks are equal, but different. Given that flash already uses TCP and 
> policy files, the risk of reusing it would not increase since I don't 
> think flash is going away soon.

Flash is one of the main reasons I avoided policy files. While the 
vulnerabilities may not have been widely advertised, Flash policy files 
have in fact been the subject of _numerous_ attack scenarios.

Two widely-reported cases are:
   http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2007-6243
   http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2008-4822

It's probably possible to make a secure implementation of this strategy. 
It's just not as simple or as easy as the direct approach Web Sockets 
uses, IMHO.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Saturday, 30 January 2010 09:32:20 UTC