- From: Doug Beaver <doug@fb.com>
- Date: Tue, 17 Jul 2012 22:11:41 +0000
- To: "'Willy Tarreau'" <w@1wt.eu>
- CC: "'ietf-http-wg@w3.org'" <ietf-http-wg@w3.org>
Hi Willy,

Thank you for your reply. I should note that Facebook's EOI was written by myself and Brian Pane; Brian is the lead engineer on our SPDY efforts and has been doing a lot of work to prepare our HTTP stack for multiplexed operation. Brian has thought more about some of your protocol questions than I have, so I think he'll step in and offer his thoughts.

I have some thoughts on the encryption issue which I'll share below. These thoughts aren't the official position of Facebook; they are my own opinion. Hopefully my reply will answer most of the questions that have been raised on this thread, though.

> I don't want to start the encryption debate in this thread, but since you
> have a fairly balanced approach, I'd like to note that at the moment, almost
> 100% of stolen user information is taken on encryption-protected
> services, whether it is bank account credentials or webmail credentials or
> information. The issue always comes from malware running on the PC,
> infecting the browser and stealing the information at the human interface.
> However, users feel safer because they see the SSL lock. And it's not always
> the browser, as there was a report of stolen webmail information in TLS
> traffic in a certain country when a CA was broken and new certs for a number
> of large sites were issued.

Here are my thoughts on mandating transport encryption:

* The SSL/TLS/CA ecosystem is flawed, but it's the most widely deployed system we have for securing web traffic. We shouldn't let the flaws in the current system stop us from advocating for more user privacy; in fact, greater pressure on the CA system from a crypto mandate would probably lead to reforms there.

* SSL/TLS doesn't do a lot to address targeted attacks against one user (e.g. malware, spear-phishing, etc.) but it helps guard against surveillance and censorship of large user populations. While there are weaknesses in the CA system that make it possible for governments and organizations to issue rogue certs in targeted cases, it is very difficult to deploy rogue certs globally for all web traffic for all users. Put simply, the more widely SSL/TLS is used, the greater the chance that users will have privacy in their online communications.

* SSL/TLS stops "helpful" transparent proxies from intercepting your unencrypted traffic and doing things with it that you didn't ask them to. Encryption keeps these proxies honest: all they can do is choose whether or not to forward your connection.

* Symmetric crypto costs are not much higher than serving plaintext; I think Akamai quoted 10-20% in their response. I think the costs aren't a big deal for major sites; if you are large enough to care about performance, you are large enough to support session resumption, which cuts out the CPU cost of most handshakes. Rather, it's a much more interesting question for the very small operators and very small embedded devices. For example, if I have a thermostat in my fridge that wants to report temperature and power usage information somewhere central, it might be onerous to require it to speak crypto in order to talk to a web server today. I just think that it won't be onerous tomorrow. (A rough benchmark sketch follows this list.)

* Monetary cost of the certs is not an issue. You can get free (or cheap) DV certs, so hobbyist sites and non-profits would not be locked out of the web due to lack of access to cheap certs, and even if they were, I expect the market would produce a CA whose costs would meet that user demand.
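To make the symmetric-versus-handshake cost argument concrete, here is a rough, hypothetical micro-benchmark sketch in Go (not part of the original discussion; absolute numbers vary by hardware). It compares bulk AES-GCM record encryption, the steady-state cost of an established connection, against RSA-2048 private-key operations, which dominate a server's cost for a full, non-resumed handshake:

    package main

    import (
        "crypto"
        "crypto/aes"
        "crypto/cipher"
        "crypto/rand"
        "crypto/rsa"
        "crypto/sha256"
        "fmt"
        "time"
    )

    func main() {
        // Bulk encryption: AES-128-GCM over 16 KB records. A fixed
        // nonce is fine for a benchmark, never for real traffic.
        key := make([]byte, 16)
        rand.Read(key)
        block, _ := aes.NewCipher(key)
        gcm, _ := cipher.NewGCM(block)
        nonce := make([]byte, gcm.NonceSize())
        plain := make([]byte, 16*1024)
        dst := make([]byte, 0, len(plain)+gcm.Overhead())

        const records = 50000
        start := time.Now()
        for i := 0; i < records; i++ {
            gcm.Seal(dst[:0], nonce, plain, nil)
        }
        sec := time.Since(start).Seconds()
        fmt.Printf("AES-GCM: %.0f MB/s\n",
            float64(records*len(plain))/sec/1e6)

        // Handshake: one RSA-2048 private-key operation, roughly the
        // server's per-connection cost when nothing is resumed.
        priv, _ := rsa.GenerateKey(rand.Reader, 2048)
        digest := sha256.Sum256([]byte("handshake"))
        const handshakes = 200
        start = time.Now()
        for i := 0; i < handshakes; i++ {
            rsa.SignPKCS1v15(rand.Reader, priv, crypto.SHA256, digest[:])
        }
        fmt.Printf("RSA-2048: %.0f handshakes/s\n",
            float64(handshakes)/time.Since(start).Seconds())
    }

The gap is typically several orders of magnitude in the handshake's disfavor, which is why resumption and persistent connections matter so much more than the cipher itself.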
For EV cert prices, I expect that the market will continue to optimize those; there has been a steady decline in EV cert prices over the years. Either way, if you're large enough to require an EV cert, you have other infrastructure costs to bear as well (power, rent, hardware, network connectivity, domain registration, etc).

The Internet has always been about the (mostly) free expression of ideas, both in terms of monetary cost and personal freedom in what you can say. I actually think it's pretty amazing that we have the Internet at all; if you look at all the civilizations that have existed in human history, not many of them would have built something that offered so much freedom for people to communicate with each other and publish their ideas and beliefs. However, we now live in a time where a sizable fraction of humanity uses the Web to communicate daily, and that makes those people a very lucrative target.

I also think it is useful to understand why people object to the idea of mandated transport encryption. Critiques include the extra resource usage, the high cost of certs, the extra round trips, and the broken CA system (to name a few), while others react against the idea of a mandate itself and thus of protocols pushing political agendas. I think all those positions can be valid. I just happen to think that even given all of the above, it is still better to mandate encryption and give better privacy to Internet users than it is to punt the ball down the field another twenty years.

> Also, you said that it could make things harder for you, but did you
> evaluate only the front access or also the protocol used between your load
> balancers and backend servers? I'm asking because there is a difference
> between mandating the use of encryption in browsers and designing the
> protocol based on this. For instance, WebSocket has the masking flag
> mandatory on upstream traffic but the protocol supports not having it
> between servers.

Regarding load balancers and resource usage, we looked at three cases:

* Traffic between LB and user
* Traffic between LB and LB
* Traffic between LB and web server

For the LB<->user case, you can remove a lot of handshakes with session resumption. We are seeing an 80% hit rate for our session caching deployment, which supports server-side session caching and client-side session tickets. Plus, I would expect that multiplexed protocols hold onto their client sockets longer, since there is a higher probability of reusing a given socket later (you've gone from N connections to a given domain down to one). So you end up mostly just paying the symmetric cipher cost. (A small sketch of a resumption-friendly configuration follows below.)
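As an illustration of the two resumption mechanisms mentioned above, here is a minimal, hypothetical sketch using Go's crypto/tls (just a convenient stand-in for whatever TLS stack a deployment actually runs; the cert.pem/key.pem paths are placeholders). The client keeps an LRU session cache so it can offer a ticket on reconnect; the server leaves session tickets enabled so resumed handshakes skip the expensive key exchange:

    package main

    import (
        "crypto/tls"
        "log"
        "net/http"
    )

    func main() {
        // Client side: an LRU session cache lets the client present
        // a session ticket on reconnect and perform an abbreviated
        // handshake instead of a full one.
        client := &http.Client{
            Transport: &http.Transport{
                TLSClientConfig: &tls.Config{
                    ClientSessionCache: tls.NewLRUClientSessionCache(1024),
                },
            },
        }
        _ = client // would be used to issue requests

        // Server side: session tickets are enabled by default; shown
        // explicitly here. A farm of load balancers that shares the
        // ticket key can resume each other's sessions.
        srv := &http.Server{
            Addr: ":8443",
            TLSConfig: &tls.Config{
                SessionTicketsDisabled: false,
            },
        }
        // cert.pem and key.pem are placeholder file names.
        log.Fatal(srv.ListenAndServeTLS("cert.pem", "key.pem"))
    }

The design point is that resumption turns the expensive asymmetric step into a one-time cost per client, after which reconnects cost little more than the symmetric cipher.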
The LB<->LB case happens when you terminate a user on an edge node close to them (say, London) and tunnel their request to a remote datacenter. This speeds up the TCP and SSL handshakes, since those happen over a low-RTT link. If you advertise HTTPS capability to your users, it's obvious that you want to speak HTTPS between the LB in London and the end user in Liverpool. It is perhaps less obvious that you also need to encrypt the link between your edge LB in London and your datacenter in the US, since that traffic will travel over circuits leased from major carriers (unless you lay your own transoceanic fiber, and even then, there are techniques for tapping undersea cables).

For the LB<->LB case, you tend to use persistent connections, so the handshake cost is low. This is especially true with multiplexed protocols, since you can fit so much traffic on those sockets; much more traffic than you're ever going to fit on a single LB<->user connection. So for these connections, you're mainly paying the cost of the block cipher rather than the more costly handshake.

Once you get inside the datacenter, encryption is less important, since you don't have to worry as much about third parties intercepting that traffic. Still, one would imagine that the datacenter load balancer would probably keep persistent connections to the web servers it was balancing load across, and would enjoy similar handshake amortization as the other two cases.

So I actually think that the sites with the largest request loads have the hardest time arguing against TLS on the resource usage front, especially when you look at the low cost of running TLS on commodity hardware and the expected gains in hardware power over the coming years. I'm more concerned about very small devices that want to use HTTP to report usage statistics (thermostats, pressure monitors, industrial sensors, etc); they might not be able to afford the hardware power to perform TLS. I think any personal communication device (feature phone, smart phone, tablet, laptop, desktop, etc) will always have enough CPU to handle crypto, or will have onboard ASICs to which it can offload that crypto.

> Basically, since all sensible sites already make use of TLS, I don't think
> we can make them safer by mandating use of TLS for them. However mandating
> use of TLS will make it harder to work on the backend, it will very often be
> a counter-productive effort which increases costs a lot (cert managing,
> troubleshooting, etc) with no added benefit.

Requiring TLS definitely makes backend work more difficult, but I think the tools will come. A side effect would probably be that web servers and load balancers would get better instrumentation. I think Varnish has a nice implementation with its shared-memory ring buffer that's used to log events; you can attach tools to that region and read the events in real time as the proxy operates. The model is quite good. (A toy sketch of that ring-and-cursor model appears at the end of this message.)

Regarding added benefit, it's true that many major sites are forcing HTTPS already. However, many are not, and there's also a very long tail of unencrypted sites out there. I argue that the added benefit is quite large.

> I think that what you're describing here precisely is what WebSocket offers,
> but I may be wrong, depending on your precise use-cases. It implicitly
> offers server push in the sense you're describing it (push of any data, not
> HTTP objects), and automatically offers the no-buffering flag because when
> HTTP gateways switch to WebSocket, they know this is interactive traffic and
> stop buffering. I think your description confirms the need to unify the
> transport layer to support both HTTP and WS at the same time in the same
> connection.

Brian Pane has thoughts on WebSocket and SPDY, and I think he can better comment here.
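Here is the toy sketch promised above. Varnish's real implementation is a shared-memory segment that external tools such as varnishlog attach to; this hypothetical in-process Go version (names like ringLog are invented for the sketch) only illustrates the underlying model: the writer appends into a fixed ring and never waits for readers, while each reader tails with its own cursor and detects when it has been lapped:

    package main

    import (
        "fmt"
        "sync"
        "time"
    )

    // A bounded event log in the spirit of Varnish's shared-memory
    // log: the writer overwrites the oldest slot when the ring is
    // full, so logging never blocks on slow or absent readers.
    const slots = 64

    type ringLog struct {
        mu     sync.Mutex
        events [slots]string
        head   uint64 // total events written so far
    }

    func (r *ringLog) write(ev string) {
        r.mu.Lock()
        r.events[r.head%slots] = ev
        r.head++
        r.mu.Unlock()
    }

    // tail returns all events at or after cursor. If the writer has
    // lapped the reader, the cursor jumps ahead to the oldest
    // retained event; the overwritten ones are simply lost.
    func (r *ringLog) tail(cursor uint64) (uint64, []string) {
        r.mu.Lock()
        defer r.mu.Unlock()
        if r.head > cursor+slots {
            cursor = r.head - slots
        }
        var out []string
        for ; cursor < r.head; cursor++ {
            out = append(out, r.events[cursor%slots])
        }
        return cursor, out
    }

    func main() {
        var r ringLog
        go func() { // the "proxy", logging one event per request
            for i := 0; ; i++ {
                r.write(fmt.Sprintf("req %d", i))
                time.Sleep(5 * time.Millisecond)
            }
        }()
        var cursor uint64 // an attached observer, varnishlog-style
        for t := 0; t < 5; t++ {
            time.Sleep(50 * time.Millisecond)
            var evs []string
            cursor, evs = r.tail(cursor)
            fmt.Printf("read %d events\n", len(evs))
        }
    }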
Regards,

Doug