Re: Drastically cutting primary features [was Re: Last call for public comments on Web Crypto charter] from Henry Story on 2011-12-01 (public-identity@w3.org from December 2011)

From: Henry Story <henry.story@bblfish.net>
Date: Thu, 1 Dec 2011 09:48:24 +0100
To: Mitch Zollinger <mzollinger@netflix.com>
Cc: <public-identity@w3.org>
Message-Id: <A1C0F0AC-8CD8-4553-B8A5-9FE901003870@bblfish.net>
On 25 Nov 2011, at 08:19, Mitch Zollinger wrote:

> Apologies for coming late to this discussion. Mark Watson was kind enough to point my attention to this email thread & as I'm out on vacation (Happy Thanksgiving!) I haven't jumped in as soon as I should have.
> 
> After reading through the thread, I see two issues that I would like to address:
> 
> 1. TLS. Can't TLS do everything that is needed for a secure protocol?
> 2. We shouldn't try to add in a full complement of crypto APIs because this is hard and produces a mess like the JCE, PKCS#11, or some other <insert your favorite here> complex, hard to understand set of APIs.
> 
> I'll take these in order:
> 
> 1. TLS
> 
> We've spent the last 4 years finding that a secure protocol without TLS is a Really Good Thing for our use cases. I can break the issues down into two main categories: operational issues and performance issues.
> 
> Operational:
> * When using TLS as a security model, you have to manage a trust store. For anyone that has done this for any amount of time, you know that CAs change their root certs, CAs issue subordinate CA certs, and CAs are compromised. (Ask Comodo.) Managing the trust store "securely" leads to the need to constrain the certs contained in the store and/or (usually "and" if you really want to be secure) add CRLs & OCSP into the mix. We've had TLS failures on our devices because of these operational issues which are out of our control. (Example: a CDN decides to change CA provider and doesn't tell us.)
> * As referenced above, you need to manage CRLs & OCSP to do things right. We're working with a CDN partner right now that hosts their CRLs on a server in Europe which sometimes doesn't answer HTTP requests to get access to the CRL. What do you do? If you're "secure" you fail the SSL handshake (an error case the app may never see, as it's deep in the networking code), and if you ignore the CRL retrieval failure & go ahead anyway, you've compromised yourself in the worst case and slowed things down in the best case (see the performance issues below for more on this).
> * Time. All SSL certs have validity periods (NotBefore & NotAfter values). When a CA issues a cert on 12/25/2011, the NotBefore value is 12/25/2011. When an embedded device (think about that new LED flat screen TV "under" the Christmas tree) first comes up, it doesn't know what time it is. In fact, most of these devices don't even have battery backed clocks! So, if I plug in that new TV on Christmas day and the firmware has a "birth date" of 6/1/2011, the cert looks not yet valid and the SSL handshake will fail without any sort of user visible error. (We're not in a "real" web browser that will pop up a dialog to complain about the cert.) What a disappointing user experience.
> * When running on embedded devices, just the flash space & cache you need to maintain for CA certs, CRLs and OCSP responses sometimes pushes you into a place where device manufacturers balk.
> 
> Performance:
> * Assuming you get through all the issues above (which we have) you'll find out that when you want a really high performance user experience, it's just not going to happen in many cases.
> * CRL / OCSP retrieval & response issues. As mentioned above, we have a CRL distribution point managed by a major CA provider, used by a major CDN that simply fails to respond sometimes. Let's say for the sake of argument, the thing fails for 5% of all requests during peak Netflix viewing hours. That means that if I watch movies & TV during peak hours, 1 in 20 times I use my device I will actually hit my socket timeout value (1 minute on a lot of devices). I'm going to sit twiddling my thumbs wondering why things are so slow and this will happen non-deterministically.
> * The above is a worst case (that we've seen) real world issue. But even in the "normal" case, climbing the X.509 certificate chain to validate an SSL server cert usually involves several calls out to CRL distribution points and OCSP responders. For some devices where the CAs are managed by 3rd parties, a 5-10 second SSL handshake is not unusual. In the case of Netflix, we want startup to happen in a second or less (imagine if we were BETTER than digital cable. That's a worthwhile goal, yes?) and using TLS means we can't get there.
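
To make the clock problem in the quoted "Time" bullet concrete, here is a minimal sketch in Python. The function name and the one-year certificate lifetime are assumptions made for illustration; only the two dates 12/25/2011 and 6/1/2011 come from the message above, and no real TLS library is involved.

```python
from datetime import datetime


def cert_is_valid_at(now, not_before, not_after):
    """Return True if `now` falls inside the certificate validity window."""
    return not_before <= now <= not_after


# Dates from the quoted example; the NotAfter value is an assumed one-year lifetime.
not_before = datetime(2011, 12, 25)
not_after = datetime(2012, 12, 25)

# A device without a battery-backed clock boots believing it is its firmware build date.
firmware_clock = datetime(2011, 6, 1)
# The actual wall-clock time when the TV is first plugged in.
real_time = datetime(2011, 12, 25, 9, 0)

valid_on_device = cert_is_valid_at(firmware_clock, not_before, not_after)
valid_in_reality = cert_is_valid_at(real_time, not_before, not_after)

print("valid according to device clock:", valid_on_device)
print("valid according to real time:  ", valid_in_reality)
```

The certificate is fine; the only thing wrong is the device's idea of "now", which is why the failure is so hard to surface to the user.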


I think many of these can be solved with the IETF DANE work or something similar http://tools.ietf.org/wg/dane/ - if it ever gets finished, that is, even though the idea is conceptually very simple.
  
Essentially the Web suffers from a problem even more fundamental than TLS: DNS is falling apart and there is really only one solution - move to DNSSEC, which has already happened for most DNS providers I believe. Once that has happened you can put server certificates into DNS, which is a distributed database - so you remove the centralisation problems, and the responsiveness issues. You also no longer have the issue of the worst CA poisoning everybody else, since you have at least your country (domain) to fall back on - the problem is partitioned. You can also avoid CAs completely and move to self-signed certificates in DNS.
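
As a rough sketch of the idea (not of the DANE record format, which is still being worked out): assume the zone publishes a SHA-256 digest of the server certificate, and that DNSSEC validation of that record has already been done elsewhere. The client then only has to compare digests, with no CA, CRL or OCSP lookup. The function name and the placeholder certificate bytes below are hypothetical.

```python
import hashlib
import hmac


def cert_matches_dns_digest(cert_der: bytes, published_hex: str) -> bool:
    """Compare the SHA-256 digest of a certificate with a digest obtained from DNS.

    Assumes `published_hex` came from a DNSSEC-validated lookup done elsewhere.
    """
    digest = hashlib.sha256(cert_der).hexdigest()
    # Constant-time comparison of the two hex strings.
    return hmac.compare_digest(digest, published_hex.lower())


# Placeholder bytes standing in for a DER-encoded (possibly self-signed) certificate.
server_cert = b"placeholder bytes standing in for a DER-encoded certificate"
# What the domain owner would publish in their zone (hypothetical).
published = hashlib.sha256(server_cert).hexdigest()

presented_ok = cert_matches_dns_digest(server_cert, published)
presented_bad = cert_matches_dns_digest(b"some other certificate", published)

print("genuine cert accepted:", presented_ok)
print("other cert accepted:  ", presented_bad)
```
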

> 
> 2. Crypto APIs
> 
> We're also flummoxed by "standard" APIs like PKCS#11, the JCE, OpenSSL and others. Crypto can be hard if you try to create abstractions for every single type of cryptographic primitive and every single type of cryptographic operation. (Ever tried to create common APIs for RSA, ECC & DSA? Oh wait, DSA can't do encryption, or something, right? Ever tried to create a common MAC API that included something exotic like UMAC?) We don't have to introduce this level of complexity, and in fact we've created our own Netflix "cryptocommon" Java lib which strikes a very good balance between sufficient flexibility and intuitive usability. We've used that thing all over the place to wrap the sometimes bizarre JCE APIs.
> 
> We'd like to bring those learnings to bear on the current discussion, because allowing a sane collection of MACs (HMAC, ...), public key operations (RSA, ECC maybe DSA), symmetric key encryption (AES, 3DES, ....), hashing (MD5, SHA1, SHA-256, ...), and even key exchange (Diffie-Hellman & AES key unwrapping) is really not that difficult, IF you've been forced to think hard about the problem before.
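
For illustration only, here is what a deliberately small facade over hashing and MACs might look like in Python. This is not Netflix's "cryptocommon" library, which is not shown in this thread; the function names and the choice of supported algorithms are hypothetical, and only the standard hashlib and hmac modules are used.

```python
import hashlib
import hmac

# A short, fixed list of supported hashes instead of a generic provider framework.
_HASHES = {
    "SHA-256": hashlib.sha256,
    "SHA1": hashlib.sha1,
}


def digest(algorithm: str, data: bytes) -> bytes:
    """Hash `data` with one of the supported algorithms."""
    return _HASHES[algorithm](data).digest()


def mac(algorithm: str, key: bytes, data: bytes) -> bytes:
    """Compute an HMAC over `data` using the named hash."""
    return hmac.new(key, data, _HASHES[algorithm]).digest()


def verify_mac(algorithm: str, key: bytes, data: bytes, tag: bytes) -> bool:
    """Check an HMAC tag in constant time."""
    return hmac.compare_digest(mac(algorithm, key, data), tag)


key = b"pre-provisioned key (example only)"
message = b"play request"
tag = mac("SHA-256", key, message)

print("digest length:", len(digest("SHA-256", message)))
print("tag verifies: ", verify_mac("SHA-256", key, message, tag))
print("tampered:     ", verify_mac("SHA-256", key, b"tampered", tag))
```

The point of the sketch is the shape of the API: a handful of verbs and a short list of algorithm names, rather than an abstraction for every primitive.
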
> 
> Apologies for the very long response. If you've made it this far, I really appreciate your taking the time to read through this.
> 
> Regards,
> Mitch Zollinger
> 
> On 11/24/2011 5:40 AM, Stephen Farrell wrote:
>> 
>> Saying why would be interesting. Many people have said they can't
>> do TLS when it's turned out that they could in fact do TLS so what
>> is it that you need that you can't get via TLS with key insertion
>> (for e.g. TLS-PSK renegotiation) and key extraction and some
>> simple functions to use extracted keys?
>> 
>> I realise a generic crypto API can be used for all sorts of fun,
>> but the claim here seems to be that such an API is necessary.
>> My claim is that such an API is basically JCE/JCA which is not
>> a simple API.
>> 
>> S.
>> 
>> On 11/24/2011 01:32 PM, David Dahl wrote:
>>> +1
>>> 
>>> ----- Original Message -----
>>>> From: "Mark Watson"<watsonm@netflix.com>
>>>> To: "Harry Halpin"<hhalpin@w3.org>
>>>> Cc: "Stephen Farrell"<stephen.farrell@cs.tcd.ie>, "<public-identity@w3.org>"<public-identity@w3.org>
>>>> Sent: Thursday, November 24, 2011 10:48:03 AM
>>>> Subject: Re: Drastically cutting primary features [was Re: Last call for public comments on Web Crypto charter]
>>>> Harry,
>>>> 
>>>> The possibility to develop secure application protocols in Javascript,
>>>> without using TLS, is exactly one of the points of this API, at
>>>> least for us. The possibility to use pre-provisioned keys is also an
>>>> essential component. So I wouldn't be in favor of this change and I'm
>>>> not even sure it's a "simplification".
>>>> 
>>>> ...Mark
>>> 
>>> 
>> 
>> 
> 
> 

Social Web Architect
http://bblfish.net/
Received on Thursday, 1 December 2011 08:48:56 UTC