Re: [foaf-protocols] Discussion from W3C #idbrowser

From: Henry Story <henry.story@bblfish.net> · Date: Fri, 27 May 2011 02:41:12 +0200

Thanks, Brad, for your long and detailed reply, and especially for the new angle you bring to the subject.
I answered as I went along, but it looks like we agree in the end anyway.

On 26 May 2011, at 20:05, Hill, Brad wrote:

> I was recently introduced to the WebID proposal by Henry's video presentation for the W3C Conference on ID in the Browser.  After a brief chat in that venue, I thought I should take my concerns here for further discussion.
> 
> While I conceptually like the idea, I fear it will face insurmountable obstacles to deployment in its current description.  
> 
> Primarily, I believe that interrupting the TLS handshake as part of the protocol is a complete non-starter for a variety of reasons.

We seem not to have had much of a problem on a very wide variety of platforms, as the (certainly incomplete) implementations list at

          http://www.w3.org/wiki/Foaf%2Bssl    

will show. So it has gotten past the starting block. But I presume you have large deployments in mind.... :-)

> 
> The quick summary of my concerns is that:
> 
> 1. The existing install base of TLS terminators cannot support the protocol
> 2. TLS terminators must communicate WebID context to apps
> 3. Performance and scalability is terrible relative to server-auth-only TLS
> 4. A major denial-of-service attack surface is introduced by the protocol
> 
> 
> If you are ambitious to read more:
> 
> 1.  Deployment obstacles with the existing infrastructure of TLS accelerator and terminators and appliances.
> 
> It is a reality of most large sites deploying HTTPS that TLS is not terminated at the web/application server where the application is hosted but instead in an accelerator card or network appliance.  Changing the internals of the TLS handshake will be an extremely uphill battle given the enormous install base and reluctance of operations staff to modify these systems.

I think one would have to look in detail at what these cards do, and how they can be tuned. Presumably they all have
ways to be updated with new CA certificates. Perhaps many of them can also be tuned to accept certificates that are not CA-verified, in order to pass them on to the next layer.

For example I wonder if Google is using these types of cards, given articles such as

  http://www.imperialviolet.org/2010/06/25/overclocking-ssl.html

> It is also not clear that many such appliances would be able to handle the proposed and envisioned protocol modifications, at least not without significant degradation of performance and scalability.

The good thing here is that if they cannot apply it, it's not a problem. They can use existing methods, such as
username/password, OpenID or whatever else they are currently using. WebID does not exclude the other authentication
systems from functioning.

> 
> 2. Lack of standardized mechanism to pass WebID context from termination of TLS to where it will be applied.

> Assuming that a termination device does participate in the protocol, it is safe to assume that it does not understand application,  identity or authorization semantics.   It must either call into application logic for such information (while still in the middle of the TLS handshake) if it is to return error conditions at the TLS layer,

In Clerezza (Java + Scala) this is done in stages:

  1. The lower level of TLS verifies that the client has the private key of the public key sent in the certificate.
  2. Then a wrapper checks if the certificate contains a WebID.
     a) If it does, it passes it up to the higher layer;
     b) otherwise it attempts CA verification.
     See: http://bit.ly/mvMvAH
  3. At the servlet layer, a servlet authentication filter checks through the different authenticators, and the
     different credentials are added to a Subject:
       http://bit.ly/lIVfWg
     Each SAN is checked and verified (but it could also just stop at the first). Verified SANs go in the Principals set.
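
To make the staging above concrete, here is a rough sketch in Python rather than Clerezza's Java/Scala. It is not the Clerezza code: ClientCert, Subject, ca_verify and verify_san are all hypothetical names made up for illustration, and the certificate is a plain data object rather than a parsed X.509 structure.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass
class ClientCert:
    """Hypothetical stand-in for a parsed client certificate."""
    subject_alt_names: List[str]  # URI entries from the SAN extension
    key_proven: bool              # stage 1: TLS confirmed possession of the private key


@dataclass
class Subject:
    """Collects the identities that were actually verified."""
    principals: Set[str] = field(default_factory=set)


def webid_claims(cert: ClientCert) -> List[str]:
    """SAN URIs that could be WebIDs (assumed here to be http/https URIs)."""
    return [u for u in cert.subject_alt_names
            if u.startswith(("http://", "https://"))]


def accept_at_tls_layer(cert: ClientCert,
                        ca_verify: Callable[[ClientCert], bool]) -> bool:
    """Stage 2: pass WebID certificates up, otherwise fall back to CA checks."""
    if not cert.key_proven:
        return False
    if webid_claims(cert):
        return True           # let the higher layer decide
    return ca_verify(cert)    # classic CA verification


def authenticate(cert: ClientCert,
                 verify_san: Callable[[str, ClientCert], bool]) -> Subject:
    """Stage 3: check every claimed SAN; only verified ones become principals."""
    subject = Subject()
    for uri in webid_claims(cert):
        if verify_san(uri, cert):
            subject.principals.add(uri)
    return subject
```

The point of the split is that accept_at_tls_layer never touches the network; anything slow or attacker-controlled happens inside verify_san, at the application layer.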

In fact this is very useful for writing debugging tools, and it also helps users who have broken certificates.

This is a big improvement. The current out-of-the-box SSL implementations have TERRIBLE user interface properties. If client certificate authentication is in NEED mode, then on failure the connection is just broken and nothing is explained. Much better in my opinion to pass the certificate on, and then redirect to a beautiful and carefully crafted explanation of what is wrong.

Ok this one is not beautifully crafted - did not have the time - but it gives an example of how one can use this
to debug a client certificate.

   https://bblfish.net:8443/test/WebId

We have a few of these (and should make a list of them in fact)

>  or must have a way to pass arbitrary (and potentially large) context information down to the application after completion of the handshake.  This needs to be invented and is another change required of both limited devices as well as downstream consumers.

This seems to happen in all libraries we have seen. You can do this with Apache or with Java...

There may be ways of making things more efficient than what we do currently.

> 
> 3. Performance
> 
> Yes, this does matter, and yes this is a big deal.  Witness the proliferation of accelerator appliances to reduce resource costs, and the interest in False Start, to reduce latency.   While TLS is not a huge cost (and I have for years encouraged many major sites to go to full HTTPS) the additional cost of client certificate auth relative to the baseline cost of TLS is significant, and this is even more true when the WebID modifications are added.

Ok but consider the cost of OpenID

   http://blogs.oracle.com/bblfish/entry/the_openid_sequence_diagram

And though OpenID has problems, one cannot say that they have been insurmountable obstacles to
its deployment.

> 
> Consider the differences.
> 
> No client cert:
> 
> 3 round trips (or 2 with false start)
> 1 asymmetric crypto operation
> 
> WebID client cert:
> 
> Initial connection:
> 3 round trips (client can't use false start because it will have to renegotiate anyway, see below)
> 1 asymmetric crypto operation (as part of key exchange)
> 
> Renegotiation is then required.  The client certificate cannot be sent as part of the first handshake for privacy reasons - it is in the clear.  The client and server must establish an encrypted channel and then renegotiate to send the client certificate if the user's expectation that the identity they are using at the site will be private to eavesdroppers is to be preserved, so now a new handshake begins including:
> 
> 3 round trips (or 2 with false start)
> 3 asymmetric crypto operations (key exchange, plus client cert verify, plus CertificateVerify message)
> Possible dereference of AIA information (multiple certs + CRL or OCSP) on client cert (1 or 2 round trips)
> Additional 2 asymmetric crypto ops to verify cert and signature on CRL or OCSP
> 
> Now the server has to make an entire new TLS connection to pull the WebID metadata.  (we will discount the cost of fetching and processing this for now), so 3 round trips (or 2 with false start), 2 additional asymmetric crypto operations, plus the cost of fetching, as a client, AIA information provided by the server.  ( 1 or 2 round trips though this may likely be cached, plus 2 more asymmetric crypto ops) 
> 
> So normal case, we have 2-3 round trips and 1 asymmetric crypto operation for a typical TLS handshake vs. 11-13 round trips, 10 asymmetric crypto operations and 2 -4 resource downloads for a WebID-style client cert authentication.  This is a 60-80% reduction in capacity, and a 4-5x increase in latency - hard to ignore no matter how cheap TLS is.

Thanks, I'll use that as a reference.
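
For reference, the tally can be reproduced from the items as listed above. This is only a sketch of the arithmetic: the step labels are paraphrases, and the false_start flag is an assumption about which handshakes could save a round trip. Without False Start the items sum to 11 to 13 round trips and 10 asymmetric operations, which matches the quoted summary; with False Start on the two later handshakes the same items would sum to 9 to 11.

```python
# (label, min round trips, max round trips, False Start can save one, asymmetric ops)
STEPS = [
    ("initial handshake (no False Start possible)", 3, 3, False, 1),
    ("renegotiation carrying the client cert",      3, 3, True,  3),
    ("AIA/CRL/OCSP lookup for the client cert",     1, 2, False, 2),
    ("server's own TLS connection to the profile",  3, 3, True,  2),
    ("AIA lookup for the profile server's cert",    1, 2, False, 2),
]


def tally(steps, false_start):
    """Return (min round trips, max round trips, asymmetric crypto ops)."""
    lo = hi = ops = 0
    for _label, rt_min, rt_max, can_save, step_ops in steps:
        saving = 1 if (false_start and can_save) else 0
        lo += rt_min - saving
        hi += rt_max - saving
        ops += step_ops
    return lo, hi, ops


print(tally(STEPS, false_start=False))
print(tally(STEPS, false_start=True))
```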

> 
> 4. Attack surface
> 
> This so far just assumes the "good" case.  Perhaps the most troubling aspect of the system currently is that it allows an attacker to force a server to dereference arbitrary content in the middle of a TLS handshake.  I can send a client certificate with AIA information pointing to a large set of large certificates that are expensive to verify (huge key sizes and large certificate sizes).

I think it would be very reasonable for servers to stop downloading after a certain limit, or to do so much more slowly. 
The linked data space needs to be very aware of these types of attacks.
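
A minimal sketch of such a limit, assuming the profile (or CRL, or anything else the client points us at) arrives as a file-like stream. The function name, the default budgets and the exception type are all made up for illustration. Note that the time check only runs between reads, so a real fetcher would also need a socket-level timeout to avoid blocking inside a single read.

```python
import io
import time


class FetchLimitExceeded(Exception):
    """Raised when a remote resource exceeds our size or time budget."""


def read_bounded(stream, max_bytes=64 * 1024, max_seconds=5.0,
                 chunk_size=4096, clock=time.monotonic):
    """Read a stream to the end, giving up once either budget is exhausted."""
    deadline = clock() + max_seconds
    chunks = []
    total = 0
    while True:
        if clock() > deadline:
            raise FetchLimitExceeded("time budget exhausted")
        chunk = stream.read(chunk_size)
        if not chunk:                      # end of stream
            return b"".join(chunks)
        total += len(chunk)
        if total > max_bytes:
            raise FetchLimitExceeded("size budget exhausted")
        chunks.append(chunk)


# Usage with an in-memory stream standing in for a network response:
small = read_bounded(io.BytesIO(b"<rdf/>"), max_bytes=1024)
```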

>  I can then provide huge CRLs.  I can force the download of this information to be very slow, tying up server resources.  I can place these resources themselves at HTTPS URLs, forcing additional work.   I can make my WebID information itself huge, slow to download and expensive to verify.
> 
> Caching is no help here, as the attacker gets to choose the resources to be retrieved and can pick ones which will not be cached.  It is also important to consider, in general, that terminator appliances are not generally designed with caching in mind, as it is not a typical use case for server-auth-only TLS today.

yes.

> 
> Solutions?
> 
> As a first solution, I would suggest moving the retrieval and validation of WebID data out of band from completion of the TLS handshake.  Always complete the handshake, pass the certificate data to the application, and let it complete the rest of the WebID protocol.  This is much simpler for terminator appliances and they already have established methods for doing this. 

Ah ok. Looks like this is how this is done in Clerezza. Right?
First verify just the public key, then pass it on to the app without the CA verification step. The middle tier layer can deal with further verification. In fact a really clever app could even give the user partial access and update the page with ajaxy stuff when the further confirmations come in, if they do...
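
A rough sketch of that idea, assuming the application-level check is a comparison between the public key in the certificate and the keys published in the WebID profile. Everything here is hypothetical: RsaKey, Session, the "partial"/"full"/"rejected" levels, and fetch_profile_keys, which stands in for whatever actually dereferences and parses the profile.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class RsaKey:
    modulus: int
    exponent: int


def claim_holds(cert_key, profile_keys):
    """The WebID claim holds if the profile lists the certificate's key."""
    return cert_key in profile_keys


class Session:
    """Starts with partial access; upgraded once the WebID is confirmed."""

    def __init__(self, webid, cert_key):
        self.webid = webid
        self.cert_key = cert_key
        self.level = "partial"   # handshake done, identity not yet confirmed

    def confirm(self, fetch_profile_keys):
        keys = fetch_profile_keys(self.webid)   # slow, attacker-influenced step
        self.level = "full" if claim_holds(self.cert_key, keys) else "rejected"
        return self.level


def verify_in_background(session, fetch_profile_keys, executor):
    """Run the confirmation off the request path; returns a Future."""
    return executor.submit(session.confirm, fetch_profile_keys)
```

The TLS handshake has already completed by the time a Session exists, so a slow or hostile profile server can only delay the upgrade from "partial", not tie up the handshake itself.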

TODO: Perhaps we need to look at the spec (http://webid.info/spec/) a lot more carefully, to see whether what we say there actually conforms to what we are doing.

> 
> I would also suggest that wherever TLS handshakes are being performed, that all verification for client certs be turned off completely (AIA, CRL, OCSP, etc.) as this is all unnecessary in the WebID model and further reduces the attack surface.

[[ for people less technical:

"Authority Information Access (AIA)"  A certificate extension that contains information useful for verifying the trust status of a certificate. This information potentially includes Uniform Resource Locators (URLs) where the issuing CA’s certificate can be retrieved, as well as a location of an Online Certificate Status Protocol (OCSP) responder configured to provide status for the certificate in question. The AIA extension can potentially contain HTTP, LDAP, or file URLs.

"CRL Distribution Point (CDP)"  A certificate extension that indicates where the certificate revocation list for a CA can be retrieved. It can contain none, one, or many HTTP, file, or LDAP URLs.

"Online Certificate Status Protocol (OCSP)"  A protocol that allows real-time validation of a certificate’s status by having the CryptoAPI make a call to an OCSP responder and the OCSP responder providing an immediate validation of the revocation status for the presented certificate. Typically, an OCSP responder responds to the revocation status check request based on the CRLs or other forms of revocation status it retrieves from the CAs.
]]

Absolutely!

> 
> This reduces the changes necessary and risk to terminator appliances, and places the responsibility for caching and mitigating attack surface in the application, where it can be most appropriately managed.
> 
> I think more work could be done on attack surface reduction, but this is already far too long for a first email.

:-)

> 
> -Brad Hill

Social Web Architect
http://bblfish.net/