Summary of e2e encryption discussions

Hi all,

Following up on the last F2F meeting, please find below some notes on WebRTC end-to-end encryption.
This should complete my related action item.

During the discussions, three security models were identified:
1. Trusted web application: a trusted website uses intermediate nodes from a potentially untrusted WebRTC provider
2. Partially trusted web application: a trusted website uses an SDK from a potentially untrusted WebRTC provider
3. Untrusted web application

Each use case is described in more detail below and some personal thoughts are at the end of the message.
Alex, Sergio, let me know if you have ideas on how to best format that information and/or how to move the topic forward.
I would tend to focus on use case 1, and optionally investigate how useful and deployable opaque/isolated streams are.

Thanks,
 Y

In the context of a WebRTC exchange, two types of nodes can be identified:
Final nodes are producing and consuming content. These are typically web applications.
Intermediate nodes allow routing content to the final nodes. These are for instance SFUs.

1. Trusted web application
Web applications are trusted and intermediate nodes are not trusted.
This is a scenario that happens in the banking industry: banks control the applications but delegate the network to a potentially untrusted WebRTC provider.

Requirements:
- The content needs to be protected from being accessed by intermediate nodes.
- The content does NOT need to be protected from being accessed by web applications.

Potential solution:
- Apply encryption on the content that intermediate nodes cannot decrypt. This encryption would be applied in addition to, and before, the DTLS encryption.
- The WebRTC browser layer could implement this additional encryption with encryption keys provided by the web application through a specific web API.
- The web application could implement this additional encryption itself if sufficiently low-level JS APIs to the media pipeline were provided.
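
As an illustration of the "additional encryption" idea above, here is a minimal sketch of sealing a frame payload with a key shared only between final nodes. It is not a proposal for the actual API: sealFrame, openFrame and canIntermediateNodeRead are hypothetical helper names, it uses the Node.js crypto module rather than any browser media API, and it uses a random IV per frame for simplicity where a real design would more likely derive the nonce from a frame counter.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Hypothetical helpers for illustration; none of these names are WebRTC APIs.
const IV_LENGTH = 12;  // commonly used nonce size for AES-GCM
const TAG_LENGTH = 16; // AES-GCM authentication tag size

// Encrypt one frame payload with a key known only to the final nodes.
// Output layout: IV || ciphertext || authentication tag.
function sealFrame(key: Buffer, payload: Buffer): Buffer {
  const iv = randomBytes(IV_LENGTH); // simplification: random IV per frame
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(payload), cipher.final()]);
  return Buffer.concat([iv, ciphertext, cipher.getAuthTag()]);
}

// Decrypt a sealed frame. Throws if the key is wrong or the data was modified.
function openFrame(key: Buffer, sealed: Buffer): Buffer {
  const iv = sealed.subarray(0, IV_LENGTH);
  const tag = sealed.subarray(sealed.length - TAG_LENGTH);
  const ciphertext = sealed.subarray(IV_LENGTH, sealed.length - TAG_LENGTH);
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]);
}

// Models an intermediate node (e.g. an SFU) that sees the sealed bytes after
// the hop-by-hop transport encryption is removed, but does not hold the key.
function canIntermediateNodeRead(sealed: Buffer, guessedKey: Buffer): boolean {
  try {
    openFrame(guessedKey, sealed);
    return true;
  } catch {
    return false;
  }
}
```

In this sketch the sealed bytes are what would be handed to the transport, so the DTLS-based encryption wraps content that is already unreadable to anyone without the key. How the key reaches the other final node is deliberately left out.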

2. Partially trusted web application
Web applications run some code that is trusted and run some code that is untrusted.
This can for instance be the case when the top level page « mytrustedwebsite.com » embeds an iframe from « mywebrtcprovider.com ».
The iframe implements all the WebRTC black magic, including potentially media content rendering.
Intermediate nodes are not trusted.

Requirements:
- The content needs to be protected from being accessed by intermediate nodes.
- The content does NOT need to be protected from being accessed by the top-level page.
- The content needs to be protected from being accessed by the iframe.

Potential solution:
- Apply encryption on the content that intermediate nodes cannot decrypt. Compared to the trusted web application case, the iframe cannot have direct access to the encryption keys. The top-level page might have access to them and register them with the browser. Key identifiers might be shared with the iframe.
- Make the outgoing content opaque in the untrusted iframe. One possibility is for the iframe to call getUserMedia directly and to receive an opaque stream. This could be achieved with some mark-up like allow="opaque-camera" on the iframe. A second possibility is for the top-level page to call getUserMedia, receive a non-opaque (non-isolated) stream and transfer the stream to the untrusted iframe, the stream becoming opaque at that point.
- Make incoming streams that are protected by double encryption opaque, either by default or depending on how the encryption keys were retrieved. The top-level page could also be allowed to remove this opaque protection when needed.
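
The split of responsibilities described above can be sketched as a toy model. This is not a browser API: BrowserKeyStore, ModelStream and all their methods are hypothetical names, and origins are compared as plain strings purely for illustration.

```typescript
// Toy model only; every class and method name here is hypothetical.
type Origin = string;

// Keys are registered by the trusted top-level page. Other contexts, such as
// the provider iframe, can only refer to them by identifier.
class BrowserKeyStore {
  private readonly trustedOrigin: Origin;
  private readonly keys = new Map<string, Uint8Array>();

  constructor(trustedOrigin: Origin) {
    this.trustedOrigin = trustedOrigin;
  }

  registerKey(caller: Origin, keyId: string, key: Uint8Array): void {
    if (caller !== this.trustedOrigin) {
      throw new Error("only the top-level page may register keys");
    }
    this.keys.set(keyId, key);
  }

  // Any context may check that an identifier is known, without seeing the key.
  hasKey(keyId: string): boolean {
    return this.keys.has(keyId);
  }

  // Raw key material is only released to the trusted origin.
  exportKey(caller: Origin, keyId: string): Uint8Array {
    if (caller !== this.trustedOrigin) {
      throw new Error("key material is not exposed to this context");
    }
    const key = this.keys.get(keyId);
    if (key === undefined) throw new Error("unknown key id");
    return key;
  }
}

// A stream whose samples are readable by script only while it is not opaque.
class ModelStream {
  readonly opaque: boolean;
  private readonly samples: number[];

  constructor(samples: number[], opaque: boolean) {
    this.samples = samples;
    this.opaque = opaque;
  }

  readSamples(): number[] {
    if (this.opaque) {
      throw new Error("opaque stream: content is not accessible to script");
    }
    return this.samples.slice();
  }

  // Models the second possibility above: a stream transferred to an untrusted
  // context becomes opaque; an already opaque stream stays opaque.
  transferTo(target: Origin, trustedOrigin: Origin): ModelStream {
    return new ModelStream(this.samples, this.opaque || target !== trustedOrigin);
  }
}
```

In this model the iframe can still attach an opaque stream to a peer connection and name the key to use by identifier, but it can neither read samples nor export the key. Whether removing opacity should be possible, and for whom, is left out of the sketch.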

Potential issues:
- Text chat is not straightforward to implement: the top-level page would need to handle the input/output of text so that the iframe has access to encrypted text data only, or some new HTML construct would be needed.

3. Fully untrusted web application
Neither web applications nor intermediate nodes are trusted.

Requirements:
- The content needs to be protected from being accessed by the web application.
- The content needs to be protected from being accessed by intermediate nodes.

Potential solution:
- Apply encryption on the content that intermediate nodes cannot decrypt. The web application cannot get any access to the keys. A configuration step is required by the WebRTC layer and could rely on an IdP infrastructure.
- Let the web application call getUserMedia to produce an outgoing stream that needs to be opaque. This could rely on an IdP infrastructure and would require a dedicated UI at getUserMedia prompt time. This could also be done through some configuration settings.
- Make any incoming stream protected by double encryption opaque.
- Provide access to the encryption keys outside of the web application, for instance using an IdP infrastructure.

Potential issues:
- Text chat does not seem to be possible without some new HTML construct.
- This relies on IdP getting some adoption.
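
To compare the three models side by side, the requirements above can be encoded as a small table from which the needed mechanisms follow. This is only a reading aid under my own naming: Party, SecurityModel, MODELS and mechanismsFor are hypothetical, and "embeddedProviderCode" stands for the iframe or SDK code of use case 2.

```typescript
// Illustrative summary only; type and constant names are hypothetical.
type Party = "topLevelPage" | "embeddedProviderCode" | "intermediateNode";

interface SecurityModel {
  name: string;
  // Parties that are allowed to access the unencrypted content.
  mayAccessContent: Party[];
}

const MODELS: SecurityModel[] = [
  { name: "1. Trusted web application", mayAccessContent: ["topLevelPage", "embeddedProviderCode"] },
  { name: "2. Partially trusted web application", mayAccessContent: ["topLevelPage"] },
  { name: "3. Fully untrusted web application", mayAccessContent: [] },
];

interface Mechanisms {
  extraEncryption: boolean;        // encryption that intermediate nodes cannot decrypt
  opaqueStreams: boolean;          // some application code must not read the media
  keysOutsideApplication: boolean; // keys provisioned outside the page, e.g. via an IdP
}

// Derive which mechanisms a model calls for from who may access the content.
function mechanismsFor(model: SecurityModel): Mechanisms {
  const applicationParties: Party[] = ["topLevelPage", "embeddedProviderCode"];
  const excluded = applicationParties.filter(
    (party) => !model.mayAccessContent.includes(party),
  );
  return {
    extraEncryption: !model.mayAccessContent.includes("intermediateNode"),
    opaqueStreams: excluded.length > 0,
    keysOutsideApplication: !model.mayAccessContent.includes("topLevelPage"),
  };
}
```

All three models need the extra encryption layer; use cases 2 and 3 add opaque streams, and only use case 3 requires keys to be handled entirely outside the web application.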

Some personal thoughts:
Focusing on use case 1 (fully trusted web application) might allow making progress in this area. It has a well-defined scope and would be a building block for use cases 2 and 3.
Use case 2 has a somewhat wider scope and limited complexity. It should first be shown that opaque streams would actually be deployed, as they can cause user experience issues. For instance, in multi-party video conference scenarios, it is desirable to update the UI based on who is speaking, silence detection might help improve audio quality, and a microphone level meter is often available. These features typically rely on the application having access to the media content.
Use case 3 has a similar issue with regard to opaque streams. It would also rely heavily on IdP or a mechanism similar to IdP. It is unclear whether there is sufficient interest in that area and whether a good getUserMedia prompt UI could be designed.

It might also be beneficial to study WebRTC broadcasting and EME-like scenarios as the concept of opaque media content might prove to be useful in that context.

Received on Friday, 22 June 2018 02:56:26 UTC