What would you like to see in WebRTC next? A low-level API?

For a long time, there has been interest in extending WebRTC to allow for
more low-level control (at times called "WebRTC NV", similar to ORTC).
Recently, at TPAC 2017, with WebRTC 1.0 now in CR, we began to seriously
discuss what to do next, and there was a lot of interest in low-level ICE
controls and in QUIC-based data channels.  Since then, we have started
extension specs for QUIC and ICE, and there has been discussion around an
SCTP data channel API and WHATWG streams.

With so many options available for what to do next, it's probably a good
time to step back, look at the big picture and ask:  What do we want to
accomplish with WebRTC next, and how are we going to get there?

*Please write back to this thread and let the WG know what you'd like to
see in WebRTC next and why*.  In particular, if you expressed interest at
TPAC in QUIC or low-level ICE, please let us know why you are interested.

To answer my own question, here's where I think we should head and how we
can get there:

*I propose we build a true Low-Level WebRTC API.*

As an app developer, I want the lowest-level API I can get.  I can build
the app up from low-level primitives, or rely on libraries to do so.  This
will allow me to improve my app without waiting for standardization,
browser implementations, browser bug fixes, and browser performance
improvements.  I will be able to control my own destiny to a greater extent
and depend on a smaller browser surface area.  As a browser implementer,
this also means there is less that developers rely on me to standardize,
implement, and maintain.

Ideally, we could provide the same APIs that mobile apps get: raw network
access and low-level access to hardware encoders/decoders.  The web can't
go quite that low.  But we can get close.

*We can get a great low-level API with a few incremental improvements*

We start with an existing combination of APIs that approximates a good
low-level API: getUserMedia + WebAudio/canvas + data channels.  It's
clunky, but it's there.  For example, one can already capture audio using
getUserMedia+WebAudio, encode it with asm.js (or WebAssembly?), and send it
over the network using a data channel from PeerConnection, and then do
similarly on the receive side (a rough sketch of the sender half follows
the list below).  We can improve on the situation by:

1.  Adding low-level p2p transports (ICE or SLICE)
2.  Adding low-level media/data transports (DTLS/SCTP or QUIC)
3.  Adding low-level encoders/decoders (like Android's MediaCodec API)
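
To make "clunky, but it's there" concrete, here is a rough TypeScript
sketch of the sender half of that existing pipeline, using only APIs that
ship today.  The encodeFrame function is a placeholder for whatever
asm.js/WebAssembly codec an app would plug in; it is not a real API, and
the function names here are just illustrative.

    // Sender side: capture audio, tap raw PCM with WebAudio, run it
    // through a placeholder encoder, and ship the result over a data
    // channel.
    async function sendAudioOverDataChannel(pc: RTCPeerConnection) {
      const stream =
        await navigator.mediaDevices.getUserMedia({ audio: true });

      // Unordered, no retransmits: roughly media-friendly semantics.
      const channel = pc.createDataChannel("audio", {
        ordered: false,
        maxRetransmits: 0,
      });

      const ctx = new AudioContext();
      const source = ctx.createMediaStreamSource(stream);

      // ScriptProcessorNode is deprecated but widely available; an
      // AudioWorklet would be the modern replacement.
      const tap = ctx.createScriptProcessor(1024, 1, 1);
      tap.onaudioprocess = (event) => {
        const pcm = event.inputBuffer.getChannelData(0); // Float32 samples
        if (channel.readyState === "open") {
          channel.send(encodeFrame(pcm));
        }
      };
      source.connect(tap);
      tap.connect(ctx.destination);
    }

    // Placeholder for the asm.js/WebAssembly codec mentioned above; a
    // real app would call into a compiled encoder (e.g. Opus) here.
    function encodeFrame(pcm: Float32Array): Uint8Array {
      return new Uint8Array(pcm.slice().buffer); // no-op "encoder"
    }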

It's all incremental, things get better at each step, and after doing all
three we have a great low-level API.  For example, one could capture and
encode video using getUserMedia and hardware encoders, then send it over
the network using SCTP or QUIC directly, and then do similarly on the
receive side.  One could also tightly control which network path(s) media
is sent over at any given time.
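
To illustrate where that could land, here is a purely hypothetical
TypeScript sketch of the video send path once the three pieces above
exist.  None of these classes are shipped or even fully specified; the
names (LowLevelIceTransport, LowLevelQuicTransport, LowLevelVideoEncoder)
are made up for this example, loosely inspired by the draft ICE/QUIC
extension work and Android's MediaCodec.

    // Hypothetical low-level objects, declared here only so the sketch
    // is self-contained.  These do not exist in any browser today.
    declare class LowLevelIceTransport {
      start(remoteParameters: object): void;
      addRemoteCandidate(candidate: RTCIceCandidateInit): void;
    }
    declare class LowLevelQuicTransport {
      constructor(ice: LowLevelIceTransport);
      createStream(): { write(data: Uint8Array): void };
    }
    declare class LowLevelVideoEncoder {
      constructor(config: { codec: string; hardwareAcceleration: boolean });
      encode(frame: ImageBitmap): Promise<Uint8Array>;
    }

    // The app, not the browser, decides which ICE transport (and
    // therefore which network path) carries the media at any given time.
    async function sendVideoLowLevel(
      ice: LowLevelIceTransport,
      track: MediaStreamTrack
    ) {
      const quic = new LowLevelQuicTransport(ice);
      const stream = quic.createStream();
      const encoder = new LowLevelVideoEncoder({
        codec: "vp8",
        hardwareAcceleration: true,
      });

      // ImageCapture is a real (if not universally implemented) API for
      // pulling frames from a camera track.
      const capture = new ImageCapture(track);
      while (track.readyState === "live") {
        const frame = await capture.grabFrame();
        stream.write(await encoder.encode(frame));
      }
    }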

That's what I would like to see us pursue.  But again, please write back to
this thread and let the WG know what you'd like to see.

Received on Tuesday, 23 January 2018 22:56:06 UTC