Re: Raw data API - 6 - Encoders/decoders

From: Peter Thatcher <pthatcher@google.com>
Date: Thu, 14 Jun 2018 02:55:12 -0700
Message-ID: <CAJrXDUE8US=Rj=rCWDnDxh6UfgtTyfwr845qjShnd1UQ6T0ExA@mail.gmail.com>
To: Sergio Garcia Murillo <sergio.garcia.murillo@gmail.com>
Cc: public-webrtc@w3.org
On Thu, Jun 14, 2018 at 1:17 AM Sergio Garcia Murillo <
sergio.garcia.murillo@gmail.com> wrote:

> On 14/06/2018 4:35, Peter Thatcher wrote:
>> Yes, that's a valid concern.  But that concern is the same as for any time
>> the app gets in the media flow, such as when it does processing for funny
>> hats or adds encryption.  It might take some implementation experience to
>> figure out where the real perf issues are.
>> [...]
>> If the app can't process things per-frame, then you might as well just go
>> back to having an RtpSender like in ORTC that doesn't require the app to do
>> anything on the media path.
>> [...]
>> But that requires every piece to be implemented by the browser, and I
>> don't see how that buys us much vs. an ORTC-style RtpSender.   It would be
>> much better if we could find a way to make wasm/js performant in the media
>> path.
>> [...]
>> It would be easy to make the encoders and decoders use WHATWG streams
>> instead of events.  I just don't see the benefit of having an encoder
>> stream tied to a transport stream with no app in between except plugging it
>> together and then hoping that it will be performant because we expect a
>> sufficiently smart browser implementation.
>
> I think we have a major disagreement here. Your assumption is that WebRTC
> developers want the low-level APIs because they want to do funny things
> with the raw data.

My focus is not on doing funny things with raw data.  In fact, the purpose
of proposing the encoders and decoders the way I have is so that apps can
have control over media flow without needing to do anything with raw data.
So it's kind of the opposite.

> IMHO that is not true.
> WebRTC developers just hate SDP, dislike transceivers, senders and
> receivers, and trying to get the current APIs to work is a pain even for the
> simple cases. ORTC is just a bit better, but not much, with the encoding
> parameters stuff being just a JSON version of an m-line, not much
> improvement there.

"WebRTC developers" are not a homogeneous group.  I know of many that have
things they are doing with mobile/native/server endpoints that they can't
do with web endpoints because the Web API isn't sufficiently capable.

We started ORTC years ago to try and provide something better.  Our
experience over the last few years with more mobile and server endpoints
has taught us lessons of how ORTC can be improved upon.   So while I think
it's a good starting point for WebRTC NV, I think we can do even better.
In particular, two things: 1.  Split RtpSender into encoder and transport
(and similar for RtpReceiver) and 2. have more control over ICE.   There
are others, but those are the main ones to me.
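
To make the first point concrete, here is a rough sketch of what splitting RtpSender into an encoder and a transport could look like from the app's side. Every name in it (SketchEncoder, SketchTransport, connect, EncodedFrame) is made up for illustration and is not taken from ORTC, WebRTC 1.0, or any proposal; the "encoder" does no real encoding. It only shows the shape: the app sits between the two pieces and steers the media flow using frame metadata, without touching the payload.

```typescript
// Hypothetical names throughout; none of these come from a spec or a browser API.

// An encoded frame as the app sees it: metadata plus an opaque payload.
interface EncodedFrame {
  id: number;
  isKeyFrame: boolean;
  payload: Uint8Array; // opaque to the app; never inspected below
}

// Stand-in for a browser-provided encoder that hands encoded frames to the app.
class SketchEncoder {
  private nextId = 0;
  onframe: ((frame: EncodedFrame) => void) | null = null;

  // Pretend to encode one raw frame; every `keyInterval`-th frame is a key frame.
  encode(raw: Uint8Array, keyInterval = 3): void {
    const frame: EncodedFrame = {
      id: this.nextId,
      isKeyFrame: this.nextId % keyInterval === 0,
      payload: raw, // a real encoder would compress here
    };
    this.nextId += 1;
    if (this.onframe) this.onframe(frame);
  }
}

// Stand-in for a browser-provided transport; it only records what it was given.
class SketchTransport {
  readonly sent: number[] = [];
  send(frame: EncodedFrame): void {
    this.sent.push(frame.id);
  }
}

// The app sits between the two and decides what flows, using metadata only.
function connect(
  encoder: SketchEncoder,
  transport: SketchTransport,
  shouldSend: (frame: EncodedFrame) => boolean,
): void {
  encoder.onframe = (frame) => {
    if (shouldSend(frame)) transport.send(frame);
  };
}

// Example policy: while "congested", forward key frames only.
const encoder = new SketchEncoder();
const transport = new SketchTransport();
let congested = false;
connect(encoder, transport, (f) => !congested || f.isKeyFrame);

for (let i = 0; i < 6; i++) {
  if (i === 2) congested = true; // congestion starts at the third frame
  encoder.encode(new Uint8Array([i]));
}
```

With this policy, frames 0 and 1 go out before congestion starts, and after that only the key frame (frame 3) is forwarded. An app that does not want this level of control could pass a policy that always returns true.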

> So, after failing to provide a high level API that is usable, developers
> are requesting a lower level API with a sane API, not because they all want
> to deal with raw rtp packets, do funny hats or send rtcp, but because we
> want to use something that is simpler, more fine grained and with a better
> control surface. Some of them may want to also perform raw operations in
> one way or another, so we should also support those use cases.

That's the key: finding the right balance between allowing apps to do
low-level control when they need to and not requiring them to do so when
they don't want to.  It's a difficult balance to find, which is why so much
of my effort, and my presentation time at the f2f, is focused on use cases
and finding this balance.

> In that regard, using WHATWG streams provides a simple API surface
> with finer-grained control of each of the components in the pipeline. The
> same stream API will most probably be used in other areas of the browser as
> well, so web developers will already have knowledge about it. Also, it
> allows implementing the raw use cases that we were trying to implement;
> some of them (custom encoders, funny hats) are even explicitly mentioned in
> their documentation as a target of the API.

Discussing WHATWG streams at this point is like discussing promises in the
1.0 API.  It's an important discussion to have, but it's not the high order
bit, and it's mostly orthogonal.  The high order bit is what
objects/components we will have and how low-level or high-level the
API should be (how much the app can do, how much the app has to do).
Whatever choices we make with the high-order bits can work with or without
WHATWG streams.

> Does this API provide any advantage over the ORTC-style RtpSender if we
> just don't want to do any raw processing? Yes, I would even rather re-implement
> a standard RTP stack in JS than have to use the ORTC sender/receiver stuff.

I'm curious what you think is wrong with the ORTC RtpSender and
RtpReceiver.  Have you used it?  What was difficult?  You said something
about them being like SDP m-lines, but I don't see how that's the case.

> Last point, the WHATWG group has been working on the API for years now, so
> we can leverage their work and cut down the time to market of the WebRTC
> NV API.

Perhaps, but that's not the high-order bit.  We have important questions to
answer first.

> Let's put it the other way, what are the benefits of your API compared to
> the whatwg streams?

The components I'm proposing can be WHATWG stream-ish or not.  It's
orthogonal.  My hope is that at the f2f we can come to agreement on the use
cases, the requirements, the low-level-ness, and the components we want so
we can move the discussion forward to the point where it makes sense to
discuss whether to use WHATWG streams or not.

> Best regards
> Sergio
Received on Thursday, 14 June 2018 09:56:21 UTC