Re: Raw data API - 6 - Encoders/decoders from Sergio Garcia Murillo on 2018-06-14 (public-webrtc@w3.org from June 2018)

From: Sergio Garcia Murillo <sergio.garcia.murillo@gmail.com>
Date: Thu, 14 Jun 2018 10:18:34 +0200
To: Peter Thatcher <pthatcher@google.com>
Cc: public-webrtc@w3.org
Message-ID: <e38d6f4a-297f-a4c0-a95a-0c0ac77f2615@gmail.com>

On 14/06/2018 4:35, Peter Thatcher wrote:
> Yes, that's a valid concern.  But that concern is the same as for any 
> time the app gets in the media flow, such as when it does processing 
> for funny hats or adds encryption.  It might take some implementation 
> experience to figure out where the really perf issues are.
>  [...]
>  If the app can't process things per-frame, then you might as well 
> just go back to having an RtpSender like in ORTC that doesn't require 
> the app to do anything on the media path.
>  [...]
> But that requires every piece to be implemented by the browser, and I 
> don't see how that buys us much vs. an ORTC-style RtpSender.   It 
> would be much better if we could find a way to make wasm/js performant 
> in the media path.
> [...]
> It would be easy to make the encoders and decoders use WHATWG streams 
> instead of events.  I just don't see the benefit of having an encoder 
> stream tied to a transport stream with no app in between except 
> plugging it together and then hoping that it will be performant 
> because we expect a sufficiently smart browser implementation.

I think we have a major disagreement here. Your assumption is that 
WebRTC developers want the low-level APIs because they want to do funny 
things with the raw data. IMHO that is not true.

WebRTC developers just hate SDP and dislike transceivers, senders and 
receivers, and trying to get the current APIs to work is a pain even 
for the simple cases. ORTC is a bit better, but not much: the encoding 
parameters are just a JSON version of an m-line, so there is not much 
improvement there.

So, after we failed to provide a high-level API that is usable, 
developers are requesting a lower-level API with a sane design, not 
because they all want to deal with raw RTP packets, do funny hats or 
send RTCP, but because we want to use something that is simpler, more 
fine-grained and with a better control surface. Some of them may also 
want to perform raw operations in one way or another, so we should 
support those use cases as well.

In that regard, using WHATWG streams provides a simple API surface with 
finer-grained control of each of the components in the pipeline. The 
same stream API will most probably be used in other areas of the 
browser as well, so web developers will already know it. It also allows 
implementing the raw use cases we were trying to support; some of them 
(custom encoders, funny hats) are even explicitly mentioned in the 
WHATWG documentation as targets of the API.
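
As a rough illustration of the pipeline idea above, here is a minimal 
sketch using only the standard WHATWG Streams classes (ReadableStream, 
TransformStream, WritableStream). Everything else is hypothetical and 
not part of any WebRTC or WHATWG specification: the RawFrame and 
EncodedFrame shapes, and the frameSource, funnyHats, fakeEncoder and 
fakeTransport helpers, are placeholders that only show how the stages 
would be plugged together, with or without an app stage in between.

```typescript
// Hypothetical frame shapes, invented for this sketch only.
interface RawFrame { timestamp: number; data: Uint8Array; }
interface EncodedFrame { timestamp: number; payload: Uint8Array; }

// Stand-in for a capture source: emits the given frames, then closes.
function frameSource(frames: RawFrame[]): ReadableStream<RawFrame> {
  return new ReadableStream<RawFrame>({
    start(controller) {
      for (const frame of frames) controller.enqueue(frame);
      controller.close();
    },
  });
}

// Optional app stage ("funny hats"): here it just inverts every byte.
function funnyHats(): TransformStream<RawFrame, RawFrame> {
  return new TransformStream<RawFrame, RawFrame>({
    transform(frame, controller) {
      controller.enqueue({
        timestamp: frame.timestamp,
        data: frame.data.map((b) => 255 - b),
      });
    },
  });
}

// Placeholder encoder: copies the bytes instead of compressing them.
function fakeEncoder(): TransformStream<RawFrame, EncodedFrame> {
  return new TransformStream<RawFrame, EncodedFrame>({
    transform(frame, controller) {
      controller.enqueue({
        timestamp: frame.timestamp,
        payload: frame.data.slice(),
      });
    },
  });
}

// Placeholder transport: records what would have been sent.
function fakeTransport(sent: EncodedFrame[]): WritableStream<EncodedFrame> {
  return new WritableStream<EncodedFrame>({
    write(chunk) {
      sent.push(chunk);
    },
  });
}

// Plug the stages together; the app stage is present only if requested.
async function runPipeline(
  frames: RawFrame[],
  withHats: boolean,
): Promise<EncodedFrame[]> {
  const sent: EncodedFrame[] = [];
  let stream = frameSource(frames);
  if (withHats) stream = stream.pipeThrough(funnyHats());
  await stream.pipeThrough(fakeEncoder()).pipeTo(fakeTransport(sent));
  return sent;
}
```

An app that does no raw processing would call runPipeline(frames, false) 
and never touch a frame itself; one that does would only add its own 
TransformStream to the chain.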

Does this API provide any advantage over the ORTC-style RtpSender if we 
just don't want to do any raw processing? Yes. I would even rather 
re-implement a standard RTP stack in JS than have to use the ORTC 
sender/receiver stuff.

Last point: the WHATWG group has been working on the API for years now, 
so we can leverage their work and cut down the time to market of the 
WebRTC NV API.

Let's put it the other way: what are the benefits of your API compared 
to WHATWG streams?

Best regards
Sergio

Received on Thursday, 14 June 2018 08:17:58 UTC