
Re: Raw data API - 6 - Encoders/decoders

From: T H Panton <thp@westhawk.co.uk>
Date: Thu, 14 Jun 2018 10:50:37 +0100
Message-Id: <79D10AD0-F17C-4C01-BA38-28AA38584904@westhawk.co.uk>
Cc: Sergio Garcia Murillo <sergio.garcia.murillo@gmail.com>, Peter Thatcher <pthatcher@google.com>, public-webrtc@w3.org
To: Lorenzo Miniero <lorenzo@meetecho.com>


> On 14 Jun 2018, at 09:55, Lorenzo Miniero <lorenzo@meetecho.com> wrote:
> 
> On Thu, 14 Jun 2018 10:18:34 +0200
> Sergio Garcia Murillo <sergio.garcia.murillo@gmail.com> wrote:
> 
>> On 14/06/2018 4:35, Peter Thatcher wrote:
>>> Yes, that's a valid concern.  But that concern is the same as for
>>> any time the app gets in the media flow, such as when it does
>>> processing for funny hats or adds encryption.  It might take some
>>> implementation experience to figure out where the real perf
>>> issues are. [...]
>>>  If the app can't process things per-frame, then you might as well 
>>> just go back to having an RtpSender like in ORTC that doesn't
>>> require the app to do anything on the media path.
>>>  [...]
>>> But that requires every piece to be implemented by the browser, and
>>> I don't see how that buys us much vs. an ORTC-style RtpSender.   It 
>>> would be much better if we could find a way to make wasm/js
>>> performant in the media path.
>>> [...]
>>> It would be easy to make the encoders and decoders use WHATWG
>>> streams instead of events.  I just don't see the benefit of having
>>> an encoder stream tied to a transport stream with no app in between
>>> except plugging it together and then hoping that it will be
>>> performant because we expect a sufficiently smart browser
>>> implementation.  
>> 
>> I think we have a major disagreement here. Your assumption is that 
>> WebRTC developers want the low-level APIs because they want to do 
>> funny things with the raw data. IMHO that is not true.
>> 
>> WebRTC developers just hate SDP and dislike transceivers, senders and 
>> receivers; getting the current APIs to work is a pain even for
>> the simple cases. ORTC is just a bit better, but not much: the
>> encoding parameters stuff is just a JSON version of an m-line,
>> not much improvement there.
>> 
>> So, after failing to provide a usable high-level API, developers
>> are requesting a lower-level API that is sane, not
>> because they all want to deal with raw RTP packets, do funny hats or
>> send RTCP, but because they want something that is simpler, more
>> fine-grained and with a better control surface. Some of them may
>> also want to perform raw operations in one way or another, so we
>> should support those use cases as well.
>> 
> 
> 
> +1 on all of the above...
> 


I've been doing some work with WebAudio recently, where there is a 
nice selection of pluggable pre-built transforms plus the ability to code your own 
in JavaScript. 

My experience is that it is surprisingly hard to get the results you want, reliably,
from the code-your-own transforms: you need to know quite a lot about the rest of the pipeline
(timing, sample rate, jitter buffers, memory allocation, echo cancellers). By contrast,
the pre-builts generally fit together reasonably well.
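
To make that contrast concrete, here is a deliberately toy sketch (the graph uses
standard WebAudio interfaces; `applyGain` is an invented illustrative kernel, not
anything from a spec). The pre-built graph composes in a couple of lines, while the
code-your-own path means writing the per-block kernel yourself:

```javascript
// Pre-built WebAudio nodes compose as a graph (browser-only, so shown
// commented out for shape):
//   const ctx = new AudioContext();
//   const filter = new BiquadFilterNode(ctx, { type: 'lowpass', frequency: 800 });
//   const gain = new GainNode(ctx, { gain: 0.5 });
//   source.connect(filter).connect(gain).connect(ctx.destination);

// Code-your-own means writing the per-block kernel that would run
// inside an AudioWorkletProcessor's process() callback, where you
// must avoid per-call allocation and respect the render quantum size.
function applyGain(input, output, gain) {
  for (let i = 0; i < input.length; i++) {
    output[i] = input[i] * gain;
  }
  return output;
}
```

The kernel is trivial; what is hard is everything around it, which is rather the point.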

The other thing to note from WebAudio is that what makes the biggest difference is the
consistency of the API design, followed by the developer tooling - WebAudio would be
much harder to use without Firefox's WebAudio visualization tool in the console. 

WebAudio isn't without its faults - but those are mostly in the quality of the implementations,
not the API or tools.

 

> 
>> In that regard, using WHATWG streams provides a simple API
>> surface with finer-grained control of each of the components in the
>> pipeline. The same stream API will most probably be used in other
>> areas of the browser as well, so web developers will already have
>> knowledge of it. Also, it allows implementing the raw use cases
>> that we were trying to implement; some of them (custom encoders,
>> funny hats) are even explicitly mentioned in its documentation as a
>> target of the API.
>> 
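
A minimal sketch of what that pipe-it-yourself model could look like (all names
here are invented for illustration, not a proposed API): the app supplies an
ordinary WHATWG TransformStream that sits between the encoder's output and the
transport's input and processes one frame at a time.

```javascript
// Sketch only: `processFrame` and `makeFrameTransform` are invented
// names, not part of any spec. The point is that per-frame app code
// is just a standard WHATWG TransformStream.

// Per-frame processing: here we merely tag each encoded frame with a
// sequence number; a real app might re-packetize or encrypt instead.
function processFrame(frame, seq) {
  return { ...frame, seq };
}

// Wrap it as a TransformStream so it can be piped between the
// encoder's readable side and the transport's writable side.
function makeFrameTransform() {
  let seq = 0;
  return new TransformStream({
    transform(frame, controller) {
      controller.enqueue(processFrame(frame, seq++));
    },
  });
}
```

The app would then wire it up with something like
`encoder.readable.pipeThrough(makeFrameTransform()).pipeTo(transport.writable)`,
where `encoder` and `transport` stand in for whatever objects the eventual API
exposes.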
>> Does this API provide any advantage over the ORTC-style RtpSender if
>> we just don't want to do any raw processing? Yes; I would rather
>> re-implement a standard RTP stack in JS than have to use the ORTC
>> sender/receiver stuff.
>> 
> 
> 
> -1 on an RTP stack in JS :-P
> 

-100  on an RTP stack in JS

I've actually written an RTP + audio stack in a garbage-collected language.
Let's not go there.

An increasing number of calls will be between WebRTC browsers and non-browser endpoints:
doorbells, conference systems, drones, etc. We need to avoid assuming that high-performance
JavaScript engines will be available at both ends.


> I think Peter answered that with the "requires every piece to be
> implemented by the browser" argument, which is not a good enough reason
> for me. I understand that maintaining a WebRTC stack in C/C++ is
> complicated, but it's worth it. WebRTC hid most of the lowest level
> stuff (e.g., sending and receiving UDP) within the browser for many
> reasons (like preventing DoS attacks originated by JavaScript), and all
> this wasm nonsense to do everything in JavaScript brings us several
> steps back, meaning that at the very least those same arguments should
> be dealt with once more before anything happens.


The strength of WebRTC (compared to softphones) is the possibility of integrating with
other web APIs - WebAudio, WebGL, WebUSB, OAuth, etc. 
Whenever we needlessly strike out on our own, we alienate huge numbers of web developers.



> 
> Of course, just my two cents,
> Lorenzo

My 1cent.


> 
> -- 
> I'm getting older but, unlike whisky, I'm not getting any better
> https://twitter.com/elminiero

Received on Thursday, 14 June 2018 09:51:22 UTC
