W3C home > Mailing lists > Public > public-webrtc@w3.org > May 2018

Re: Raw data API - 3 - Encoded data

From: Randell Jesup <randell-ietf@jesup.org>
Date: Wed, 30 May 2018 15:49:43 -0400
To: public-webrtc@w3.org
Message-ID: <180a9c4b-77fe-ffba-54d4-7d3ddd546172@jesup.org>
On 5/30/2018 12:29 PM, Peter Thatcher wrote:
> This gets to my question I want to cover at the f2f (and which we now 
> have agenda time for): how low-level an API do we want?  With a 
> low-level RTP transport object, a JS/wasm app could implement a 
> high-level RtpSender on top.  But is that the right balance between 
> flexibility and ease of use?  How much do we want to rely on libraries 
> to fill in the higher-level stuff?
> I feel like providing low-level components and letting the js/wasm 
> libraries fill in the higher-level stuff is the right way to go for 
> most things, but finding the right line is a bunch of tradeoffs and I 
> think there's a lot for the WG to consider.

Another thing to consider (and this applies to a lot of the discussions 
going on for the f2f): JS isn't realtime.  Not even close.  When you 
look at profiles like I do, you see JS pauses for GC/CC, 
layout/rendering, other things jumping in, etc.  When one page leaks 
(I'm looking at you, adsafeprotected.com (anti-clickfraud provider) - 
though I poked them hard and they finally fixed it), you can see LONG 
GC/CC times: hundreds of milliseconds, or even seconds.
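To put a number on what a pause like that costs a media pipeline: at 60 fps each frame has roughly a 16 ms budget, so even a modest blocking stretch eats several frames.  A self-contained sketch (the busy-wait stands in for a GC/CC or layout pause; the numbers are illustrative, not measured from a real page):

```javascript
// Block the thread, standing in for a long GC/CC or layout pause.
function busyWait(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) { /* spin */ }
}

const frameBudgetMs = 16; // ~60 fps
const start = Date.now();
busyWait(120);            // a "short" pause by leaky-page standards
const elapsedMs = Date.now() - start;

// Every full frame budget spent blocked is a frame the pipeline missed.
const framesMissed = Math.floor(elapsedMs / frameBudgetMs);
console.log(framesMissed >= 7); // true: 120 ms costs at least 7 frames
```

And a multi-hundred-millisecond GC pause, like the ones above, costs dozens of frames in one shot.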

Right now WebRTC is so isolated from JS land (at least in Firefox) that 
you can have a call continue for 10 minutes after the main thread has 
deadlocked (until consent-refresh kills you, for example).  Push tons of 
stuff into JS, and you're going to majorly destabilize timing, 
latency, and framerate.  If you push most of this off into Workers 
running WASM, you may avoid the random latencies, but at some real cost 
in making everything multithreaded to the extreme.  And even if you can 
avoid adding delay or jitter to the media, you're going to use more 
RAM and CPU, especially on mobile devices.  WASM is good, true, but it's 
not a panacea.
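One way to keep the per-frame cost of that Worker handoff down is to transfer the underlying buffer instead of copying it.  A minimal sketch of the detach semantics (using structuredClone here so it runs standalone; in a browser, postMessage with a transfer list behaves the same way):

```javascript
// A stand-in for one encoded frame's payload.
const frame = new ArrayBuffer(1024);

// Transfer ownership instead of copying. In a browser you'd write
// worker.postMessage({ frame }, [frame]); structuredClone with the
// transfer option has the same detach semantics.
const moved = structuredClone(frame, { transfer: [frame] });

// The receiving side owns the bytes; the sender's view is detached,
// so no second copy of the frame ever exists.
console.log(moved.byteLength); // 1024
console.log(frame.byteLength); // 0 (detached)
```

That avoids copies, but it doesn't make the threading model simpler - you still have to reason about who owns which frame at every point in the pipeline.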

Sure, it's cool to be able to do this in JS... but do you need to? What 
are you buying, and what does it cost you?  What's the use case you're 
fulfilling that can't be handled today, and are there other options?  
For some we'll have good answers and good reasons - but let's enumerate 
them and make sure.  I'm certain that for others we don't have 
sufficient justification.

Also, an aspect that came up repeatedly during the early webrtc 
discussions: pushing stuff into libraries has negative impacts too. How 
many ancient versions of jQuery, each with its own random bugs, are out 
there?  Stuff built into the browser can be updated and work better, and 
users will benefit.  Major sites may be smart about updating 
dependencies, but lots of others won't, and may never realize why their 
audio drops out randomly (due to a library bug).  I realize there are 
counterarguments - but we should make sure we're really doing something 
better for users (and developers), and not just creating a hydra.

Trust me, hydras are not good.  :-)

     Randell Jesup, Mozilla

> On Tue, May 29, 2018 at 3:24 AM Sergio Garcia Murillo 
> <sergio.garcia.murillo@gmail.com 
> <mailto:sergio.garcia.murillo@gmail.com>> wrote:
>     I fail to see what the benefits would be of adding the media frame
>     API to the RTP objects, especially if we intend to provide a
>     lower-level raw RTP API.
>     In order to implement this API in the browser, the codec
>     packetization of the encoded stream must be known to the browser.
>     Also, it is not possible to modify the frame (for example to apply
>     end-to-end media encryption), as packetization requires access to
>     the raw data, or to add metadata, for the very same reason.
>     If the API would only allow forwarding frames from an
>     encoder/decoder (or another media source/sink) to the rtp objects,
>     I would prefer a higher-level API that deals with streams and not
>     individual packets.
>     Best regards
>     Sergio
>     On 29/05/2018 8:30, Harald Alvestrand wrote:
>>     *Access to encoded streams*
>>     A similar interface can be added to RTPSender and RTPReceiver,
>>     respectively (and similar APIs for other transports, when defined).
>>     Here the buffers would contain encoded video / audio data, and
>>     the control block parameters would have to have enough
>>     information to let the RTP packet headers be constructed - or, on
>>     the receiver, the info from the RTP packet headers be represented.
>>     partial interface RTPSender {
>>        Promise<EncodedBuffer> injectData(EncodedBuffer data);
>>     };
>>     partial interface RTPReceiver {
>>        Promise<EncodedBuffer> extractData(EncodedBuffer data);
>>     };
>>     interface EncodedBuffer : Buffer {
>>        attribute long rtpTimestamp;
>>        attribute long frameId;
>>        attribute FrozenArray<long> dependsOnFrames;
>>        // more fields TBD
>>     };
>>     On this interface, frames do have interdependencies, so dropping
>>     packets is much more problematic. The “dependsOnFrames” member is
>>     intended to help decide on sensible handling - it would tell
>>     the other side of the API that “if you dropped one of these
>>     frames, you might as well drop this frame too”.
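A minimal sketch of how the proposed dependsOnFrames member could drive that drop decision on the consuming side of the API (the frame objects here are hypothetical stand-ins mirroring the fields in the interface above, not a real browser API):

```javascript
// If any frame this one depends on was already dropped, propagate
// the drop: the decoder couldn't use this frame anyway.
function shouldDrop(frame, droppedFrameIds) {
  return frame.dependsOnFrames.some(id => droppedFrameIds.has(id));
}

const droppedFrameIds = new Set([17]);

// Frame 20 depends on dropped frame 17: drop it too.
console.log(shouldDrop({ frameId: 20, dependsOnFrames: [17, 19] },
                       droppedFrameIds)); // true

// Frame 21 only depends on frame 19, which arrived: keep it.
console.log(shouldDrop({ frameId: 21, dependsOnFrames: [19] },
                       droppedFrameIds)); // false
```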

Randell Jesup -- rjesup a t mozilla d o t com
Please please please don't email randell-ietf@jesup.org!  Way too much spam
Received on Wednesday, 30 May 2018 19:52:32 UTC
