- From: Peter Thatcher <pthatcher@google.com>
- Date: Mon, 21 May 2018 15:26:10 -0700
- To: Sergio Garcia Murillo <sergio.garcia.murillo@gmail.com>
- Cc: public-webrtc@w3.org
- Message-ID: <CAJrXDUGUQ_YPosT72fWyZ6GSTpRY5QFSGApNwFpbWFZ1HdROxg@mail.gmail.com>
On Mon, May 21, 2018 at 2:42 PM Sergio Garcia Murillo <
sergio.garcia.murillo@gmail.com> wrote:

> On 21/05/2018 23:10, Peter Thatcher wrote:
>
> On Mon, May 21, 2018 at 1:54 PM Harald Alvestrand <harald@alvestrand.no>
> wrote:
>
>> On 05/21/2018 08:35 PM, Peter Thatcher wrote:
>>
>> On Sun, May 20, 2018 at 9:08 PM Harald Alvestrand <harald@alvestrand.no>
>> wrote:
>>
>>> On 05/18/2018 11:33 PM, Peter Thatcher wrote:
>>> > By the way, this is my use case: as a web/mobile developer, I want to
>>> > send media over QUIC for my own replacement for RTP. QUIC isn't the
>>> > replacement for RTP, but my own protocol on top of QUIC (or SCTP, or
>>> > any data channel) is the replacement for RTP. Luckily, this is
>>> > "free" as soon as you add QUIC data channels and split the RtpSender
>>> > into encoder and transport. That's all I need.
>>>
>>> Peter,
>>>
>>> since I'm asking this of others:
>>>
>>> can you please expand this into a use case - describe a function or
>>> application that is possible / easy when you have that functionality
>>> available, and is impossible / hard to do today?
>>
>> I have a few:
>>
>> - Terminating media (audio, video, data) on a server is a pain in the
>> neck with DTLS/RTP/RTCP/SCTP. I would like that to be much easier.
>> Sending a serialized frame object (say, CBOR or protobuf) over a QUIC
>> stream is way easier to terminate.
>>
>> Pushing it up one more layer: you want a use case where you send media
>> to a server, and because the server architecture is such that
>> DTLS/RTP/RTCP/SCTP is a pain to terminate, and you can't change the
>> server architecture, you want a different wrapping of the data?
>
> Yep.
>
> You can do it also with WebSocket easily, so imho the interesting part
> of this use case is not how you would send the encoded frame, but the
> audio/video encoder part. In that regard we have to take care and check
> what is already specified in MSE/EME, as we may have several elements
> in common, or very close.

WebSockets require TCP. That's a complete non-starter for RTC. But I'm
assuming QUIC data channels will happen, so I agree with you that the
interesting discussion is around encoders/decoders.

>> - Including real-time data that's synchronized with audio and video, or
>> including metadata about a particular audio or video frame, is much
>> easier if I can just attach it to a serialized frame object as
>> mentioned above.
>>
>> I can think of some example use cases here - subtitles are one of them.
>> Do you want to expand with some more specific ones?
>
> Some ideas:
>
> - Something like "this frame is interesting" computed from the sender.
> Could be a reason to do something special on a server (switch users in a
> conference, start recording, ...).
> - Attaching a replacement for video onto audio frames, like info
> necessary to draw an avatar.
> - An SFU provides metadata about the person speaking when it switches in
> a virtual stream of the current speaker.
>
> In IoT it is quite common to have telemetry that needs to be in sync
> with the media. Note that in this use case the metadata should be
> attached to the media frame before being encoded, not after.

Why does it have to be attached before encoding?
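To make the serialized-frame idea above concrete, here is a minimal sketch of one way an encoded frame plus app metadata could be framed onto a QUIC stream. The `MediaFrame` shape and the `QuicStreamWriter` interface are invented for illustration (nothing here is an existing or proposed browser API), and a real implementation would likely use CBOR or protobuf rather than a JSON header:

```typescript
// Hypothetical frame object: encoder output plus app-defined metadata.
interface MediaFrame {
  kind: "audio" | "video";
  timestampUs: number;                 // capture time, microseconds
  keyFrame: boolean;
  metadata?: Record<string, unknown>;  // e.g. { interesting: true }
  payload: Uint8Array;                 // encoded frame from the codec
}

// Stand-in for whatever write() a QUIC stream / data channel exposes.
interface QuicStreamWriter {
  write(bytes: Uint8Array): Promise<void>;
}

async function sendFrame(
  stream: QuicStreamWriter,
  frame: MediaFrame
): Promise<void> {
  // Serialize everything except the payload as a small header.
  const header = new TextEncoder().encode(
    JSON.stringify({
      kind: frame.kind,
      timestampUs: frame.timestampUs,
      keyFrame: frame.keyFrame,
      metadata: frame.metadata ?? {},
    })
  );
  // Length-prefix the header and payload so a server can parse the
  // stream with a trivial read loop.
  const out = new Uint8Array(8 + header.length + frame.payload.length);
  const view = new DataView(out.buffer);
  view.setUint32(0, header.length);
  view.setUint32(4, frame.payload.length);
  out.set(header, 8);
  out.set(frame.payload, 8 + header.length);
  await stream.write(out);
}
```

A server terminating this only needs a length-prefixed parse loop rather than a full DTLS/RTP/RTCP/SCTP stack, which is the point of the first use case above.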
>> - Taking more direct control over stream control signaling (mute state,
>> key frame requests, end-of-stream), etc. is much easier if the control
>> plane is controlled by the application and can be integrated with the
>> flow of media, unlike today with RTCP (not under control of the app) or
>> data channels (not integrated with the flow of media).
>>
>> What's the application that would need more direct control over the
>> stream's state?
>
> An SFU might want to say to a receiver "there is no audio here right
> now; no comfort noise, no nothing; don't mix it until you hear
> differently from me" instead of having a jitter buffer actively mixing
> silence for all silent streams.
>
> RTCP PAUSED indication. I have tried a couple of times to get it
> supported already.

Which I have to wait until all browsers support. And then, when I have a
use case that's not quite supported by RTCP PAUSED, I have to go get an
extension standardized in the IETF, and then get it supported in all the
browsers. With a low-level control API, I wouldn't have this problem.

>> Is this a requirement that would equally well be satisfied by a direct
>> API to the RTCP?
>
> Yes, especially if new RTCP message types can be added.
>
> The only benefit of using RTCP is interoperability. I don't see any
> benefit of adding custom app RTCP messages compared to sending an app
> message via a data channel or adding metadata to the media frames.

I agree that I'd rather not use RTCP and make my own protocol. But if
someone else wants to use RTCP, I'm fine with them doing so.

> Best regards
>
> Sergio
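In the same spirit, here is a sketch of what an application-defined control plane could look like if it rides the same transport as the media. The message shapes and the `Mixer`/`Encoder` interfaces are invented for illustration; nothing here is standardized:

```typescript
// Hypothetical app-level control messages replacing fixed RTCP semantics.
type ControlMessage =
  | { type: "pause"; streamId: string; reason: "silence" | "mute" }
  | { type: "resume"; streamId: string }
  | { type: "keyFrameRequest"; streamId: string };

interface Mixer {
  setActive(streamId: string, active: boolean): void;
}

interface Encoder {
  forceKeyFrame(): void;
}

function handleControl(
  msg: ControlMessage,
  mixer: Mixer,
  encoders: Map<string, Encoder>
): void {
  switch (msg.type) {
    case "pause":
      // The SFU says the stream is truly silent: skip it entirely
      // instead of letting the jitter buffer mix silence.
      mixer.setActive(msg.streamId, false);
      break;
    case "resume":
      mixer.setActive(msg.streamId, true);
      break;
    case "keyFrameRequest":
      // App-level analogue of an RTCP PLI/FIR.
      encoders.get(msg.streamId)?.forceKeyFrame();
      break;
  }
}
```

Adding a new control message is then just another member of the union, rather than an IETF extension plus waiting for support in every browser.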
Received on Monday, 21 May 2018 22:26:51 UTC