Re: Use cases / requirements for raw data access functions from Peter Thatcher on 2018-05-21 (public-webrtc@w3.org from May 2018)

From: Peter Thatcher <pthatcher@google.com>
Date: Mon, 21 May 2018 14:10:12 -0700
To: Harald Alvestrand <harald@alvestrand.no>
Cc: public-webrtc@w3.org
Message-ID: <CAJrXDUE4h3iNbkXwVQb+R_3+Zu9u9G9_6dBvWV-JNHOG9SfWAw@mail.gmail.com>
On Mon, May 21, 2018 at 1:54 PM Harald Alvestrand <harald@alvestrand.no>
wrote:

> On 05/21/2018 08:35 PM, Peter Thatcher wrote:
>
> On Sun, May 20, 2018 at 9:08 PM Harald Alvestrand <harald@alvestrand.no>
> wrote:
>
>> On 05/18/2018 11:33 PM, Peter Thatcher wrote:
>> > By the way, this is my use case: as a web/mobile developer, I want to
>> > send media over QUIC for my own replacement for RTP.  QUIC isn't the
>> > replacement for RTP, but my own protocol on top of QUIC (or SCTP, or
>> > any data channel) is the replacement for RTP.    Luckily, this is
>> > "free" as soon as you add QUIC data channels and split the RtpSender
>> > into encoder and transport.  That's all I need.
>> >
>> Peter,
>>
>> since I'm asking this of others:
>>
>> can you please expand this into a use case - describe a function or
>> application that is possible / easy when you have that functionality
>> available, and is impossible / hard to do today?
>>
>
> I have a few:
>
> - Terminating media (audio, video, data) on a server is a pain in the neck
> with DTLS/RTP/RTCP/SCTP.   I would like that to be much easier.  Sending a
> serialized frame object (say, CBOR or protobuf) over a QUIC stream is way
> easier to terminate.
>
>
> Pushing it up one more layer: You want a use case where you send media to
> a server, and because the server architecture is such that
> DTLS/RTP/RTCP/SCTP is a pain to terminate, and you can't change the server
> architecture, you want a different wrapping of the data?
>

Yep.
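
To make the "serialized frame object over a QUIC stream" idea concrete, here is a
minimal sketch of length-prefixed framing on an ordered byte stream. Everything in
it is an assumption for illustration, not something proposed in this thread: the
Frame fields, the 6-byte prefix layout, and the use of JSON for the header (chosen
only so the example needs no third-party library; CBOR or protobuf would take its
place in practice). No real QUIC API is used; the "stream" is just bytes.

```python
import json
import struct
from dataclasses import dataclass


@dataclass
class Frame:
    kind: str          # hypothetical field: "audio", "video", or "data"
    timestamp_us: int  # hypothetical field: capture time in microseconds
    payload: bytes     # encoded media bytes


PREFIX = struct.Struct("!HI")  # 2-byte header length, 4-byte payload length


def encode_frame(frame: Frame) -> bytes:
    # JSON stands in for CBOR/protobuf here (an assumption, see above).
    header = json.dumps({"kind": frame.kind, "ts": frame.timestamp_us}).encode("utf-8")
    return PREFIX.pack(len(header), len(frame.payload)) + header + frame.payload


def decode_frames(buffer: bytes):
    """Parse as many complete frames as possible; return (frames, leftover bytes)."""
    frames = []
    offset = 0
    while len(buffer) - offset >= PREFIX.size:
        header_len, payload_len = PREFIX.unpack_from(buffer, offset)
        body_start = offset + PREFIX.size
        end = body_start + header_len + payload_len
        if end > len(buffer):
            break  # incomplete frame: wait for more bytes from the stream
        header = json.loads(buffer[body_start:body_start + header_len])
        payload = buffer[body_start + header_len:end]
        frames.append(Frame(header["kind"], header["ts"], payload))
        offset = end
    return frames, buffer[offset:]
```

The server side then reduces to reading bytes off the stream, calling
decode_frames, and keeping the leftover for the next read.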


>
>
> - Including real-time data that's synchronized with audio and video, or
> including metadata about a particular audio or video frame is much easier
> if I can just attach it to a serialized frame object as mentioned above.
>
>
> I can think of some example use cases here - subtitles are one of them. Do
> you want to expand with some more specific ones?
>

Some ideas:


- Something like "this frame is interesting" computed from the sender.
Could be a reason to do something special on a server (switch users in a
conference, start recording, ...).

- Attaching a replacement for video onto audio frames, like info necessary
to draw an avatar.

-  An SFU provides metadata about the person speaking when it switches in a
virtual stream of the current speaker.
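
A rough sketch of how the three ideas above could look if metadata simply rides
along on each frame object. The field names, metadata keys ("interesting",
"avatar", "speaker"), and action strings are all hypothetical, made up for this
illustration; nothing here is an existing WebRTC API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Frame:
    kind: str                    # hypothetical: "audio" or "video"
    timestamp_us: int            # hypothetical: capture time in microseconds
    payload: bytes               # encoded media bytes (may be empty)
    metadata: Dict[str, Any] = field(default_factory=dict)  # app-defined extras


def actions_for(frame: Frame) -> List[str]:
    """Decide what a server or receiver might do based on per-frame metadata."""
    actions = []
    if frame.metadata.get("interesting"):
        # Sender flagged the frame; e.g. start recording or switch users.
        actions.append("start_recording")
    if frame.kind == "audio" and "avatar" in frame.metadata:
        # Audio frame carries the info needed to draw an avatar instead of video.
        actions.append("render_avatar")
    if "speaker" in frame.metadata:
        # SFU annotated a switched-in virtual stream with the current speaker.
        actions.append("show_speaker:" + str(frame.metadata["speaker"]))
    return actions
```

Because the metadata travels inside the same object as the media, it stays
synchronized with the frame it describes without any extra signaling.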


>
> - Taking more direct control over stream control signaling (mute state,
> key frame requests, end-of-stream, etc.) is much easier if the control plane
> is controlled by the application and can be integrated with the flow of
> media, unlike today with RTCP (not under control of app) or data channels
> (not integrated with flow of media).
>
>
> What's the application that would need more direct control over the
> stream's state?
>

An SFU might want to say to a receiver "there is no audio here right now;
no comfort noise, no nothing; don't mix it until you hear differently from
me" instead of having a jitter buffer actively mixing silence for all
silent streams.
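
As a sketch of that behaviour, assume control messages arrive in order on the same
stream as the media. The message kinds ("audio", "silence-start", "silence-end")
and the Receiver class are hypothetical names invented for this example; they are
not RTCP message types or part of any existing API.

```python
from typing import List, Set


class Receiver:
    """Tracks which streams should be mixed, driven by in-band control messages."""

    def __init__(self) -> None:
        self.known: Set[str] = set()   # every stream seen so far
        self.silent: Set[str] = set()  # streams the SFU declared silent

    def on_message(self, stream_id: str, kind: str) -> None:
        self.known.add(stream_id)
        if kind == "silence-start":
            # "No audio here right now; don't mix it until you hear differently."
            self.silent.add(stream_id)
        elif kind in ("silence-end", "audio"):
            # Either an explicit signal or real audio counts as hearing differently.
            self.silent.discard(stream_id)

    def streams_to_mix(self) -> List[str]:
        # Silent streams are skipped entirely instead of mixing generated silence.
        return sorted(self.known - self.silent)
```

Since the control message is ordered with the media, the receiver stops mixing at
exactly the point in the flow where the SFU says the audio ends.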


> Is this a requirement that would equally well be satisfied by a direct
> API to the RTCP?
>

Yes, especially if new RTCP message types can be added.


>
>
> - e2ee without the complexities of double SRTP.
>
> All of these have been brought up as use cases already by developers in
> responses to Sergio's survey (
> https://docs.google.com/forms/d/1YVKqVU_ziCYtp8RGGnwB8WcQWDhkXe-mOmaSkFTdJm8/viewanalytics
> ).
>
>
> More generally, RTP is needlessly complex.  It's hard to add things, it's
> hard to change things, it's hard to debug things, and it's hard to
> understand things.  So in a sense, my use case is "cause me less complexity
> and pain".
>
>
> What I'm afraid of is the "those who don't understand X are doomed to
> reinvent it" syndrome - I've seen that play out before.
> That's why I'm hammering so hard on "what's the job that needs to be done".
>
> And yes, that was my trouble with interpreting the results of Sergio's
> survey too. I can't tell how to measure success, except by "number of
> complaints goes down".
>
>
>>
>> I agree with Sergio that this is a description of an implementation, not
>> a use case (and that it's useless for use cases that involve the RTP
>> ecosystem, which means that the use case you have in mind isn't one of
>> Sergio's use cases).
>>
>> --
>> Surveillance is pervasive. Go Dark.
>>
>>
>>
> --
> Surveillance is pervasive. Go Dark.
>
>
Received on Monday, 21 May 2018 21:10:50 UTC