- From: Roman Shpount <roman@telurix.com>
- Date: Fri, 4 Sep 2015 13:35:59 -0400
- To: "Cullen Jennings (fluffy)" <fluffy@cisco.com>
- Cc: public-webrtc <public-webrtc@w3.org>, Bernard Aboba <Bernard.Aboba@microsoft.com>
- Message-ID: <CAD5OKxt7VWyrzrEzaMCuxxZ1bOZpW0vOersnEeQMhJyuLsmpYw@mail.gmail.com>
On Fri, Sep 4, 2015 at 1:09 PM, Cullen Jennings (fluffy) <fluffy@cisco.com> wrote:

> Use Case for JS to Read CSRC:
>
> So consider the case of an audio conference where the audio bridge or MCU receives the audio from all participants but then selects some subset of the active speakers and mixes them into a single audio stream that is sent out to the non-active speakers. This is the most common form of conferencing today and reduces bandwidth compared to solutions that send the unmixed audio for each active speaker. Say Alice and Bob are the active speakers; the conference bridge takes their audio, mixes it, and sends it out, but it indicates the SSRCs of Alice and Bob in the sent RTP packets by putting those two SSRCs into the CSRC list of each outbound RTP packet.
>
> The JS app gets the SSRC and name of each user out of band as they join. When it receives this RTP packet, it can look at the CSRC list (if we have an API for that) and visually show in the roster list for the app that Alice and Bob are both currently speaking.
>
> The changes in the roster list need to be synchronized with the audio. So if three people say in sequence Yes, No, Yes, the roster should be displaying the name of the correct person as each person speaks. This allows people who don't recognize the voices to see who said yes and who said no. That implies UI and audio synchronization timing requirements on the order of 100 ms. Solutions that work by having the MCU tell the web server who the active speaker is, with the web server then telling the GUI over websockets or something similar, have not been able to reliably achieve a good user experience here. Solutions that look at the CSRC lists of the RTP being received easily meet that type of timing requirement.
>
> I think this use case is implementable with the GUI simply polling the current list of CSRCs periodically (say every 50 ms) and updating the GUI if things have changed.

This can also be achieved using data channels, with the mixer sending speaker change notifications over a data channel connection. Based on my experience this works quite well and satisfies the timing requirements for this use case.

Use Case for receiving Mixer-to-Client audio levels:

> Here is one related use case - you have a multi-user voice/video chat app for, say, 3 to 7 people in the same conference. It uses isolated media for privacy reasons and, also for privacy reasons, does not have a central mixer but instead creates a full mesh of connections so each participant sends media to all other participants. Each participant plays the audio of all participants but only displays the video of the most recent person that started talking. The JS could look at the received client-to-mixer level of the audio to decide which video to show. However, this is looking at the client-to-mixer value, not the mixer-to-client value.

This can also be implemented in the client JS application by analyzing the audio that is being sent for playback. I do not think there are a lot of benefits in looking at the client-to-mixer audio levels vs the audio itself.

_____________
Roman Shpount
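
[Editor's sketch for the CSRC-polling approach Cullen describes above. No CSRC-reading API existed at the time of this thread; this assumes something along the lines of what later became RTCRtpReceiver.getContributingSources(). The CSRC values, rosterNames map, and updateRosterUi function are illustrative, not part of the original message.]

```typescript
// Hypothetical out-of-band roster: CSRC -> display name, learned as users join.
const rosterNames = new Map<number, string>([
  [0x1111, "Alice"],
  [0x2222, "Bob"],
]);

// Poll the receiver's contributing sources every 50 ms and report when the
// set of active speakers changes, so the roster UI can be updated.
function watchActiveSpeakers(
  receiver: RTCRtpReceiver,
  onChange: (names: string[]) => void
): number {
  let lastKey = "";
  return window.setInterval(() => {
    const csrcs = receiver
      .getContributingSources()
      .map((s) => s.source)
      .sort((a, b) => a - b);
    const key = csrcs.join(",");
    if (key !== lastKey) {
      lastKey = key;
      onChange(csrcs.map((c) => rosterNames.get(c) ?? `CSRC ${c}`));
    }
  }, 50);
}

// Usage: pc is an RTCPeerConnection receiving the mixed audio from the MCU.
// const audioReceiver = pc.getReceivers().find((r) => r.track.kind === "audio")!;
// watchActiveSpeakers(audioReceiver, (names) => updateRosterUi(names));
```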
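[Editor's sketch of the data-channel alternative Roman mentions: the mixer pushes speaker-change notifications to each client over a data channel. The channel label "speakers" and the JSON message shape are assumptions for illustration.]

```typescript
// Hypothetical message sent by the mixer: { "speakers": ["alice", "bob"] }
interface SpeakerUpdate {
  speakers: string[];
}

function listenForSpeakerUpdates(
  pc: RTCPeerConnection,
  onSpeakers: (speakers: string[]) => void
): void {
  // The mixer is assumed to open a data channel labelled "speakers".
  pc.ondatachannel = (ev: RTCDataChannelEvent) => {
    if (ev.channel.label !== "speakers") return;
    ev.channel.onmessage = (msg: MessageEvent) => {
      const update = JSON.parse(msg.data) as SpeakerUpdate;
      onSpeakers(update.speakers);
    };
  };
}
```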
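[Editor's sketch of analysing the received audio directly, as Roman suggests for the full-mesh case: tap each remote stream with the Web Audio API and show the video of whoever is currently loudest. The participant bookkeeping and showVideoFor function are assumptions for illustration.]

```typescript
// Tap the audio being played back for one participant and return a function
// that reports a rough 0..1 level when called.
const audioCtx = new AudioContext();

function trackLevel(stream: MediaStream): () => number {
  const source = audioCtx.createMediaStreamSource(stream);
  const analyser = audioCtx.createAnalyser();
  analyser.fftSize = 256;
  source.connect(analyser);
  const buf = new Uint8Array(analyser.fftSize);
  return () => {
    analyser.getByteTimeDomainData(buf);
    // RMS deviation from the 128 midpoint, normalised to 0..1.
    let sum = 0;
    for (const v of buf) sum += (v - 128) * (v - 128);
    return Math.sqrt(sum / buf.length) / 128;
  };
}

// Usage sketch: pick the loudest participant every 100 ms.
// const levels = new Map<string, () => number>(); // participantId -> level fn
// setInterval(() => {
//   let loudest: string | null = null;
//   let loudestLevel = 0;
//   for (const [id, level] of levels) {
//     const l = level();
//     if (l > loudestLevel) { loudestLevel = l; loudest = id; }
//   }
//   if (loudest) showVideoFor(loudest);
// }, 100);
```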
Received on Friday, 4 September 2015 17:36:29 UTC