
Re: use cases around CSRC, client to mixer, mixer to client

From: Roman Shpount <roman@telurix.com>
Date: Fri, 4 Sep 2015 13:35:59 -0400
Message-ID: <CAD5OKxt7VWyrzrEzaMCuxxZ1bOZpW0vOersnEeQMhJyuLsmpYw@mail.gmail.com>
To: "Cullen Jennings (fluffy)" <fluffy@cisco.com>
Cc: public-webrtc <public-webrtc@w3.org>, Bernard Aboba <Bernard.Aboba@microsoft.com>

On Fri, Sep 4, 2015 at 1:09 PM, Cullen Jennings (fluffy) <fluffy@cisco.com> wrote:

> Use Case for JS to Read CSRC:
> So consider the case of an audio conference where the audio bridge or MCU
> receives the audio from all participants but then selects some subset of
> the active speakers and mixes them into a single audio stream that is sent
> out to the non-active speakers. This is the most common form of
> conferencing today and reduces bandwidth over solutions that send the
> unmixed audio for each active speaker. Say Alice and Bob are the active
> speakers: the conference bridge takes their audio, mixes it, and sends it
> out, but it indicates in the sent RTP packets the SSRCs of Alice and Bob by
> putting those two SSRCs into the CSRC list of the outbound RTP packet.
> The JS app out of band gets the SSRC and name of each user as they join.
> When it receives this RTP packet, it can look at the CSRC (if we have an
> API for that) and visually show in the roster list for the app that Alice
> and Bob are both currently speaking.
> The changes in the roster list need to be synchronized with the audio. So
> if three people say in sequence Yes, No, Yes, the roster should be
> displaying the name of the correct person as each person speaks. This
> allows people that don't recognize the voices to see who said yes and who
> said no. That implies UI and audio synchronization timing requirements on
> the order of 100 ms. Solutions that work by having the MCU tell the web
> server who the active speaker is, then the web server tells the GUI over
> websockets or something have not been able to reliably achieve a good user
> experience on this.  Solutions that look at the CSRC lists of the RTP being
> received easily meet that type of timing requirement.
> I think this use case is implementable with the GUI simply polling the
> current list of CSRC periodically (say every 50 ms) and updating the GUI if
> things have changed.
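
The polling approach described above can be sketched roughly as follows. This is only an illustration: the roster map, the function names, and the idea that the CSRC list arrives as an array of numbers are all assumptions, since the quoted text does not define an API for reading CSRCs.

```typescript
// Hypothetical roster: SSRC -> display name, learned out of band at join time.
type Roster = Map<number, string>;

// Names of the participants whose SSRC appears in the current CSRC list.
// Unknown SSRCs are skipped; the result is sorted so two polls compare easily.
function activeSpeakers(csrcs: number[], roster: Roster): string[] {
  return csrcs
    .map((ssrc) => roster.get(ssrc))
    .filter((name): name is string => name !== undefined)
    .sort();
}

// True when two polls produced different speaker sets, so the GUI must redraw.
function speakersChanged(prev: string[], next: string[]): boolean {
  return prev.length !== next.length || prev.some((n, i) => n !== next[i]);
}
```

A poll loop would call activeSpeakers() on a timer (say every 50 ms, as suggested above) with whatever the eventual API returns, and touch the roster display only when speakersChanged() reports a difference.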

This can also be achieved using data channels, with the mixer sending the
speaker change notifications over a data channel connection. In my
experience this works quite well and satisfies the timing requirements
for this use case.
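
As a rough sketch of the data channel approach, the client side could parse each notification like this. The JSON message format and the names below are assumptions made up for illustration; the mixer can use any encoding it likes.

```typescript
// Assumed wire format (not from any spec): {"speakers": [<ssrc>, ...]}
interface SpeakerChange {
  speakers: number[];
}

// Parse one notification; returns null for anything malformed so that a bad
// message never disturbs the roster display.
function parseSpeakerChange(data: string): SpeakerChange | null {
  try {
    const msg: unknown = JSON.parse(data);
    if (typeof msg !== "object" || msg === null) return null;
    const speakers = (msg as { speakers?: unknown }).speakers;
    if (!Array.isArray(speakers)) return null;
    if (!speakers.every((s) => Number.isInteger(s))) return null;
    return { speakers: speakers as number[] };
  } catch {
    return null; // not valid JSON
  }
}
```

The application would call parseSpeakerChange() from the data channel's onmessage handler and update the roster whenever it returns a non-null result.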

> Use Case for receiving Mixer to Client audio levels:
> Here is one related use case - you have a multi-user voice/video chat
> app for, say, 3 to 7 people in the same conference. It uses isolated media
> for privacy reasons and, also for privacy reasons, does not have a central
> mixer but instead creates a full mesh of connections so each participant
> sends media to all other participants. Each participant plays the audio
> of all participants but only displays the video of the most recent person
> that started talking. The JS could look at the received client-to-mixer
> level of the audio to decide what video to show. However, this is looking
> at the client-to-mixer value, not the mixer-to-client value.

This can also be implemented in the client JS application by analyzing the
audio that is being sent for playback. I do not think there are a lot of
benefits in looking at the client-to-mixer audio levels vs. the audio itself.
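
A minimal sketch of such client-side analysis, assuming the application can obtain blocks of PCM samples per participant (for example from a Web Audio AnalyserNode via getFloatTimeDomainData). The class name, the participant ids, and the 0.05 threshold are illustrative assumptions, not tuned values.

```typescript
// Root-mean-square level of one block of PCM samples in the range [-1, 1].
function rms(samples: Float32Array): number {
  if (samples.length === 0) return 0;
  const sumSquares = samples.reduce((acc, s) => acc + s * s, 0);
  return Math.sqrt(sumSquares / samples.length);
}

// Tracks which participant most recently went from silent to talking.
class SpeakerTracker {
  private talking = new Set<string>();
  private latest: string | null = null;
  private threshold: number;

  constructor(threshold: number = 0.05) {
    this.threshold = threshold; // assumed level separating speech from silence
  }

  // Feed one block of audio for a participant; returns the id of the
  // participant whose video should currently be shown (or null if none yet).
  update(id: string, samples: Float32Array): string | null {
    const loud = rms(samples) >= this.threshold;
    if (loud && !this.talking.has(id)) {
      this.talking.add(id); // silent -> talking transition
      this.latest = id;
    } else if (!loud) {
      this.talking.delete(id);
    }
    return this.latest;
  }
}
```

A real application would likely add some smoothing or hangover time so that short pauses between words do not count as a new start of speech.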

Roman Shpount
Received on Friday, 4 September 2015 17:36:29 UTC
