RE: active speaker information in mixed streams from Bernard Aboba on 2014-02-12 (public-webrtc@w3.org from February 2014)

From: Bernard Aboba <Bernard.Aboba@microsoft.com>
Date: Wed, 12 Feb 2014 17:24:13 +0000
To: Tim Panton new <thp@westhawk.co.uk>, Harald Alvestrand <hta@google.com>
CC: "public-webrtc@w3.org" <public-webrtc@w3.org>
Message-ID: <553e091b516d45c984a055e4a822e80c@SN2PR03MB031.namprd03.prod.outlook.com>

Tim said: 

"Isn't this the sort of thing we should be delegating to the Web Audio API?
It is fully capable of doing this."

[BA] That is my take, at least for "dominant speaker" identification. To my mind, CSRCs and averaged levels are only useful for indicating which sources are providing sound (or noise, as the case may be).

If the goal is to enable switching video to the dominant speaker, then you actually need to figure out who is speaking (as opposed to typing on their keyboard, having their dog bark, etc.). The Web Audio API is much better suited for that, since the application gets access to the audio itself rather than only a level.
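
To make the distinction concrete, here is a rough sketch of what an application could do once it has per-participant sample blocks, which I am assuming it pulls from something like a Web Audio AnalyserNode via getFloatTimeDomainData(). The names (rms, DominantSpeakerTracker) and the parameters (alpha, floor, margin) are made up for illustration. Note that this is still only a smoothed energy heuristic: it rides out short bursts such as a key click or a single bark, but it is not real voice activity detection, which would need to look at the signal itself (spectrum, pitch, and so on).

```typescript
// Root-mean-square level of one block of time-domain samples (range -1..1).
function rms(samples: Float32Array): number {
  if (samples.length === 0) return 0;
  let sum = 0;
  for (let i = 0; i < samples.length; i++) {
    sum += samples[i] * samples[i];
  }
  return Math.sqrt(sum / samples.length);
}

// Hypothetical helper: tracks a smoothed level per participant and only
// switches the "dominant" one when a challenger is clearly louder.
class DominantSpeakerTracker {
  private smoothed: Record<string, number> = {};
  private current: string | null = null;
  private alpha: number;  // smoothing factor, 0..1 (assumed value)
  private floor: number;  // below this smoothed level, treat as silence
  private margin: number; // challenger must exceed current by this factor

  constructor(alpha = 0.2, floor = 0.02, margin = 1.5) {
    this.alpha = alpha;
    this.floor = floor;
    this.margin = margin;
  }

  // blocks maps a participant id to that participant's latest sample block.
  update(blocks: Record<string, Float32Array>): string | null {
    let bestId: string | null = null;
    let bestLevel = 0;
    for (const id of Object.keys(blocks)) {
      const prev = this.smoothed[id] ?? 0;
      // Exponential moving average damps short transients.
      const level = prev + this.alpha * (rms(blocks[id]) - prev);
      this.smoothed[id] = level;
      if (level > bestLevel) {
        bestLevel = level;
        bestId = id;
      }
    }
    // Everyone is quiet: keep whoever was dominant before.
    if (bestId === null || bestLevel < this.floor) return this.current;
    const currentLevel =
      this.current !== null ? (this.smoothed[this.current] ?? 0) : 0;
    if (this.current === null || bestLevel > currentLevel * this.margin) {
      this.current = bestId;
    }
    return this.current;
  }
}
```

An application would call update() once per analysis interval and switch the main video only when the returned id changes. The margin is there so that two people at similar levels do not cause the video to flip back and forth.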

Received on Wednesday, 12 February 2014 17:24:59 UTC