RE: Mozilla/Cisco API Proposal from Koen Vos on 2011-07-13 (public-webrtc@w3.org from July 2011)

From: Koen Vos <koen.vos@skype.net>
Date: Thu, 14 Jul 2011 01:54:04 +0200 (CEST)
To: public-webrtc@w3.org
Cc: Stefan Håkansson LK <stefan.lk.hakansson@ericsson.com>
Message-ID: <1136485245.3263772.1310601244922.JavaMail.root@lu2-zimbra>

Stefan Håkansson wrote:
> More and more can be done by analysing the input signal (e.g. 
> determining if it is speech or music), so perhaps there will be no 
> need for API support.

That may work in the long term. But Opus currently has no speech/music detector, and I think it will take a while to build one that is good enough for most use cases. So for now, the API seems to be the only way we can set the Opus mode.

What are you actually proposing: to hard-code the Opus mode, or to quickly invent a reliable speech/music detector?

best,
koen.


Stefan Håkansson wrote:

>>> could help the codec perform at its optimum). And this set could be
>>> irrelevant for a new generation of codecs. "audio" vs "voip" is just one
>>> example, and it is specific for one codec. I think the general trend also is
>>
>>On the contrary, things like AGC and noise suppression are independent
>>of _any_ codec (at least they are in the WebRTC stack Google
>>open-sourced). Opus implements a few more things internally, but there's
>>no reason in principle why those things couldn't be done outside the
>>codec as well. The point is that this switch is the difference between,
>>"Please actively distort the input I give you to make it sound
>>'better'," vs. "Please preserve the original input as closely as
>>possible," and that semantic has little to do with the actual codec.
>
>I still think we should not go in this direction - at least not initially. Let's add it later if there is a clear need. More and more can be done by analysing the input signal (e.g. determining if it is speech or music), so perhaps there will be no need for API support.
>
>Stefan

Received on Saturday, 16 July 2011 20:15:28 UTC