Re: Suggestion for minimizing audio glitches

On 4/16/2012 4:10 PM, Chris Rogers wrote:
>
>
> On Sun, Apr 15, 2012 at 12:22 PM, <lemeslep@free.fr 
> <mailto:lemeslep@free.fr>> wrote:
>
>     On the current Web Audio draft, it is mentioned in §15.2 that
>     "Audio glitches are caused by an interruption of the normal
>     continuous audio stream, resulting in loud clicks and pops. It is
>     considered to be a catastrophic failure of a multi-media system
>     and must be avoided."
>     And I can't agree more with this!
>     I'm currently facing those ugly audio glitches in my project. I'm
>     using Mozilla's Audio Data API at the moment, and I think I know
>     how browsers could help me to mitigate this problem.
>
>     The clicks and pops happen because, when the JavaScript app
>     underruns the audio buffer, the audio card is no longer fed,
>     and so the card output jumps straight from the value of the
>     last sample played to 0.
>     What would be needed is, perhaps as an option in the JavaScript
>     audio node (?), to have the browser automatically feed the audio
>     card by sustaining the last sample the JavaScript application
>     sent, when the audio buffer is underrun.
>
>     That would really go a long way towards minimizing this critical
>     issue.
>
>
> Hi Philippe, I don't think this will help with the glitches.  Using 
> this approach, an under-run will still be quite audible.  And it's not 
> a good idea to send a constant (non-zero) value out to the audio 
> hardware since this represents a "DC offset" and can cause even worse 
> problems.

Since underruns may happen no matter what you do (especially if 
main-thread JS is involved), it's best to minimize their impact.  
On an underrun, the primary options are:

1) send 0's (generally the audio device's default if you don't feed 
it) - causes clicks/pops
2) repeat last sample - the classic basic VoIP lost-packet technique; 
works OK in most cases; requires blending at start/end to avoid 
click/pop.  Often done at a reduced volume, which makes it less 
noticeable.
3) decay - take the last sample and decay it to silence to avoid 
click/pop - more useful if you expect a continued lack of source.  Can 
be a variant of #2 where you progressively decay each missing frame.
4) fancier VoIP-style packet loss concealment - better than #2; may tend 
to be voice-centric
5) fancier loss concealment using non-voice-centric prediction (waving 
hands here; I'm sure such things exist for good CD/DVD/etc. players).
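
To make options 1-3 concrete, here is a rough sketch in Python (chosen 
for brevity, not because any browser does it this way).  It reads "last 
sample" as "last good block of samples", as in VoIP packet repetition; 
that reading, the function names, and the gain/decay values are all 
assumptions for illustration, not part of any API.

```python
def conceal_underrun(last_block, n_missing, mode="decay", gain=0.5, decay=0.7):
    """Return n_missing replacement blocks to play during an underrun.

    last_block: list of float samples from the last block that arrived.
    mode: "zeros"  -> option 1, output silence
          "repeat" -> option 2, repeat the last block at reduced volume
          "decay"  -> option 3, repeat it but fade each missing block further
    gain and decay are illustrative values, not recommendations.
    """
    if mode not in ("zeros", "repeat", "decay"):
        raise ValueError("unknown mode: " + mode)
    if mode == "zeros":
        return [[0.0] * len(last_block) for _ in range(n_missing)]
    out = []
    level = gain
    for _ in range(n_missing):
        out.append([s * level for s in last_block])
        if mode == "decay":
            level *= decay  # each further missing block is quieter
    return out


def crossfade(a, b):
    """Linearly blend from block a into block b over their common length.

    This is the "blending at start/end" step: the weight on b rises
    from 1/n to 1, so the output ends exactly on b's last sample.
    """
    n = min(len(a), len(b))
    return [a[i] * (1 - (i + 1) / n) + b[i] * ((i + 1) / n) for i in range(n)]
```

For example, conceal_underrun(block, 3, mode="decay") gives three 
progressively quieter copies of block, and crossfade() can be applied 
between the last real block and the first concealed one, and again when 
real audio resumes.  Options 4 and 5 need an actual signal model and are 
not attempted here.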

-- 
Randell Jesup
randell-ietf@jesup.org

Received on Monday, 16 April 2012 21:35:39 UTC