Re: Help needed with a sync-problem

From: Chris Wilson <cwilso@google.com>
Date: Mon, 6 Aug 2012 11:48:07 -0700
Message-ID: <CAJK2wqUP33sEtQJ6S0u4TVDM+66vNv3o7uA4s+3B0TMc7U0hAg@mail.gmail.com>
To: Jussi Kalliokoski <jussi.kalliokoski@gmail.com>
Cc: Peter van der Noord <peterdunord@gmail.com>, public-audio@w3.org
On Mon, Aug 6, 2012 at 11:06 AM, Jussi Kalliokoski <
jussi.kalliokoski@gmail.com> wrote:

> On Mon, Aug 6, 2012 at 8:09 PM, Chris Wilson <cwilso@google.com> wrote:
>> I guess what I'm trying to get at is there's a huge difference between "I
>> want to create my own programmatic modules" and "I want to create my own
>> nodes."  The vocoder, for example, is essentially a bunch of programmatic
>> modules plugged together; however, it doesn't use JSNodes at all.
> Funny that you should mention these things:
>  * Ease of use of the API.
>  * Performance benefits.
>  * The possibility of creating almost any system with native nodes.
> To me, the last point counteracts both of the former ones. Essentially,
> what you're suggesting is to make software developers think of their audio
> systems in terms of electronics and that everything can be made out of
> these components. While true, this is software and the API is going to be
> used by software developers. If you make them think in terms of
> electronics rather than software, there's hardly any point to be made for
> ease of use. Not to mention performance.

Not really.  The actual electronics to implement even BiquadFilterNode
would be substantial, when you include the AudioParam inputs.  Even more so
for the RealtimeAnalyserNode, or Oscillator, or... It's not really
appropriate to think of it as "electronics," but it is probably appropriate
to think of it (in my opinion) as a set of low-level modules.  I have a
compressor, a stand-alone filter, a delay unit, etc., in my music rack at
home... and I think that's a reasonable model.

> Your vocoder is a good example, actually. Don't get me wrong, it's a
> really cool demo. But if you compare the complexity of implementing it with
> a JavaScriptNode and DSP API, the difference is astonishing.
> In mathematical terms you could define a vocoder as `output =
> IFFT(FFT(window(input)) * FFT(window(carrier)))` and an implementation
> would be a few lines of code, whereas your vocoder is a few hundred! And
> that's even before thinking about performance or accuracy.

"Accuracy" is an interesting term to choose.  I spent the bulk of my time
developing the vocoder tweaking parameters and gain levels, as it's not as
simple as that to make a vocoder that sounds good.  I actually started out
attempting to use the RealtimeAnalyser to do something somewhat like what
you're suggesting - however, it turns out vocoders really need to be more
"musical" (e.g. carefully chosen filter bands, and logarithmic frequency
band centers based on octaves, not just linear bands), and that approach
won't end up sounding very good unless you use a quite large FFT[1].
Incidentally, just over half of the nodes [literally!] could have been
replaced with a single envelope-following node per band, which is partly
why I have pressed for such a node (as well as the usual input monitoring
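
For readers unfamiliar with the term, here is a minimal sketch of what an envelope follower computes. It assumes the common textbook formulation (full-wave rectify, then smooth with a one-pole low-pass) and is not necessarily what the demo's node graph or any proposed native node does. The function name `followEnvelope` and the 20 Hz cutoff are hypothetical, chosen only for illustration.

```javascript
// Sketch of an envelope follower: rectify, then one-pole low-pass.
// Assumption: this is a generic textbook approach, not the vocoder demo's code.
function followEnvelope(samples, sampleRate, cutoffHz) {
  // Smoothing coefficient for a one-pole low-pass at cutoffHz.
  const a = Math.exp((-2 * Math.PI * cutoffHz) / sampleRate);
  const out = new Float32Array(samples.length);
  let env = 0;
  for (let i = 0; i < samples.length; i++) {
    const rectified = Math.abs(samples[i]); // full-wave rectification
    env = a * env + (1 - a) * rectified;    // smooth toward the rectified level
    out[i] = env;
  }
  return out;
}

// Usage: track the level of one second of a 440 Hz sine.
const sampleRate = 44100; // assumed sample rate
const tone = new Float32Array(sampleRate);
for (let i = 0; i < tone.length; i++) {
  tone[i] = Math.sin((2 * Math.PI * 440 * i) / sampleRate);
}
const envelope = followEnvelope(tone, sampleRate, 20);
console.log("settled envelope level:", envelope[envelope.length - 1]);
```

In a channel vocoder, one such envelope per band would drive the gain of the matching carrier band.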

> I'm pretty certain an implementation even in pure JavaScript (without the
> DSP API) would outperform the setup, and even exponentially when you
> increase the number of frequency bands used.

Code or it didn't happen.  :)  Seriously, though, I'm skeptical.  It's not
that I doubt you could define a naive[2] vocoder as you do above, or that a
simple IFFT(FFTxFFT) implementation might be faster than what I currently
have.  Honestly, it probably would be, because there are some knowingly very
poorly performing approaches in my band construction.  (Changing each band's
envelope follower and per-band RTA for the graph display to
DynamicsCompressorNodes, just to see if that would work, is high on my list
if I ever get a few spare moments.)

However, I would be INCREDIBLY surprised if you could use this approach to
build a highly musical vocoder that was more performant than the
native-node approach once optimized, particularly if we add an envelope
follower (really, just a reduction audio output on dynamicsCompressor would
probably work for me).


[1] The hard-coded defaults for my vocoder demo (you can change them easily
in the setup code, but I didn't offer UI for it) are 28 bands, from
55Hz to 7040Hz (A1-A8), assigned as 4 bands per octave.  The bottom band is
about 6Hz wide, if memory serves; and I'm using 4th order filters (two
bandpass filters stacked).  In my initial attempt, I used a 2048 fftSize
for a vocoder with only 10 bands.  For the upper vocoder band, I had to sum
a LOT of FFT bands.
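
The arithmetic behind this footnote can be sketched as follows. The sketch treats the 28 bands as adjacent quarter-octave intervals between 55 Hz and 7040 Hz, which is a simplification and does not reproduce the demo's actual filter bandwidths. The 44100 Hz sample rate and the helper names `bandEdges` and `binsInBand` are assumptions made for illustration. It counts how many linearly spaced FFT bins land in each logarithmically spaced band.

```javascript
// Edges of adjacent bands spaced geometrically (bandsPerOctave per octave).
// Assumption: bands are modelled as simple intervals, not as real filters.
function bandEdges(startHz, bandsPerOctave, bandCount) {
  const edges = [];
  for (let i = 0; i <= bandCount; i++) {
    edges.push(startHz * Math.pow(2, i / bandsPerOctave));
  }
  return edges;
}

// Number of FFT bins whose frequency k * sampleRate / fftSize lies in [loHz, hiHz).
function binsInBand(loHz, hiHz, sampleRate, fftSize) {
  const binWidth = sampleRate / fftSize;
  const first = Math.ceil(loHz / binWidth);
  const last = Math.ceil(hiHz / binWidth) - 1;
  return Math.max(0, last - first + 1);
}

const assumedSampleRate = 44100; // assumption: not stated in the message
const fftSize = 2048;            // from the footnote
const edges = bandEdges(55, 4, 28);

console.log("FFT bin width (Hz):", assumedSampleRate / fftSize);
for (let i = 0; i < edges.length - 1; i++) {
  const count = binsInBand(edges[i], edges[i + 1], assumedSampleRate, fftSize);
  console.log(
    "band " + (i + 1) + ": " + edges[i].toFixed(1) + " to " +
    edges[i + 1].toFixed(1) + " Hz, " + count + " bin(s)"
  );
}
```

Under these assumptions the lowest bands are narrower than a single FFT bin while the highest band spans dozens of bins, which is the mismatch between linear FFT bins and octave-based bands described above.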

[2] Note the description of my Github repo is "Naive Web Audio Vocoder" -
I'm not trying to be rude.  I was naive when I started the project; I'm
somewhat less naive about vocoders now.  If I spent a couple more years
tweaking, I'd probably be a pro.  :)
Received on Monday, 6 August 2012 18:48:37 GMT