- From: Chris Wilson <cwilso@google.com>
- Date: Mon, 6 Aug 2012 11:48:07 -0700
- To: Jussi Kalliokoski <jussi.kalliokoski@gmail.com>
- Cc: Peter van der Noord <peterdunord@gmail.com>, public-audio@w3.org
- Message-ID: <CAJK2wqUP33sEtQJ6S0u4TVDM+66vNv3o7uA4s+3B0TMc7U0hAg@mail.gmail.com>
On Mon, Aug 6, 2012 at 11:06 AM, Jussi Kalliokoski <jussi.kalliokoski@gmail.com> wrote:

> On Mon, Aug 6, 2012 at 8:09 PM, Chris Wilson <cwilso@google.com> wrote:
>
>> I guess what I'm trying to get at is that there's a huge difference
>> between "I want to create my own programmatic modules" and "I want to
>> create my own nodes." The vocoder, for example, is essentially a bunch
>> of programmatic modules plugged together; however, it doesn't use
>> JSNodes at all.
>
> Funny that you should mention these things:
>
> * Ease of use of the API.
> * Performance benefits.
> * The possibility of creating almost any system with native nodes.
>
> To me, the last point counteracts both of the former ones. Essentially,
> what you're suggesting is to make software developers think of their
> audio systems in terms of electronics, and that everything can be made
> out of these components. While true, this is software, and the API is
> going to be used by software developers. If you make them think in terms
> of electronics rather than software, there's hardly any point to be made
> for ease of use. Not to mention performance.

Not really. The actual electronics to implement even a BiquadFilterNode
would be substantial once you include the AudioParam inputs - even more so
for the RealtimeAnalyserNode, or Oscillator, or... It's not really
appropriate to think of it as "electronics," but it is probably
appropriate to think of it (in my opinion) as a set of low-level modules.
I have a compressor, a stand-alone filter, a delay unit, etc., in my music
rack at home, and I think that's a reasonable model.

> Your vocoder is a good example, actually. Don't get me wrong, it's a
> really cool demo. But if you compare the complexity of implementing it
> with a JavaScriptNode and the DSP API, the difference is astonishing.
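[Editor's note: to make the "low-level modules" point concrete, here is a hedged plain-JS sketch of roughly what one BiquadFilterNode encapsulates - a constant-0dB-peak bandpass biquad using the well-known RBJ Audio EQ Cookbook coefficients - plus the two-stacked-sections 4th-order band Chris's demo uses. Function names are illustrative, not from the demo, and the real node additionally handles per-sample AudioParam automation, which this ignores.]

```javascript
// Sketch: the DSP inside a single bandpass BiquadFilterNode.
// Coefficients follow the RBJ Audio EQ Cookbook (bandpass, 0 dB peak gain).
function makeBandpass(sampleRate, centerHz, Q) {
  const w0 = (2 * Math.PI * centerHz) / sampleRate;
  const alpha = Math.sin(w0) / (2 * Q);
  const a0 = 1 + alpha;
  // Normalized coefficients (b1 is 0 for this bandpass form).
  const b0 = alpha / a0, b2 = -alpha / a0;
  const a1 = (-2 * Math.cos(w0)) / a0, a2 = (1 - alpha) / a0;
  let x1 = 0, x2 = 0, y1 = 0, y2 = 0; // direct-form-I filter state
  return function process(x) {
    const y = b0 * x + b2 * x2 - a1 * y1 - a2 * y2;
    x2 = x1; x1 = x;
    y2 = y1; y1 = y;
    return y;
  };
}

// Two stacked sections give a 4th-order band, as in the vocoder demo's
// "two bandpass filters stacked" per band.
function makeFourthOrderBand(sampleRate, centerHz, Q) {
  const s1 = makeBandpass(sampleRate, centerHz, Q);
  const s2 = makeBandpass(sampleRate, centerHz, Q);
  return (x) => s2(s1(x));
}
```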
> In mathematical terms you could define a vocoder as
> `output = IFFT(FFT(window(input)) * FFT(window(carrier)))`, and an
> implementation would be a few lines of code, whereas your vocoder is a
> few hundred! And that's even before thinking about performance or
> accuracy.

"Accuracy" is an interesting term to choose. I spent the bulk of my time
developing the vocoder tweaking parameters and gain levels, as it's not as
simple as that to make a vocoder that sounds good. I actually started out
attempting to use the RealtimeAnalyser to do something somewhat like what
you're suggesting; however, it turns out vocoders really need to be more
"musical" (e.g. carefully chosen filter bands, with logarithmic frequency
band centers based on octaves, not just linear bands), and that approach
won't end up sounding very good unless you use a quite large FFT[1].
Incidentally, just over half of the nodes [literally!] could have been
replaced with a single envelope-following node per band, which is partly
why I have pressed for such a node (as well as for the usual
input-metering scenarios).

> I'm pretty certain an implementation even in pure JavaScript (without
> the DSP API) would outperform that setup - and even exponentially so as
> you increase the number of frequency bands used.

Code or it didn't happen. :)

Seriously, though, I'm skeptical - not because I don't think you could
define a naive[2] vocoder as you do above, nor that a simple
IFFT(FFT x FFT) implementation might not be faster than what I currently
have. Honestly, it probably would be, since there are some knowingly
poor-performing choices in my band construction. (Fixing them if I ever
get a few spare moments - e.g. swapping each band's envelope follower, and
the per-band RealtimeAnalyser used for the graph display, for
DynamicsCompressorNodes just to see if that would work - is high on my
list.)
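[Editor's note: Jussi's one-line formula can indeed be sketched in a few dozen lines. The following is an illustrative, unoptimized implementation of exactly `IFFT(FFT(window(input)) * FFT(window(carrier)))` for one frame, using a naive O(n^2) DFT for clarity; a real version would use a radix-2 FFT and overlap-add across frames, and - as Chris argues - none of this addresses making it sound musical. All function names here are hypothetical.]

```javascript
// Naive DFT (forward or inverse) on parallel real/imaginary arrays.
function dft(re, im, inverse) {
  const n = re.length;
  const outRe = new Float64Array(n);
  const outIm = new Float64Array(n);
  const sign = inverse ? 1 : -1;
  for (let k = 0; k < n; k++) {
    for (let t = 0; t < n; t++) {
      const angle = (sign * 2 * Math.PI * k * t) / n;
      const c = Math.cos(angle), s = Math.sin(angle);
      outRe[k] += re[t] * c - im[t] * s;
      outIm[k] += re[t] * s + im[t] * c;
    }
    if (inverse) { outRe[k] /= n; outIm[k] /= n; }
  }
  return [outRe, outIm];
}

// Hann window, applied sample-by-sample to one frame.
function hann(frame) {
  const n = frame.length;
  return frame.map((x, i) => x * 0.5 * (1 - Math.cos((2 * Math.PI * i) / n)));
}

// One frame of Jussi's formula:
//   output = IFFT(FFT(window(input)) * FFT(window(carrier)))
function vocodeFrame(input, carrier) {
  const zeros = new Float64Array(input.length);
  const [aRe, aIm] = dft(hann(input), zeros, false);
  const [bRe, bIm] = dft(hann(carrier), zeros, false);
  // Bin-by-bin complex multiplication of the two spectra.
  const pRe = aRe.map((x, k) => x * bRe[k] - aIm[k] * bIm[k]);
  const pIm = aRe.map((x, k) => x * bIm[k] + aIm[k] * bRe[k]);
  const [outRe] = dft(pRe, pIm, true);
  return outRe; // imaginary part is ~0 for real inputs
}
```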
However, I would be INCREDIBLY surprised if you could use this approach to
build a highly musical vocoder that was more performant than the
native-node approach once optimized, particularly if we add an envelope
follower (really, just a reduction audio output on DynamicsCompressorNode
would probably work for me).

-Chris

[1] The hard-coded defaults for my vocoder demo (you can change them
easily in the setup code, but I didn't offer UI for it) are 28 bands, from
55Hz to 7040Hz (A1-A8), assigned as 4 bands per octave. The bottom band is
about 6Hz wide, if memory serves, and I'm using 4th-order filters (two
bandpass filters stacked). In my initial implementation
<https://github.com/cwilso/Vocoder/blob/b00dbd08a6a9744923efab93d1cf3a1dc6aeaa38/analyser.js>,
I used a 2048 fftSize for an only 10-band vocoder; for the upper vocoder
bands, I had to sum a LOT of FFT bins.

[2] Note the description of my GitHub repo is "Naive Web Audio Vocoder" -
I'm not trying to be rude. I was naive when I started the project; I'm
somewhat less naive about vocoders now. If I spent a couple more years
tweaking, I'd probably be a pro. :)
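[Editor's note: the octave-based band layout described in footnote [1] - 28 bands, 4 per octave, starting at A1 (55Hz) - can be computed as below. This is an illustration of the geometric spacing, not code from the demo; the demo's exact center and bandwidth conventions may differ.]

```javascript
// Geometric (logarithmic) band centers: each successive band is a quarter
// octave up, so 28 bands span the 7 octaves from A1 (55Hz) toward A8 (7040Hz).
function bandCenters(baseHz, bandsPerOctave, numBands) {
  const centers = [];
  for (let i = 0; i < numBands; i++) {
    centers.push(baseHz * Math.pow(2, i / bandsPerOctave));
  }
  return centers;
}

const centers = bandCenters(55, 4, 28);

// Constant-Q layout: a band a quarter octave wide has bandwidth proportional
// to its center frequency, which is why the bottom band is only a few Hz wide.
const quarterOctaveWidthHz = (f) => f * (Math.pow(2, 1 / 8) - Math.pow(2, -1 / 8));
```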
Received on Monday, 6 August 2012 18:48:37 UTC