- From: Chris Rogers <crogers@google.com>
- Date: Sat, 7 Aug 2010 14:10:01 -0700
- To: moallem@mit.edu
- Cc: Chris Marrin <cmarrin@apple.com>, public-xg-audio@w3.org, Corban Brook <corbanbrook@gmail.com>
- Message-ID: <AANLkTime8cz6i_oLi0+qccqnJb+dENXrXe4pb=bsLVt4@mail.gmail.com>
When discussing native versus JS performance, FFTs are just one convenient place to make comparisons. But, of course, there are many other types of audio processing algorithms not involving FFTs or spectral analysis at all, each with different performance characteristics. You gave an example of using low-order filters instead of HRTF convolution for spatialization. This is a real-world trade-off between rendering quality and speed, and one which is offered, for example, in Apple's OpenAL implementation. There will certainly be many applications where pure JS processing will be acceptable. Several factors are important to consider when comparing with native processing:

*Hardware Scalability*

As far as I know, we haven't measured CPU performance with pure JS processing on phone and tablet devices yet, but we should get some numbers here. Roughly speaking, though, we should not expect these devices to have as much headroom as desktop-class machines.

*Graphics and other Game Logic (Physics)*

Games which do a significant amount of canvas or WebGL drawing, and which may be doing other non-trivial work in JS such as running physics engines, will have significantly fewer JS resources left for audio processing. Conversely, doing any significant audio processing in JS can cause dropped frames in the graphics rendering, producing less smooth animation on the page than with native audio processing.

*Latency*

Does the application have low-latency requirements? For some applications, the time delay between when a key is pressed or a mouse event is processed and when a sound is heard is important. For audio-intensive applications, it will likely be necessary to increase buffering to avoid audible glitches, and this increases latency: for example, 1024 frames of additional buffering at 44.1KHz adds roughly 23ms. Here's an interesting link talking about Flash 10 audio latency:

http://joeberkovitz.com/blog/2008/10/15/controlling-audio-latency-in-flash-10/

This is interesting because Flash uses ActionScript, which is very similar to JavaScript. Native audio processing can achieve the best possible latency and will not suffer from these problems.

*High-level API*

The canvas drawing API has high-level functions for drawing things such as circles, lines, and gradients. Although it's possible to use an ImageData and poke pixels directly into the bitmap to achieve the same effect, the higher-level drawing APIs are clearly very useful to have. Similarly, in audio there are fundamental operations which are very common (mixing, filtering, panning, delay and other linear effects, etc.), and providing direct and simple APIs for these common operations also seems very useful to me.

==============================================================================

*Some Use Cases:*

Here are two examples where I think native processing would be attractive:

*3D Games*

For games, implementing a room effect (algorithmic reverberation or convolution) and spatializing multiple moving sources (whether with convolution or other techniques), with sound cones, distance effects, and per-source occlusion and obstruction effects, is costly in terms of CPU usage. Even the most basic algorithmic reverberation effects in pure JS are very expensive, and they are not of the same quality or versatility as native convolution effects. Also, latency is often important in these types of applications. A rough sketch of what such a graph might look like follows below.
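To make the high-level API point concrete for the 3D games case, here is a minimal sketch of how such a graph might be expressed. It assumes the AudioContext constructor and the createBufferSource(), createPanner(), and createConvolver() factory methods as in the current draft proposal; the noteOn() method, and the impulseResponseBuffer variable standing in for a loaded impulse response, are likewise assumptions and may change.

// Minimal sketch, assuming draft API names (AudioContext, createBufferSource,
// createPanner, createConvolver, noteOn) -- exact names may change.

var context = new AudioContext();

// A shared "room effect": one convolution reverb that all sources feed into.
var reverb = context.createConvolver();
reverb.buffer = impulseResponseBuffer;   // assumed to be loaded elsewhere
reverb.connect(context.destination);

// Each moving source gets its own panner for spatialization.
function playSpatializedSound(buffer, x, y, z) {
    var source = context.createBufferSource();
    source.buffer = buffer;

    var panner = context.createPanner();
    panner.setPosition(x, y, z);

    source.connect(panner);
    panner.connect(context.destination);  // dry (direct) path
    panner.connect(reverb);               // wet (room effect) send

    source.noteOn(0);   // draft name for "start playing now"
    return source;
}

Note the structure: a per-source panner plus a single shared convolver. This is exactly the kind of graph that is cheap to express with a high-level native API but very costly to re-implement in pure JS.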
*DAW Applications*

For DAW (digital audio workstation) type applications, once you start to consider multiple audio sources, with multiple insert effects on each source, send effects, submixes, parameter automation with de-zippering, etc., the computational demands can become fairly significant.

Here are two examples where I think direct JS processing would be attractive:

*Custom DSP Effects*

Unusual and interesting custom audio processing can be done directly in JS. It's also a good test-bed for prototyping new algorithms. This is an extremely rich area.

*Educational Applications*

JS processing is ideal for illustrating concepts in computer music synthesis and processing, such as Corban's example of the decomposition of a square wave into its harmonic components, FM synthesis techniques, etc.

==============================================================================

*Hybrid Approaches*

Because of the modular nature of the AudioNode approach, it's easily possible to combine native processing and direct JS processing. Some applications may benefit from using both approaches at the same time and can get the best of both worlds. A rough sketch of such a hybrid graph follows below.
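As a rough illustration only: this sketch assumes the draft's JavaScriptAudioNode, created via createJavaScriptNode(bufferSize, numInputs, numOutputs), with an onaudioprocess event carrying inputBuffer/outputBuffer; those names and signatures, plus the myBuffer and impulseResponseBuffer variables, are assumptions and may differ from what finally ships.

// Hybrid sketch: native convolution feeding custom JS processing.

var context = new AudioContext();

var source = context.createBufferSource();
source.buffer = myBuffer;   // assumed to be loaded elsewhere

// Native stage: convolution reverb runs in optimized native code.
var reverb = context.createConvolver();
reverb.buffer = impulseResponseBuffer;   // assumed to be loaded elsewhere

// JS stage: a small custom effect written directly in JavaScript.
var jsNode = context.createJavaScriptNode(4096, 1, 1);
jsNode.onaudioprocess = function(event) {
    var input = event.inputBuffer.getChannelData(0);
    var output = event.outputBuffer.getChannelData(0);
    for (var i = 0; i < input.length; i++) {
        // Simple clipping distortion -- the kind of unusual custom DSP
        // that is easy to prototype in JS.
        output[i] = Math.max(-1, Math.min(1, 3 * input[i]));
    }
};

// Native -> JS -> native destination: both kinds of processing in one graph.
source.connect(reverb);
reverb.connect(jsNode);
jsNode.connect(context.destination);

source.noteOn(0);

The expensive convolution stays native while only the part that actually needs to be custom runs in the JS callback, so each approach is used where it's strongest.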
Best Regards,
Chris

On Sat, Aug 7, 2010 at 8:37 AM, Ted Moallem <ted.moallem@gmail.com> wrote:

> We are discussing FFT analysis as though it were the one solution for
> all of our spectral processing needs. The algorithm is convenient when
> the "overhead" is tolerable, but considering the variety of web-capable
> platforms out there, the future of web audio might best be served by
> steering developers toward more efficient signal processing schemes,
> tailored to task requirements. For example, does 3D spatialization
> require hundreds of large-sized FFT's per second, or can it be
> accomplished using low-order Butterworth filters with time-varying
> coefficients? (If anyone knows the answer, please feel free to chime
> in.) In any case, rather than considering whether or not javascript
> can handle the requirements of a modern-day first person shooter game
> (or peaceful bird-watcher game), I regard javascript audio API as a
> challenge to work within reasonable constraints where possible, to
> meet our web audio needs without monopolizing all of a device's
> resources.
>
> [descends from soapbox]
>
> -ted
>
> On Sat, Aug 7, 2010 at 3:25 AM, Chris Rogers <crogers@google.com> wrote:
> >> > Very nice. Overhead on my machine is very low (20%) and I think at
> >> > least half that overhead is WebGL rendering. It would be nice to
> >> > duplicate the functionality of the Realtime Analyzer demo so we can
> >> > understand the difference in overhead between doing FFT's in JS vs
> >> > native code.
> >> >
> >> > -----
> >> > ~Chris
> >> > cmarrin@apple.com
> >> >
> >> > Sure, I can do that. I know that the Mozilla folks have already
> >> > done this and found the JS FFT performance to be acceptable for
> >> > realtime analysis. Where the FFT overhead gets quite a lot heavier
> >> > is in panning/spatialization and convolution where there are
> >> > hundreds of larger sized FFTs per second.
> >>
> >> It would be useful to have an apples-to-apples comparison. What
> >> sample rate does the Realtime Analyzer demo use? It would be nice to
> >> do a test of 48KHz stereo, just to see how much we can stress it in
> >> JS.
> >>
> >> -----
> >> ~Chris
> >> cmarrin@apple.com
> >
> > It normally runs at 44.1KHz. I think they were getting results like
> > 0.4ms per size 1024 FFT, which is not very heavy for doing a fairly
> > standard real-time analysis. When I get some time, I can try to see
> > what results we get for JSC and V8 in WebKit. I imagine we'll see
> > something similar.
> > Chris
>
> --
> ________________________________
>
> Theodore Moallem
> moallem@mit.edu
> 646-872-0283
>
> Sensory Communication Group
> Research Laboratory of Electronics at MIT
> Graduate Program in Speech & Hearing Bioscience and Technology
> Harvard-MIT Division of Health Sciences and Technology
>
> --
>
> "My instincts always tend to revolt against helplessness of any kind."
> -- NK
Received on Saturday, 7 August 2010 21:10:48 UTC