
Re: Aiding early implementations of the web audio API

From: Colin Clark <colinbdclark@gmail.com>
Date: Tue, 22 May 2012 15:27:34 -0400
Cc: Marcus Geelnard <mage@opera.com>, public-audio@w3.org, Alistair MacDonald <al@signedon.com>
Message-Id: <873DF1FB-1C2F-4373-AC22-77C89AEE7EEF@gmail.com>
To: Jussi Kalliokoski <jussi.kalliokoski@gmail.com>, Chris Wilson <cwilso@google.com>

Hi all,

This is a great discussion. A few comments inline:

On Tue, May 22, 2012 at 8:55 PM, Chris Wilson <cwilso@google.com> wrote:
> The easiest interface would just be to have an output device stream.  However, I think having a basic audio toolbox in the form of node types will cause an explosion of audio applications - building the vocoder example was illustrative to me, because I ended up using about half of the node types, and found them to be fantastically easy to build on.  Frankly, if they hadn't been there, I wouldn't have built the vocoder, because it would have been too complex for me to take on.  After working through a number of other scenarios in my mind, I'm left with the same feeling - having this set of node types fulfills most of the needs that I can envision, and the few I've thought of that aren't covered, I'm happy to use JS nodes for.  The only place where I'm personally not entirely convinced is that I think I would personally trade the DynamicsCompressorNode for an envelope follower node.  Maybe that's just because I'd rather hack noise gates, auto-wah effects, etc., without dropping into JS node.

Chris, I think it's great that you've had such a good experience creating cool demos with the building blocks provided by the Web Audio API. There are some really great features built right in, and I agree that they're quite powerful. I'm looking forward to seeing your vocoder demo!
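As an aside, the envelope follower you mention is a nice example of something that is already quite small in script. Here's a minimal one-pole sketch - names are my own and this is illustrative rather than production-ready, the kind of kernel one might call from a JS node's processing callback:

```javascript
// One-pole envelope follower: rectify the input, then smooth it with
// separate attack and release coefficients so the envelope rises quickly
// and decays slowly. Useful for noise gates, auto-wah effects, etc.
function envelopeFollower(input, sampleRate, attackMs, releaseMs) {
  var attackCoeff = Math.exp(-1 / (sampleRate * attackMs / 1000));
  var releaseCoeff = Math.exp(-1 / (sampleRate * releaseMs / 1000));
  var out = new Float32Array(input.length);
  var env = 0;
  for (var i = 0; i < input.length; i++) {
    var level = Math.abs(input[i]); // rectify
    // A rising signal uses the attack coefficient, a falling one release.
    var coeff = level > env ? attackCoeff : releaseCoeff;
    env = coeff * env + (1 - coeff) * level;
    out[i] = env;
  }
  return out;
}
```

Feeding it a steady full-scale signal, the envelope rises toward 1.0 at a rate set by the attack time; the same few lines give you the detector for a noise gate or compressor sidechain.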

That said, I think you'll find that as you go deeper into synthesis and audio processing, you won't be able to avoid the need for processing units that don't ship with the Web Audio API. For example, if you want to create a realistic-sounding model of an analog synthesizer, you'll need band-limited oscillators along the lines of:


... as well as other novel types of filters and processing units not included in the current spec. To get a sense of the kinds of building blocks provided by a sophisticated synthesis toolkit, have a look at what ships with popular development environments like SuperCollider and Max/MSP:


We can't possibly shoehorn all of these into a spec that would be manageable for every browser vendor to implement, so it's clear that if we want to enable innovative new sounds on the Web, JavaScript nodes are going to be a critical part of it. The more the spec can expose the fast underlying primitives of the Web Audio API to the JavaScript author (FFTs, the convolution engine, etc.), along with support for worker-based synthesis, the better the experience will be for everyone.
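To make the band-limited oscillator point concrete, here's a rough additive-synthesis sketch of the kind of thing I mean - unoptimized, names my own, and nothing like the table-based approaches a real implementation would use:

```javascript
// Band-limited sawtooth via additive synthesis: sum sine harmonics only
// up to the Nyquist frequency, so the result is alias-free (unlike a
// naive ramp oscillator, which aliases badly at audio rates).
function bandlimitedSaw(freq, sampleRate, numSamples) {
  var out = new Float32Array(numSamples);
  var maxHarmonic = Math.floor((sampleRate / 2) / freq);
  for (var n = 0; n < numSamples; n++) {
    var t = n / sampleRate;
    var sum = 0;
    for (var k = 1; k <= maxHarmonic; k++) {
      // Alternating-sign harmonics at amplitude 1/k build a sawtooth.
      sum += (k % 2 === 1 ? 1 : -1) * Math.sin(2 * Math.PI * k * freq * t) / k;
    }
    out[n] = (2 / Math.PI) * sum;
  }
  return out;
}
```

Even this naive version is O(harmonics) per sample, which is exactly why fast native primitives and real workers matter for script-based synthesis.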

>> Have any performance comparisons been made between the native nodes and their corresponding JavaScript implementations? I'm quite sure that native implementations will be faster (perhaps significantly in several cases), and I can also make some guesses as to which nodes would be actual performance bottlenecks, but to what extent?
> I don't think we've implemented everything twice, once in JavaScript and once in native code, and optimized their performance, no.  The best comparison would, I suppose, be any work that Robert did for effects in the MSP proposal.

I certainly have benchmarks for many of the unit generators I've implemented in the Flocking synthesis framework. Due to its architecture, though, I don't think I could write benchmarks in JavaScript to compare against the native AudioNode implementations. Perhaps a benchmarking suite would be a useful thing to have as part of a test suite for the spec?
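For illustration, the shape of micro-benchmark I have in mind is roughly this (a sketch only, not Flocking's actual harness - a serious version would use a high-resolution timer and repeated runs):

```javascript
// Crude micro-benchmark: time how long a unit generator takes to fill
// a given number of sample blocks.
function benchmarkUgen(fillBlock, numBlocks, blockSize) {
  var buf = new Float32Array(blockSize);
  var start = Date.now();
  for (var i = 0; i < numBlocks; i++) {
    fillBlock(buf);
  }
  return Date.now() - start; // elapsed milliseconds
}

// Example unit generator to measure: a plain sine oscillator.
var phase = 0;
function sineBlock(buf) {
  var inc = 2 * Math.PI * 440 / 44100;
  for (var i = 0; i < buf.length; i++) {
    buf[i] = Math.sin(phase);
    phase += inc;
  }
}

var elapsed = benchmarkUgen(sineBlock, 1000, 512);
```

Comparing numbers like these against the equivalent native node, per node type, would tell us where the real bottlenecks are rather than where we guess they are.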

On 2012-05-22, at 2:06 PM, Jussi Kalliokoski wrote:

>> Hmm.  I understand what you're suggesting, but I'm a little concerned that only handing tools to developers that say "perform a convolution on an arbitrary n-dimensional array of data" and hoping they figure out how to apply it to make reverb, as well as image blurring effects, is not the right approach.  I don't think everything should be roll it yourself from the bottom level.
> This is where JS libraries come in. There's already a variety of frameworks to make these concepts more easily approachable, all quite different with their pros and cons. If you look at web APIs in general, the common pattern is that the required features are specified, then different frameworks evolve and possibly a later effort is made to standardize some APIs that have become used widely enough so that it makes sense to standardize them, either to allow the frameworks to tap into performance benefits or just define parts of the frameworks that all the frameworks share as a standard. This approach allows for the "cows to pave the path", so that the web platform isn't overspecified with APIs that nobody wants to use (I'm not saying nobody wants to use Web Audio API, heh, however I think it would be better off as a JS library).

This is really well said, Jussi! It's not an either/or situation as Chris suggests - either use the Web Audio API's built-in nodes or roll it yourself from the bottom level. If the spec is successful, there will be an array of libraries and tools for developers to choose from, built on top of the base implementation provided by browsers. We should design the spec not only with end users in mind, but also with the goal of fostering an innovative, diverse, and competitive ecosystem of libraries.
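On the convolution example specifically: the primitive itself is tiny to express in script; what's valuable to expose is the fast native version. A naive direct-form sketch, just for reference:

```javascript
// Naive direct-form convolution of a signal with an impulse response.
// This is the primitive underneath convolution reverb; a native
// implementation would use FFT-based partitioned convolution instead,
// which is exactly the kind of fast building block worth exposing.
function convolve(signal, impulse) {
  var out = new Float32Array(signal.length + impulse.length - 1);
  for (var i = 0; i < signal.length; i++) {
    for (var j = 0; j < impulse.length; j++) {
      out[i + j] += signal[i] * impulse[j];
    }
  }
  return out;
}
```

A library author can wrap this (or the native engine) in whatever higher-level "reverb" abstraction makes sense for their users - that's precisely the layering Jussi describes.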


Colin Clark
Technical Lead, Fluid Project
Received on Tuesday, 22 May 2012 19:28:29 UTC