
Re: Aiding early implementations of the web audio API

From: Jussi Kalliokoski <jussi.kalliokoski@gmail.com>
Date: Tue, 22 May 2012 21:06:47 +0300
Message-ID: <CAJhzemW=sL9UuKu=f=ztsXhekraqU8HyeSg0N6YjQgVeLr6-4w@mail.gmail.com>
To: Chris Wilson <cwilso@google.com>
Cc: Marcus Geelnard <mage@opera.com>, public-audio@w3.org, Alistair MacDonald <al@signedon.com>
On Tue, May 22, 2012 at 8:55 PM, Chris Wilson <cwilso@google.com> wrote:

> On Tue, May 22, 2012 at 7:18 AM, Jussi Kalliokoski <
> jussi.kalliokoski@gmail.com> wrote:
>
>> On Tue, May 22, 2012 at 12:13 PM, Marcus Geelnard <mage@opera.com> wrote:
>>
>>> Yes, that's true. The filter and convolver nodes are certainly useful
>>> for game sound effects (underwater, cave, etc). Other nodes can also be
>>> used for creating interesting effects, but that's not the point.
>>>
>>> My view on the issue is that the more complex the spec is, the more it
>>> will cost (spec work, implementation time, test suites, bug fixing,
>>> source/binary sizes, etc), and the more likely it is that we will have
>>> different behavior in different implementations, ranging from noticeable
>>> performance differences to noticeable differences in sound and behavior and
>>> possibly implementation-dependent corner case bugs etc.
>>>
>>> Since all nodes can be implemented in JavaScript (most of them are even
>>> trivial), the only reason for using native nodes instead of JavaScript
>>> nodes is to improve performance.
>>>
>>
>> I have similar thoughts on this. I'm starting to think we're going to
>> have serious spec bloat if we go to lengths defining all the audio building
>> blocks that are required to achieve every use case, often resulting only in
>> awkward and hacky (no offense, these aren't easy things) solutions just to
>> avoid having to use a script node that would mitigate a lot of the benefits
>> in having a native graph. I feel that it would be a good idea to strip down
>> the spec a bit, making the script nodes more first-class citizens of the
>> graph, while reducing the need for effects that are simple to implement
>> with custom scripts. I think this would make the standardization effort a
>> lot simpler as well.
>>
>
> I have to disagree with the definition of "trivial," then.  The only node
> types I think could really be considered trivial are Gain, Delay and
> WaveShaper - every other type is significantly non-trivial to me. And even
> then, when you layer on the complexity involved with handling AudioParams
> (for the gain on Gain and the delayTime on Delay), and the interpolation
> between curve points on WaveShaper, I'm not convinced they're actually
> trivial.
>
> The easiest interface would just be to have an output device stream.
> However, I think having a basic audio toolbox in the form of node types
> will cause an explosion of audio applications - building the vocoder
> example was illustrative to me, because I ended up using about half of the
> node types, and found them to be fantastically easy to build on.  Frankly,
> if they hadn't been there, I wouldn't have built the vocoder, because it
> would have been too complex for me to take on.  After working through a
> number of other scenarios in my mind, I'm left with the same feeling -
> having this set of node types fulfills most of the needs that I can
> envision, and the few I've thought of that aren't covered, I'm happy to use
> JS nodes for.  The only place where I'm personally not entirely convinced
> is that I think I would personally trade the DynamicsCompressorNode for an
> envelope follower node.  Maybe that's just because I'd rather hack noise
> gates, auto-wah effects, etc., without dropping into JS node.
>
>> I've already proposed a few things that would make the script nodes more
>> first-class in my opinion. [1] [2]
>>
>
> I'm in favor of anything necessary to make JavaScript nodes a first-class
> citizen.
>
>
>>> Have any performance comparisons been made between the native nodes and
>>> their corresponding JavaScript implementations? I'm quite sure that native
>>> implementations will be faster (perhaps significantly in several cases),
>>> and I can also make some guesses as to which nodes would be actual
>>> performance bottlenecks, but to what extent?
>>>
>>
> I don't think we've implemented everything twice, once in JavaScript and
> once in native code, and optimized their performance, no.  The best
> comparison would, I suppose, be any work that Robert did for effects in the
> MSP proposal.
>
>
>> Those things said, I believe having a fast native convolution
>> implementation is critical to games and other applications; even with
>> native implementations, convolution is a very expensive operation. But, that
>> said, and bear with me as I've said this before, it would be a good idea to
>> expose a function or a class (to keep state for performance optimizations)
>> to do convolution, rather than a node. This would be far more generally
>> useful, and I believe the browser environment could benefit from this in
>> other applications as well, such as image processing. Otherwise this will
>> end up redefined elsewhere. For convolution, this is even fairly simple to
>> do, because you could make it real-time just by taking advantage of
>> overlap-add, as the current native implementation does, so that at simplest,
>> we could have a function that would take the output array, input array and
>> an array containing the kernels in frequency domain. Of course, for more
>> general use, we'll need to define how dimensions/channels are handled, etc.
>>
>
> Hmm.  I understand what you're suggesting, but I'm a little concerned that
> only handing tools to developers that say "perform a convolution on an
> arbitrary n-dimensional array of data" and hoping they figure out how to
> apply it to make reverb, as well as image blurring effects, is not the
> right approach.  I don't think everything should be roll it yourself from
> the bottom level.
>

This is where JS libraries come in. There is already a variety of frameworks
that make these concepts more approachable, all quite different, each with
its pros and cons. If you look at web APIs in general, the common pattern
is that the required features are specified first, and then different
frameworks evolve. Possibly a later effort is made to standardize the APIs
that have become widely used enough to justify it, either to let the
frameworks tap into performance benefits or just to define, as a standard,
the parts that all the frameworks share. This approach allows for the "cows
to pave the path", so that the web platform isn't overspecified with APIs
that nobody wants to use (I'm not saying nobody wants to use the Web Audio
API, heh; however, I think it would be better off as a JS library).
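
To make the layering concrete, here is a minimal sketch of how a library
could wrap a low-level convolution primitive into something friendlier.
Everything named here is hypothetical: OverlapAddConvolver stands in for
the kind of stateful class discussed above, and applyImpulseResponse is
the sort of helper a framework might build on top of it. For brevity the
sketch convolves each block directly in the time domain rather than
taking frequency-domain kernels, so it illustrates the overlap-add
bookkeeping only, not a fast implementation.

```typescript
// Hypothetical low-level primitive: streaming convolution via overlap-add.
// Keeps state between calls so that blocks can be processed in real time.
class OverlapAddConvolver {
  // Samples from earlier blocks that spill past the end of those blocks.
  private tail: Float32Array;

  constructor(private readonly kernel: Float32Array) {
    if (kernel.length === 0) {
      throw new Error("kernel must contain at least one sample");
    }
    this.tail = new Float32Array(kernel.length - 1);
  }

  // Convolves one input block and writes the same number of samples to output.
  process(input: Float32Array, output: Float32Array): void {
    if (input.length !== output.length) {
      throw new Error("input and output must have the same length");
    }
    const n = input.length;
    const k = this.kernel.length;
    const full = new Float32Array(n + k - 1);

    // Direct (time-domain) convolution of this block with the kernel.
    for (let i = 0; i < n; i++) {
      for (let j = 0; j < k; j++) {
        full[i + j] += input[i] * this.kernel[j];
      }
    }
    // Overlap-add: mix in what previous blocks left behind.
    for (let i = 0; i < this.tail.length; i++) {
      full[i] += this.tail[i];
    }
    // The first n samples are ready; the rest becomes the new tail.
    output.set(full.subarray(0, n));
    this.tail = full.slice(n);
  }
}

// Hypothetical library-level helper built on the primitive above.
// Returns the processed signal, truncated to the input length.
function applyImpulseResponse(
  signal: Float32Array,
  impulse: Float32Array,
  blockSize = 128
): Float32Array {
  if (blockSize < 1) {
    throw new Error("blockSize must be at least 1");
  }
  const convolver = new OverlapAddConvolver(impulse);
  const wet = new Float32Array(signal.length);
  for (let start = 0; start < signal.length; start += blockSize) {
    const end = Math.min(start + blockSize, signal.length);
    // subarray() returns views, so the result is written straight into wet.
    convolver.process(signal.subarray(start, end), wet.subarray(start, end));
  }
  return wet;
}
```

A caller who only wants a reverb-like effect would then never touch the
primitive directly:

```typescript
const dry = new Float32Array([1, 0, 0, 0, 1, 0]);
const impulse = new Float32Array([1, 0.5, 0.25]);
const wet = applyImpulseResponse(dry, impulse, 2);
// Array.from(wet) → [1, 0.5, 0.25, 0, 1, 0.5]
```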

Cheers,
Jussi
Received on Tuesday, 22 May 2012 18:07:41 GMT
