
Re: Simplifying specing/testing/implementation work

From: Jussi Kalliokoski <jussi.kalliokoski@gmail.com>
Date: Sat, 21 Jul 2012 10:50:42 +0300
Message-ID: <CAJhzemUvAw-4=LhgJkeYprkdHVRdynHq9oiQ7VhFiFr76ba=kg@mail.gmail.com>
To: Chris Wilson <cwilso@google.com>
Cc: Raymond Toy <rtoy@google.com>, Marcus Geelnard <mage@opera.com>, public-audio@w3.org
Oops, heh, forgot to link:

[1] https://dvcs.w3.org/hg/FXTF/raw-file/tip/filters/index.html

On Sat, Jul 21, 2012 at 10:28 AM, Jussi Kalliokoski <
jussi.kalliokoski@gmail.com> wrote:

> I'd just like to point out that nobody is forcing you to do complex
> things - this is what frameworks are for. If we provide the tools being
> discussed, people can simply extend the graph to have all the nodes
> it's specified to have now, and it makes no difference from the
> developer's point of view, aside from including a library in the
> project. A library that might well suit their needs much better than
> the original API - after all, it's well established, even here, that
> people have very different tastes in frameworks and want things to work
> differently.
> Think of all the possibilities that open up once we provide a
> high-performance native DSP library that isn't tied to audio. We get
> all those benefits for video (decoding realtime HD video in JS today is
> next to impossible, but we might just give it a hand) and for images as
> well (with fast convolution you no longer get 1fps if you have a blur
> effect in a canvas). Things like crowd-sourced vaccination calculations
> get a whole new meaning; you could run them in your browser, and fast.
> We'd open up a door for innovation in custom DSP languages, which would
> be really awkward, if even possible, to design on top of a high-level
> node-based processing API (there's a reason in software history that
> low-level APIs prevail, by the way). Not to mention the things that
> haven't been thought of yet. Having a low-level DSP API is just the way
> to go.
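>
> A rough sketch of the kind of primitive I mean: a naive direct FIR
> convolution in plain JS. The idea is that a native, possibly
> SIMD-backed equivalent would take over this inner loop (the names here
> are just illustration, not a concrete proposal):
>
>     // Direct FIR convolution as we have to write it in JS today; a
>     // native DSP library could run this with SIMD, on audio, video
>     // and image data alike.
>     function convolve(output, input, kernel) {
>       for (var i = 0; i < output.length; i++) {
>         var acc = 0;
>         for (var k = 0; k < kernel.length && k <= i; k++) {
>           acc += kernel[k] * input[i - k];
>         }
>         output[i] = acc;
>       }
>     }
>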
> But if we provide both a DSP library for typed arrays and the
> specialized nodes, it's very much a duplication of work, and hard to
> justify.
> The "core" part of the graph, however, is reasonable to preserve. It
> provides a reasonable representation for what we need and is extensible
> both by developers and us. And by extensible to developers I mean that they
> can use the graph with their custom nodes. By extensible to us I mean that
> we can add more types of graph inputs (microphones, we already have media
> elements), we can extend the API to provide more information about the
> user's output device, or hopefully even using multiple output devices,
> without having to break things.
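>
> A minimal sketch of a custom node slotted into the graph, using the JS
> processing node as it exists in WebKit today (the bitcrusher effect and
> the source node are just assumptions for the example):
>
>     var context = new webkitAudioContext();
>     var bitcrusher = context.createJavaScriptNode(1024, 1, 1);
>     bitcrusher.onaudioprocess = function (e) {
>       var input = e.inputBuffer.getChannelData(0);
>       var output = e.outputBuffer.getChannelData(0);
>       for (var i = 0; i < input.length; i++) {
>         // The custom DSP lives here; quantize to 8 bits as an example.
>         output[i] = Math.round(input[i] * 128) / 128;
>       }
>     };
>     source.connect(bitcrusher);   // source: any existing AudioNode
>     bitcrusher.connect(context.destination);
>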
> AudioParams are also a useful tool, as they simplify communication with
> the nodes of the graph; having to postMessage control changes to a
> worker is not only awkward but can be slow as well.
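>
> For instance, a sample-accurate two-second fade-out is two lines of
> automation against the current draft, with no worker messaging round
> trip at all (reusing the context and source from the sketch above):
>
>     var gain = context.createGainNode();
>     source.connect(gain);
>     gain.connect(context.destination);
>     // Schedule the fade; the graph handles the per-sample ramping.
>     gain.gain.setValueAtTime(1.0, context.currentTime);
>     gain.gain.linearRampToValueAtTime(0.0, context.currentTime + 2.0);
>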
> I think any specialized processing node just invites the question "why
> not this or that feature, too?" Just look at what happened to FFmpeg,
> heh. Or CSS Filter Effects [1], for that matter. With low-level
> primitives we can just say "there's this general tool, use it"; with
> high-level APIs we can only say "yeah, we can always add more special
> cases..." or "no, sorry, you can't do that". It's a rat's nest, I tell
> you! ^^
> Cheers,
> Jussi
> On Thu, Jul 19, 2012 at 8:55 PM, Chris Wilson <cwilso@google.com> wrote:
>> I'd like to request that we not plan any grand changes here until
>> Chris is back from vacation (end of the month).  I'd also like to
>> explicitly separate my opinion, detailed below, from his, since we are
>> coming at the API from distinctly different angles (I'm mostly a
>> consumer of the API, he's an API designer) and backgrounds (he's an
>> audio engineering expert, and I'm a hack who likes playing around with
>> things that go bing!), and, despite both working for Google, we aren't
>> always in agreement.  :)
>> My opinion, in short: I oppose the idea of having a "core spec" as
>> captured above.  I think it will simply become a way for implementers
>> to skip large parts of the API, while causing confusion and
>> compatibility problems for developers using the API.
>> I think considering JSNode* as the core around which most audio apps
>> will be built is incorrect.  I've now built a half-dozen relatively
>> complex audio applications - the Vocoder
>> <http://webaudiovocoder.appspot.com/>, the Web Audio Playground
>> <http://webaudioplayground.appspot.com/>, my in-progress DJ deck
>> <http://cwilsotest.appspot.com/wubwubwub/index.html>, a couple of
>> synthesizers, and a few others I'm not ready to show off yet.  If I
>> had to use JSNode to create my own delays, or build filters by setting
>> up my own FFT matrices, etc., quite frankly I would be off doing
>> something else.  I think recognizing these features as basic audio
>> tools is critical; the point of the API, as I've gotten to know it, is
>> to enable powerful audio applications WITHOUT requiring a degree in
>> digital signal processing.  In the Web Audio coding I've done, I've
>> used JSNode exactly once - and that was just to test it out.  I have
>> found zero need for it in the apps I've built, because it's been more
>> performant, as well as far, far easier, to use the tools provided for
>> me.
>> If the "core spec" is buffers, JSNodes, and AudioNode, I see this as an
>> ultimately futile and delaying tactic for getting powerful audio apps built
>> by those without - very much like we had a "CSS1 Core" spec for a while.
>>  If the goal is simply to expose the audio output (and presumably input)
>> mechanism, then I'm not sure why an AudioData API-like write() API is not a
>> much simpler solution - if there's no other node types than JSNode, I'm not
>> sure what value the Node routing system provides.
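>>
>> (For reference, a write()-style output API along the lines of
>> Mozilla's Audio Data API looks roughly like this - a sketch of the
>> Firefox-only interface, from memory:)
>>
>>     var audio = new Audio();
>>     audio.mozSetup(2, 44100);              // channels, sample rate
>>     var samples = new Float32Array(4096);  // interleaved stereo frames
>>     // ...generate samples...
>>     var written = audio.mozWriteAudio(samples); // samples accepted
>>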
>> Ultimately, I think a lot of game developers in particular will want to
>> use the built-in native processing.  If the AudioNode types like Filter and
>> Convolver aren't required in an implementation, then either we are creating
>> a much more complex compatibility matrix - like we did with CSS1 Core, but
>> worse - or they won't be able to rely on those features, in which case I'm
>> not sure why we have a routing system.
>> That said - I do agree (as I think Chris does also) that JSNode isn't
>> where it needs to be.  It DOES need AudioParam support, support for
>> varying numbers of inputs/outputs/channels, and especially
>> worker-based processing.  But just because it COULD be used to
>> implement DelayNode doesn't mean DelayNode shouldn't be required.
>> I'm also not opposed to a new API for doing signal processing on Typed
>> Arrays in JavaScript.  But again, I'd much rather have the simple
>> interface of BiquadFilterNode to use than have to implement my own
>> filter via that interface - I see that as a much more complex tool,
>> for when I NEED to build my own tools.
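>>
>> That simple interface really is just a handful of lines (a sketch
>> against the current draft; the numeric type constants may still
>> change):
>>
>>     var filter = context.createBiquadFilter();
>>     filter.type = filter.LOWPASS;    // constant 0 in the current draft
>>     filter.frequency.value = 440;    // cutoff in Hz
>>     filter.Q.value = 10;             // resonance
>>     source.connect(filter);
>>     filter.connect(context.destination);
>>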
>> All this aside, I do believe the spec has to clearly specify how to
>> implement interoperable code, and I recognize that it is not there today.
>> -Chris
>> *I use "JSNode" as shorthand for "programmable node that the developer
>> has to implement themselves" - that is, independent of whether it's
>> JavaScript or some other programming language.
>> On Thu, Jul 19, 2012 at 9:44 AM, Raymond Toy <rtoy@google.com> wrote:
>>> On Thu, Jul 19, 2012 at 7:11 AM, Jussi Kalliokoski <
>>> jussi.kalliokoski@gmail.com> wrote:
>>>> Obviously SIMD code is faster than plain addition in JS now, for
>>>> example. And yes, an IIR filter is a type of convolution, but I
>>>> don't think it's possible to write an efficient IIR filter algorithm
>>>> using a convolution engine; after all, a convolution engine should
>>>> be designed to deal with FIRs. Not to mention that common IIR
>>>> filters have 4 kernel types (LP, HP, BP, N), which would be really
>>>> inefficient for a FastConvolution algorithm, even if it supported
>>>> FIR. And as far as IIR filter performance goes, I think SIMD
>>>> instructions offer very little usefulness in IIR algorithms, since
>>>> they're so inherently serial.
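>>>>
>>>> (To illustrate the serial dependency - a direct form I biquad
>>>> sketch; each output sample feeds the next iterations through the
>>>> y1/y2 terms, so the loop can't be vectorized across samples:)
>>>>
>>>>     function biquad(x, y, b0, b1, b2, a1, a2) {
>>>>       var x1 = 0, x2 = 0, y1 = 0, y2 = 0;
>>>>       for (var n = 0; n < x.length; n++) {
>>>>         // Feedback: y[n] depends on y[n-1] and y[n-2].
>>>>         y[n] = b0 * x[n] + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2;
>>>>         x2 = x1; x1 = x[n];
>>>>         y2 = y1; y1 = y[n];
>>>>       }
>>>>     }
>>>>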
>>>  https://bugs.webkit.org/show_bug.cgi?id=75528 says that adding SIMD
>>> gives a 45% improvement.
>>> Ray
Received on Saturday, 21 July 2012 07:51:10 UTC
