Simplifying specing/testing/implementation work from Marcus Geelnard on 2012-07-19 (public-audio@w3.org from July to September 2012)

From: Marcus Geelnard <mage@opera.com>
Date: Thu, 19 Jul 2012 13:03:26 +0200
To: public-audio@w3.org
Message-ID: <op.whowz0kmm77heq@mage-desktop>

Hi group!

We have been over this many times before, but since some things are taking  
quite some time (getting the semantics detailed in the spec, getting  
started with test cases, making the API support more use cases etc) I'd  
like to get back to what Olivier brought up in [1], i.e. splitting the  
spec into two (or more) levels.

We could basically have the "core" part of the API as the most primitive  
level. I suppose it would include:

* AudioContext
* AudioNode
* JavaScriptAudioNode (new name, please)
* AudioDestinationNode
* AudioParam
* AudioBuffer

The rest, which would mostly fall under the category "signal processing",  
would be included in the next level (or levels).

This way we can start creating tests and doing implementation much faster,  
not to mention that the "core" spec will become much more manageable.

Now, if we make sure to "fix" the JavaScriptAudioNode so that it becomes a  
first class citizen (e.g. support for AudioParam, support for varying  
number of inputs/outputs/channels, worker-based processing, etc), most of  
the higher level functionality should be possible to implement using the  
JavaScriptAudioNode (except possibly MediaElementAudioSourceNode?).
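
To illustrate the idea, here is a minimal sketch of how a gain-style node could be expressed as a per-block callback. The function name processGain and its (inputs, outputs, gain) signature are hypothetical and not part of the current JavaScriptAudioNode API. The sketch assumes one Float32Array per channel and a per-sample gain array standing in for an AudioParam.

```javascript
// Hypothetical per-block callback; NOT the actual JavaScriptAudioNode API.
// inputs, outputs: arrays of Float32Array, one per channel, equal block length.
// gain: Float32Array with one value per sample frame (stands in for an AudioParam).
function processGain(inputs, outputs, gain) {
  // Handle a varying channel count by processing only the channels both sides have.
  const channels = Math.min(inputs.length, outputs.length);
  for (let c = 0; c < channels; c++) {
    const src = inputs[c];
    const dst = outputs[c];
    for (let i = 0; i < dst.length; i++) {
      dst[i] = src[i] * gain[i];
    }
  }
}

// Usage: two channels, one block of four frames, gain ramping up.
const outputs = [new Float32Array(4), new Float32Array(4)];
processGain(
  [new Float32Array([1, 2, 3, 4]), new Float32Array([2, 2, 2, 2])],
  outputs,
  new Float32Array([0, 0.5, 1, 2])
);
// outputs[0] is now [0, 1, 3, 8] and outputs[1] is [0, 1, 2, 4]
```

Because the callback only touches typed arrays, the same function could in principle run in a worker.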

Furthermore, I would like to suggest (as has been discussed before) that  
the Audio WG introduce a new API for doing signal processing on Typed  
Arrays in JavaScript. Ideally it would expose a number of methods that are  
hosted in a separate interface (e.g. named "DSP") that is available to  
both the main context and Web worker contexts, similarly to how the Math  
interface works.

I've done some work on a draft for such an interface, and based on what  
operations are typical for the Audio API and also based on some  
benchmarking (JS vs native), the interface should probably include: FFT,  
filter (IIR), convolve (special case of filter), interpolation, plus a  
range of simple arithmetic and Math-like operations.
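
As a rough sketch of what "convolve as a special case of filter" could mean, the following plain-JS reference code implements a direct-form IIR filter and expresses convolution through it. The object name DSP follows the example name above, but the method names, argument order and the truncation of the output to the input length are assumptions for illustration only, not taken from the draft.

```javascript
// Hypothetical sketch; names and signatures are assumptions, not a spec.
const DSP = {
  // IIR filter, direct form I:
  //   a[0]*y[n] = sum_k b[k]*x[n-k] - sum_{k>=1} a[k]*y[n-k]
  // dst and x are typed arrays of equal length; dst must not alias x.
  filter(dst, x, b, a) {
    for (let n = 0; n < x.length; n++) {
      let acc = 0;
      // Feed-forward part (samples before the block start are treated as zero).
      for (let k = 0; k < b.length && k <= n; k++) acc += b[k] * x[n - k];
      // Feedback part, using previously computed outputs.
      for (let k = 1; k < a.length && k <= n; k++) acc -= a[k] * dst[n - k];
      dst[n] = acc / a[0];
    }
    return dst;
  },

  // Convolution with kernel h is a filter with b = h and no feedback (a = [1]).
  // The output is truncated to x.length samples.
  convolve(dst, x, h) {
    return DSP.filter(dst, x, h, [1]);
  },
};

// Usage
const y = DSP.convolve(new Float32Array(3), new Float32Array([1, 2, 3]), [1, 1]);
// y is [1, 3, 5]
const z = DSP.filter(new Float32Array(4), new Float32Array([1, 0, 0, 0]), [1], [1, -0.5]);
// z is [1, 0.5, 0.25, 0.125] (impulse response of a one-pole filter)
```

A native implementation would presumably keep the same semantics while using SIMD or FFT-based methods internally, which is where the speedup over plain JS would come from.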

The merits of such an API would be many:

* Very simple to specify, implement & test.
* It would bring JS-based processing performance pretty much on par with  
native AudioNodes.
* The specification of higher level AudioNodes could refer to the DSP spec  
for implementation details.
* As a Web developer you're free to customize AudioNodes if they do not  
fulfill all your needs, by re-implementing and extending them in JS, or  
even by creating exciting new nodes.
* You would be able to use the native DSP horsepower of your computer for  
things other than the Audio API (e.g. for things like voice recognition,  
SETI@home-like applications, etc) without having to make ugly abuses of  
the AudioContext.
* The time-to-market for new Audio API functionality would be close to  
zero, since you can likely shim it using JS+DSP.

Any comments? Would this be a good strategy?


/Marcus



[1] http://lists.w3.org/Archives/Public/public-audio/2012AprJun/0388.html


-- 
Marcus Geelnard
Core Graphics Developer
Opera Software ASA
Received on Thursday, 19 July 2012 11:03:54 UTC