
Re: Simplifying specing/testing/implementation work

From: Marcus Geelnard <mage@opera.com>
Date: Thu, 19 Jul 2012 14:04:03 +0200
To: "olivier Thereaux" <olivier.thereaux@bbc.co.uk>
Cc: public-audio@w3.org
Message-ID: <op.whozs1x9m77heq@mage-desktop>
On 2012-07-19 13:28:40, olivier Thereaux  
<olivier.thereaux@bbc.co.uk> wrote:

> Hi Marcus, thanks for bringing this discussion back to light. I realise  
> I should have pushed for it a little more before we started the whole  
> rechartering process…
>
> On 19 Jul 2012, at 12:03, Marcus Geelnard wrote:
>> We could basically have the "core" part of the API as the most  
>> primitive level.
>> […]
>> The rest, which would mostly fall under the category "signal  
>> processing", would be included in the next level (or levels).
>>
>> This way we can start creating tests and doing implementation much  
>> faster, not to mention that the "core" spec will become much more  
>> manageable.
>
> Yes, I'd be curious to hear from members currently looking at  
> implementing the API about this. I am quite positive about the idea of  
> splitting the spec into core and modules (or levels) in principle.  
> However, the split, if any, has to 1) make architectural sense and 2)  
> not create such a complex net of dependencies that each spec will wait  
> for the others before progressing through the standard process.

I agree. We have to be careful that the split actually eases the work  
rather than complicating it, and the core level must provide enough  
functionality on its own to produce any useful sound.

>> Furthermore, I would like to suggest (as has been discussed before)  
>> that the Audio WG introduces a new API for doing signal processing on  
>> Typed Arrays in JavaScript. Ideally it would expose a number of methods  
>> that are hosted in a separate interface (e.g. named "DSP") that is  
>> available to both the main context and Web worker contexts, similarly  
>> to how the Math interface works.
>>
>> I've done some work on a draft for such an interface, and based on what  
>> operations are typical for the Audio API and also based on some  
>> benchmarking (JS vs native), the interface should probably include:  
>> FFT, filter (IIR), convolve (special case of filter), interpolation,  
>> plus a range of simple arithmetic and Math-like operations.
>
> This has been floated a few times indeed. Again, the big question for me  
> is whether layering specs would be wonderful in principle, but  
> horrendous to implement and bad for performance.

Regarding the DSP API suggestion, I wouldn't mind making it part of the  
core specification if that makes the process simpler. Logically, though,  
it feels like a separate spec (just managed by the Audio WG). Also, the  
notion of using it as a reference for the higher-level functions was just  
an idea that would make sense IF the DSP spec had a faster path through  
the process than the Audio API as a whole (and it is quite natural, since  
the two APIs would typically have to do similar or identical things  
semantically).
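To make the shape of the proposal concrete, here is a minimal plain-JS sketch of what such a Math-like interface could look like. The name "DSP", the method names, and the signatures are my illustrative assumptions, not spec text; a native implementation would of course use SIMD/vendor DSP routines rather than these naive loops.

```javascript
// Hypothetical stand-in for the proposed Math-like "DSP" interface.
// All names and signatures here are assumptions for illustration only.
const DSP = {
  // Elementwise addition of two Float32Arrays (one of the proposed
  // "simple arithmetic" operations), returning a new array.
  add(a, b) {
    const out = new Float32Array(a.length);
    for (let i = 0; i < a.length; i++) out[i] = a[i] + b[i];
    return out;
  },

  // Direct-form FIR convolution (the "convolve" special case of filter).
  // Output length is signal.length + kernel.length - 1.
  convolve(signal, kernel) {
    const out = new Float32Array(signal.length + kernel.length - 1);
    for (let i = 0; i < signal.length; i++) {
      for (let j = 0; j < kernel.length; j++) {
        out[i + j] += signal[i] * kernel[j];
      }
    }
    return out;
  },
};

// Example: smooth a short signal with a two-tap averaging kernel.
const y = DSP.convolve(new Float32Array([1, 2, 3]),
                       new Float32Array([0.5, 0.5]));
// y is [0.5, 1.5, 2.5, 1.5]
```

Because the interface is stateless and operates on Typed Arrays, the same object could be exposed to both the main context and Web workers, as the proposal suggests.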

>> * You would be able to use the native DSP horsepowers of your computer  
>> for other things than the Audio API (e.g. for things like voice  
>> recognition, SETI@home-like applications, etc) without having to make  
>> ugly abuses of the AudioContext.
>
> Would video processing also be a use case for this? Do we know of other  
> groups for which this would solve one of their needs? Do we know of any  
> similar work being done?

Actually, I've looked at the possibility of bringing in support for 2D  
and 3D too, but I fear there is a risk of premature API bloat if we do  
so (we might end up with something like the massive number of functions  
found in Matlab, for instance [1], not to mention awkward layout  
interpretations of typed arrays). Also, more ambitious projects like  
Rivertrail and WebCL might be a better fit for 2D/3D data/signal  
processing.

I think the notion of a DSP object makes the most sense from an audio  
perspective, especially considering that the Audio API will require  
implementers to support a wide range of 1D signal processing operations  
anyway. It just so happens that those operations are currently accessed  
through AudioNode interfaces - my proposal is to expose them to  
JavaScript too.
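As one example of such an operation: the recursive filtering that BiquadFilterNode performs internally could be expressed directly over a Float32Array. The sketch below is a naive plain-JS version of the proposed "filter (IIR)" operation; the function name and coefficient convention (a[0] assumed to be 1) are my assumptions for illustration.

```javascript
// Naive direct-form IIR filter over a Float32Array: the kind of 1D
// operation an Audio API implementation must support internally, and
// which the proposal would expose to script. Illustrative sketch only.
//   y[n] = sum_i b[i]*x[n-i] - sum_j a[j]*y[n-j]   (j >= 1, a[0] == 1)
function iirFilter(b, a, x) {
  const y = new Float32Array(x.length);
  for (let n = 0; n < x.length; n++) {
    let acc = 0;
    // Feed-forward part.
    for (let i = 0; i < b.length; i++) {
      if (n - i >= 0) acc += b[i] * x[n - i];
    }
    // Feedback part (skip a[0], which is the normalization).
    for (let j = 1; j < a.length; j++) {
      if (n - j >= 0) acc -= a[j] * y[n - j];
    }
    y[n] = acc;
  }
  return y;
}

// Example: one-pole lowpass y[n] = 0.5*x[n] + 0.5*y[n-1],
// applied to a unit impulse.
const h = iirFilter([0.5], [1, -0.5], new Float32Array([1, 0, 0, 0]));
// h is the impulse response: 0.5, 0.25, 0.125, 0.0625
```

A native DSP interface could run the same recurrence with vectorized code, which is exactly where the JS-vs-native benchmarking mentioned above showed the largest gap.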

Regards,

   Marcus



[1] http://www.mathworks.se/help/techdoc/ref/f16-48518.html

-- 
Marcus Geelnard
Core Graphics Developer
Opera Software ASA
Received on Thursday, 19 July 2012 12:04:37 GMT
