- From: Chris Rogers <crogers@google.com>
- Date: Mon, 18 Oct 2010 12:40:37 -0700
- To: Ricard Marxer Piñón <ricardmp@gmail.com>
- Cc: public-xg-audio@w3.org
- Message-ID: <AANLkTi=4PDU_fMD8AH1WDXs4Q88Y+Wwh79EKaSarUDAE@mail.gmail.com>
On Mon, Oct 18, 2010 at 4:02 AM, Ricard Marxer Piñón <ricardmp@gmail.com> wrote:

> Hi Chris,
>
> Thanks for the response. I understand the reasons for your choices
> better now. See below for some open questions and possible alternatives.
>
> >> Thoughts about the RealtimeAnalyzer
> >> ------------------------
> >> As I have expressed earlier, I think this is quite a vague node that
> >> is very specific to visualization. I think a much better node (with a
> >> more determined behavior) would be an FFTNode. This node would simply
> >> perform an FFT (it would also be important to allow it to perform the
> >> IFFT) and give access to the magnitude and phase (or real and
> >> imaginary parts). This node would be extremely useful not only for
> >> visualization, but for analysis, synthesis, and frequency-domain
> >> effects.
> >
> > If we decide to implement an FFTNode and IFFTNode, then we would also
> > have to invent several interesting intermediate AudioNodes which
> > process in the frequency domain. What would these nodes be?
>
> I think this is not really necessary. We could just have a
> JavaScriptFFTProcessorNode (OK, not the best name) or something
> similar that would take as input the real and imaginary parts of the
> spectrum (or magnitude and phase). We would just need to connect it in
> the following way:
>
> FFTNode -> JavaScriptFFTProcessorNode -> IFFTNode
>
> Then someone can use this processor node to modify or visualize the
> FFT using JavaScript.

If we have such an API, then wouldn't it be easier to just have the
JavaScriptFFTProcessorNode automatically do the FFT and IFFT? Then we
wouldn't need the FFTNode and the IFFTNode at all.

The JavaScriptFFTProcessorNode would need to be created with the following
attributes:

* FFT size
* step size (for overlapping FFT windows)

  For small "step sizes" (for example, 8x overlapping windows) there may be
  difficulties in getting the JS event listener called frequently enough,
  since it could be a very fast callback rate.
  They might "beat" against the timers for the graphics animation.

* window type (Hamming, Blackman, etc.)

Although I do appreciate that it's more efficient to do the FFT and IFFT in
native code, it looks like you're proposing to manipulate the complex
analysis data directly in JavaScript, and any non-trivial algorithm for
processing each frame (such as time-stretching) could easily take as much
time as, or more than, the FFT and IFFT themselves.

It would be good to have some examples coded in JavaScript (at first doing
the FFT and IFFT directly in JS). Then we can try to measure the
performance of the various parts of the processing to see how much benefit
we would get. Another possibility would be for you to hack into the WebKit
audio code, add a JavaScriptFFTProcessorNode yourself, and compare its
performance to the purely JS version.

One of the reasons I'm pushing back a little is that there's a cost to
every new API added to the audio specification: the complexity it adds to
the specification process, and to getting working implementations in the
various browsers. I think the AudioNodes which exist so far are fairly
standard audio building blocks that are very likely to be useful in a
large number of different types of audio applications.

It's not that I don't like the idea of the FFT processing. I spent a few
years of my career working on this type of thing with SVP and AudioSculpt
at IRCAM. But these nodes are more specialized, and I'd like to consider
the alternatives before creating a new specialized AudioNode.

But that's just my opinion, and we can keep the debate open if you like :)

Cheers,
Chris
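[To make the callback-rate concern concrete, here is a minimal sketch. The node name JavaScriptFFTProcessorNode and its parameters are hypothetical (nothing here is specified anywhere); the arithmetic simply relates FFT size, overlap factor, and the rate at which a JS listener would have to fire, plus a Hamming window as one example of the proposed window-type attribute.]

```javascript
// Hypothetical parameters for a JavaScriptFFTProcessorNode-style node:
// stepSize = fftSize / overlap; callbacks per second = sampleRate / stepSize.
function callbackRateHz(sampleRate, fftSize, overlap) {
  const stepSize = fftSize / overlap; // hop between successive windows, in samples
  return sampleRate / stepSize;       // how often the JS event listener must fire
}

// A Hamming window of the chosen FFT size (one of the window types mentioned).
function hammingWindow(n) {
  const w = new Float32Array(n);
  for (let i = 0; i < n; i++) {
    w[i] = 0.54 - 0.46 * Math.cos((2 * Math.PI * i) / (n - 1));
  }
  return w;
}

// At 44.1 kHz, a 2048-point FFT with 8x overlap means a 256-sample hop,
// i.e. the listener would need to run more than 170 times per second.
console.log(callbackRateHz(44100, 2048, 8)); // 172.265625
```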
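[In the spirit of the "examples coded in JavaScript" suggested above, here is a minimal, unoptimized sketch of an FFT and IFFT done directly in JS, so the transform cost could be timed against the per-frame processing. It is a plain recursive radix-2 Cooley-Tukey transform with real and imaginary parts in separate arrays; it is an illustration for measurement, not a proposed API.]

```javascript
// Recursive radix-2 FFT; `re`/`im` lengths must be a power of two.
function fft(re, im, inverse = false) {
  const n = re.length;
  if (n === 1) return [re.slice(), im.slice()];
  const sign = inverse ? 1 : -1; // e^{-i...} forward, e^{+i...} inverse
  const er = [], ei = [], or_ = [], oi = [];
  for (let i = 0; i < n; i += 2) {
    er.push(re[i]);     ei.push(im[i]);     // even-indexed samples
    or_.push(re[i + 1]); oi.push(im[i + 1]); // odd-indexed samples
  }
  const [Er, Ei] = fft(er, ei, inverse);
  const [Or, Oi] = fft(or_, oi, inverse);
  const outRe = new Array(n), outIm = new Array(n);
  for (let k = 0; k < n / 2; k++) {
    const a = (sign * 2 * Math.PI * k) / n; // twiddle angle
    const tr = Math.cos(a) * Or[k] - Math.sin(a) * Oi[k];
    const ti = Math.cos(a) * Oi[k] + Math.sin(a) * Or[k];
    outRe[k] = Er[k] + tr;         outIm[k] = Ei[k] + ti;
    outRe[k + n / 2] = Er[k] - tr; outIm[k + n / 2] = Ei[k] - ti;
  }
  return [outRe, outIm];
}

// Inverse transform with the conventional 1/n normalization.
function ifft(re, im) {
  const n = re.length;
  const [r, i] = fft(re, im, true);
  return [r.map((x) => x / n), i.map((x) => x / n)];
}
```

A round trip (`ifft(...fft(frame))`) with a no-op processing step in between would give a baseline for how much of a frame's budget the transforms alone consume in pure JS.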
Received on Monday, 18 October 2010 19:41:06 UTC