Audio-ISSUE-74 (RealtimeAnalyserNode): RealtimeAnalyserNode design [Web Audio API]

http://www.w3.org/2011/audio/track/issues/74

Raised by: Philip Jägenstedt
On product: Web Audio API

The RealtimeAnalyserNode has enough problems that we are collecting them in a single issue.

The use case appears to be probing/polling the signal for visualization, where it does not matter whether every part of the signal is seen. The node also doesn't modify the output, so, unlike JavaScriptAudioNode, it need not delay the processing at all.

The problems identified with the current spec are:

1. It is undefined how multi-channel input maps to time/frequency data, which are both single arrays.

2. The layout/order of the frequency bins is undefined. Are the negative frequencies included?

3. It is undefined what happens if the array passed to the getters has more elements than frequencyBinCount.

4. smoothingTimeConstant is defined only as "A value from 0 -> 1 where 0 represents no time averaging with the last analysis frame." How does it affect time/frequency domain data? What is an analysis frame and which is the last?

5. If frequencyBinCount == fftSize / 2, why is it exposed at all?

6. minDecibels/maxDecibels are undefined. Do minDecibels/maxDecibels control the output of getByteFrequencyData, or do they describe it? Are the parameters only used for getByteFrequencyData()? If so, why are they not arguments to that method? How does the Uint8 range (0-255) map to the decibel range (minDecibels-maxDecibels)?

7. What happens if fftSize is set to something that is not a power of two? Are there any limits? Are 1 and 2^32 both valid values?

8. If the fftSize is initially set to 1, then changed to 2^32, what should the getters do? Leaving this unrestricted would require unbounded buffering to handle an arbitrarily large fftSize.

9. It's not at all clear why there are Uint8Array getters, instead of simply a frequency domain and time domain getter, both as Float32Array.
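
To make questions 4 and 6 concrete, here is one plausible reading of each, written as a TypeScript sketch. Both are assumptions for illustration and not something the current spec text states: smoothing as a per-bin weighted average of the previous and current analysis frames, and the byte getter as a clamped linear map from [minDecibels, maxDecibels] onto 0-255. The function names are hypothetical.

```typescript
// ASSUMPTION: smoothingTimeConstant blends the previous smoothed frame with
// the current frame, bin by bin. A value of 0 means "no averaging".
function smoothFrame(
  previous: Float32Array,
  current: Float32Array,
  smoothingTimeConstant: number
): Float32Array {
  if (previous.length !== current.length) {
    throw new RangeError("frames must have the same length");
  }
  const t = smoothingTimeConstant;
  const out = new Float32Array(current.length);
  for (let k = 0; k < current.length; k++) {
    out[k] = t * previous[k] + (1 - t) * current[k];
  }
  return out;
}

// ASSUMPTION: minDecibels maps to 0, maxDecibels maps to 255, values in
// between scale linearly, and anything outside the range is clamped.
function decibelsToByte(
  db: number,
  minDecibels: number,
  maxDecibels: number
): number {
  if (!(maxDecibels > minDecibels)) {
    throw new RangeError("maxDecibels must be greater than minDecibels");
  }
  const scaled = (255 * (db - minDecibels)) / (maxDecibels - minDecibels);
  return Math.min(255, Math.max(0, Math.floor(scaled)));
}
```

Under this reading the decibel range controls the output of the byte getter rather than describing it, which is exactly the kind of detail the spec would need to pin down.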

For the use case we're aware of, this can be simplified greatly. We'd prefer an interface that just allows probing the most recent time domain data as an AudioBuffer and leaves it up to the Web developer to perform the FFT by other means. A fast, generic FFT function can be very useful not only for visualization, but also for synthesis, filters, etc. In the absence of a native FFT implementation (which could be part of another specification - perhaps add it to the Math object), a custom JavaScript FFT implementation will most likely suffice for most applications.
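
To give a sense of how small such a script-side FFT can be, here is a minimal radix-2 sketch in TypeScript. It is illustrative only and not tuned for speed; the names fft and magnitudes are hypothetical, and the input length is assumed to be a power of two.

```typescript
// In-place radix-2 Cooley-Tukey FFT. re/im hold the real and imaginary
// parts and are overwritten with the transform. Length must be a power of two.
function fft(re: Float32Array, im: Float32Array): void {
  const n = re.length;
  if (n !== im.length || n < 1 || (n & (n - 1)) !== 0) {
    throw new RangeError("length must be a power of two");
  }
  // Bit-reversal permutation so the butterflies can run in place.
  for (let i = 1, j = 0; i < n; i++) {
    let bit = n >> 1;
    for (; j & bit; bit >>= 1) j ^= bit;
    j ^= bit;
    if (i < j) {
      const tr = re[i]; re[i] = re[j]; re[j] = tr;
      const ti = im[i]; im[i] = im[j]; im[j] = ti;
    }
  }
  // Butterfly passes over blocks of size 2, 4, ..., n.
  for (let len = 2; len <= n; len <<= 1) {
    const ang = (-2 * Math.PI) / len;
    const half = len >> 1;
    for (let start = 0; start < n; start += len) {
      for (let k = 0; k < half; k++) {
        const wr = Math.cos(ang * k);
        const wi = Math.sin(ang * k);
        const a = start + k;
        const b = a + half;
        const tr = re[b] * wr - im[b] * wi;
        const ti = re[b] * wi + im[b] * wr;
        re[b] = re[a] - tr;
        im[b] = im[a] - ti;
        re[a] += tr;
        im[a] += ti;
      }
    }
  }
}

// Magnitudes of the first n/2 bins (DC up to just below Nyquist) for a
// real-valued block of time domain samples. The input is not modified.
function magnitudes(samples: Float32Array): Float32Array {
  const re = Float32Array.from(samples);
  const im = new Float32Array(samples.length);
  fft(re, im);
  const out = new Float32Array(samples.length >> 1);
  for (let k = 0; k < out.length; k++) {
    out[k] = Math.hypot(re[k], im[k]);
  }
  return out;
}
```

For real input the upper half of the transform mirrors the lower half (the negative frequencies), which is why magnitudes returns only n/2 bins. With the transform in script, that choice of layout is made explicitly by the developer.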

For example:

    interface AudioProbe : AudioNode {
        // get the most recent data available.
        AudioBuffer getData();
    }

    // in AudioContext, the size must be given up-front and cannot change
    AudioProbe createAudioProbe(in unsigned long bufferSize);

Depending on how https://www.w3.org/2011/audio/track/issues/28 is resolved, we could simply have an attribute "AudioBuffer data" that is guaranteed to be stable while the script is executing, to avoid the use of a getter function altogether.
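
As a rough sketch of how a page might poll such a probe for a level meter, consider the TypeScript below. AudioProbe does not exist anywhere; the ProbeData shape (a small subset of AudioBuffer), the peakLevel helper and the fakeProbe stand-in are all hypothetical names introduced for illustration, and only getData() comes from the proposal above.

```typescript
// Hypothetical subset of AudioBuffer that the sketch relies on.
interface ProbeData {
  readonly numberOfChannels: number;
  getChannelData(channel: number): Float32Array;
}

// Hypothetical script-side view of the proposed AudioProbe node.
interface AudioProbe {
  // Get the most recent data available.
  getData(): ProbeData;
}

// Peak absolute sample value across all channels of the latest block.
// A visualization could call this once per animation frame.
function peakLevel(probe: AudioProbe): number {
  const data = probe.getData();
  let peak = 0;
  for (let c = 0; c < data.numberOfChannels; c++) {
    const samples = data.getChannelData(c);
    for (let i = 0; i < samples.length; i++) {
      peak = Math.max(peak, Math.abs(samples[i]));
    }
  }
  return peak;
}

// Stand-in probe backed by fixed arrays, so the sketch can run without
// an audio graph.
function fakeProbe(channels: Float32Array[]): AudioProbe {
  return {
    getData: (): ProbeData => ({
      numberOfChannels: channels.length,
      getChannelData: (c: number): Float32Array => channels[c],
    }),
  };
}
```

Because the data arrives per channel, the multi-channel question from point 1 is left to the script: it can mix down, pick one channel, or draw each separately.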

Received on Wednesday, 16 May 2012 11:57:45 UTC