Re: Proposal for fixing race conditions from Jer Noble on 2013-06-26 (public-audio@w3.org from April to June 2013)

From: Jer Noble <jer.noble@apple.com>
Date: Wed, 26 Jun 2013 13:16:07 -0700
To: public-audio@w3.org
Cc: Robert O'Callahan <robert@ocallahan.org>, Chris Rogers <crogers@google.com>
Message-id: <D0DB7B74-2CED-4752-8D90-D4D38B1115C0@apple.com>
On Jun 24, 2013, at 5:40 PM, Robert O'Callahan <robert@ocallahan.org> wrote:

> So, I propose sticking with #1 for this method and enable authors to avoid performance issues by introducing a new method:
> setValueCurveAtTime(AudioBuffer values, double startTime, double duration)
> This would use the sample data in the first channel as the curve values. This would let you use the same data repeatedly without copying.
> 
>  
> 3) WaveShaperNode.curve
> I propose copying the data when the attribute is assigned to.
> 
> Similar to the AudioParam curves, it would also be inefficient to copy here, although the problem isn't quite as bad (compared with the grain envelope case).
> 
> I would address the performance issues here in a similar way as above: allow "curve" to be a "Float32Array or AudioBuffer”.

I'm okay with both of these.  However:

> 4) AudioBuffer.getChannelData(unsigned long channel)
> This is the tricky one. Proposal:
> Define a spec-level operation "acquire AudioBuffer contents" which delivers the current contents of the AudioBuffer to whatever operation needs them, replaces the AudioBuffer's internal Float32Arrays with new Float32Array objects containing copies of the data, and neuters the previous Float32Arrays.
> [IMPORTANT: Implementations can and should optimize this operation so that a) multiple "acquire contents" operations on the same AudioBuffer (with no intervening calls to getChannelData) return the same shared data; b) replacing the internal Float32Arrays with new Float32Arrays happens lazily at the next getChannelData (if any); and thus c) no data copying actually happens during an "acquire contents" operation. Let me know if this is unclear; it's terrifically important.]
> Then:
> -- Every assignment to AudioBufferSourceNode.buffer "acquires the contents" of that buffer and the result is what gets used by the AudioBufferSourceNode.
> 
> One of the most important features of the Web Audio API is to be able to efficiently trigger many overlapping short sounds.  This is very common in games, interactive applications, and musical applications.  In many cases, these individual "sound instances" are based on the same underlying audio sample data.  So it's very important to not require copying PCM data.  These PCM buffers can also be quite large (multiple megabytes), and portions of a larger buffer can be scheduled as "sound grains" (or "audio sprites").  So I can't agree to requiring any kind of copying here.
> 
> Please re-read what I wrote above. The "acquires the contents" operation does not need to actually copy. In Gecko it does not copy.
> 
> Let me try in explain what we do in more detail. In Gecko, an AudioBuffer is in one of two internal states:
> 1) Its data is stored in JS Float32Arrays, one per channel.
> 2) Its data is stored in immutable data buffers, one per channel.
> An AudioBuffer starts in state 1.
> When the "acquire AudioBuffer contents" operation happens on an AudioBuffer in state 1, we change the AudioBuffer to state 2. This detaches the data buffers from the Float32Arrays (neutering those Float32Arrays), attaches them into a refcounted, thread-shareable object which is the new state of the AudioBuffer, and the refcounted object is passed to the audio thread to be used there.
> When the "acquire AudioBuffer contents" operation happens on an AudioBuffer in state 2, we just addref the refcounted object and pass it to the audio thread.
> When getChannelData() is called on an AudioBuffer in state 1, there's nothing to do, we just return the internal Float32Array.
> When getChannelData() is called on an AudioBuffer in state 2, we change the AudioBuffer to state 1. We create new Float32Arrays and copy from state 2's immutable data buffers into the Float32Arrays. This is the only case where data copying is required --- when an application calls getChannelData() *after* the AudioBuffer has already been used by some AudioNode. (If getChannelData() is called after all AudioNodes have stopped using the AudioBuffer, we could still avoid copying in that case, with a bit more work.)

Any option which will cause ArrayBufferViews to be neutered will cause performance regressions in unrelated ArrayBufferView code paths.  Neutering (ala workers) destroys some of the optimizations which our JavaScript interpreter can make for regular buffer access: once any ArrayBuffer is neutered, the interpreter must begin to perform neuter checks before any accessing any ArrayBuffer.  Given that the primary clients of the WebAudio API will be browser games, and given that game engines will (through asm.js or otherwise) make heavy use of ArrayBufferViews, this could be a very significant performance regression. So I would not support any spec language which would require neutering during the normal course of playback.  Apart from the performance implications, this behavior is also extremely complicated and would be an ongoing drag on engineering resources.

That said, I'd like to make an alternative proposal that will have acceptable performance costs, simplify implementations, and be understandable to web developers.  It all follows from one change:

* AudioBuffers are immutable.

Once data has been added to an AudioBuffer, it can no longer be modified.  Data passed into the AudioBuffer will be copied internally.  getChannelData() will return a copy of the AudioBuffer's contents.  Currently, developers are filling AudioBuffers with synthesized audio by calling getChannelData().set(), so we will need to add a AudioContext.createBuffer() which takes sampleRate and numberOfChannels arguments. (I believe such a change was being considered anyway.)

This implies some additional memory costs.  Authors who inspect the contents of AudioBuffers will see increased memory usage.  Implementors may be able to mitigate those costs through clever copy-on-write tricks such that only when the copied contents are modified do they increase memory usage, but that is an implementation detail.

The spec language is clear and easy: "Modifications to the Float32Array argument will not effect the ArrayBuffer after it is created." and "Modifications to the Float32Array returned from getChannelData() will not effect the ArrayBuffer."  You could also explicitly state that the data is copied in each case, but leaving the details up to the implementer may allow them to do memory optimizations that achieve the same desired outcome without actually copying data.

-Jer
Received on Wednesday, 26 June 2013 20:16:41 UTC