- From: Jussi Kalliokoski <jussi.kalliokoski@gmail.com>
- Date: Sat, 27 Jul 2013 17:29:33 +0300
- To: Chris Wilson <cwilso@google.com>
- Cc: Marcus Geelnard <mage@opera.com>, Ehsan Akhgari <ehsan.akhgari@gmail.com>, "Robert O'Callahan" <robert@ocallahan.org>, Jer Noble <jer.noble@apple.com>, Russell McClellan <russell@motu.com>, WG <public-audio@w3.org>
- Message-ID: <CAJhzemVMOkddf0pbN0L9Bnxt46PkysKK+-y4h4RKLk70tzGh9g@mail.gmail.com>
On Fri, Jul 26, 2013 at 7:23 PM, Chris Wilson <cwilso@google.com> wrote:

> On Wed, Jul 24, 2013 at 4:06 AM, Jussi Kalliokoski <jussi.kalliokoski@gmail.com> wrote:
>
>> On Tue, Jul 23, 2013 at 11:10 PM, Chris Wilson <cwilso@google.com> wrote:
>>
>>> OK. I want to load an audio file, perform some custom analysis on it (e.g. determine average volume), perform some custom (offline) processing on the buffer based on that analysis (e.g. soft limiting), and then play the resulting buffer.
>>
>> This is a symptom of another problem with the API. In this scenario your biggest problem with the API is far from the copy happening here; instead it is that the method for decoding audio has the wrong input and output for most cases. What decodeAudioData currently assumes is that you have a binary buffer containing the encoded audio data and you want a high-level construct representing the audio data (an AudioBuffer) out of it. Your case (and a common case anyway), however, is that you have a URL to an audio resource and you want a list of Float32Arrays out. Why does decodeAudioData (async) return an AudioBuffer in the first place?
>
> Hmm. Well, the channels being handled as an array of Float32Arrays would be less structurally obvious. Since it is resampled to the AudioContext rate anyway, the metadata is less interesting - although I'd ideally like our decoding to have more metadata expressing the internals, rather than less (e.g. original sample rate, any tempo tags, etc.).

Let's keep our use cases in check here. decodeAudioData is completely unsuitable for any scenario where you'd do anything with the metadata. In all of the scenarios I can think of where you'd want to use the metadata, you'd also want a much lower-level construct, e.g. a streaming decoder.
decodeAudioData is designed for one-shot samples, convolution kernels and oscillator waveforms - but then again, for these use cases it's too low-level a construct (because you have to do the XHR manually). If you're worried about performance and memory usage, decodeAudioData is usable only for buffers a few seconds in length, and resources like that hardly ever have any relevant metadata in them.

Admittedly, returning a Float32Array from the decode operation isn't ideal for the use cases where you, for example, want to assign the data to an array buffer, but it allows the use case of modifying the data before creating an AudioBuffer out of it, without the cost of a memcpy and double memory.

On another note, I'd like to point out that the discussion has advanced to the point where we have provided suggestions that for most cases rival - and in some situations potentially outperform - the current designs, both in terms of CPU and memory usage, without imposing race conditions; yet you still continue to take the stance that we should just define extensions to the other parts of the web platform just to allow racy things in our API. IMO a lot of the proposals would improve the usability of the API as well - for example, Marcus' suggestion to use arrays of Float32Arrays directly for the AudioProcessingEvent.

I can understand if your argument is backwards compatibility, but that's not the argument you seem to be making.

Cheers,
Jussi
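[Editor's note: for concreteness, Chris's use case above (average-volume analysis followed by soft limiting) can be sketched directly on Float32Array channel data, as it would look if the decode operation returned a list of Float32Arrays. This is a minimal illustrative sketch, not part of any proposed API; the function names (averageVolume, softLimit, processChannels) and the threshold heuristic are invented for illustration.]

```javascript
// Hedged sketch: analysis + in-place processing on raw Float32Array
// channels, with no AudioBuffer round-trip (and hence no extra memcpy).
// All names here are illustrative, not from the Web Audio API.

function averageVolume(channel) {
  // Mean absolute sample value across the channel.
  let sum = 0;
  for (let i = 0; i < channel.length; i++) {
    sum += Math.abs(channel[i]);
  }
  return sum / channel.length;
}

function softLimit(channel, threshold) {
  // In-place tanh-style soft limiter: quiet samples pass through nearly
  // unchanged, louder samples are squashed so output stays within
  // (-threshold, threshold).
  for (let i = 0; i < channel.length; i++) {
    channel[i] = threshold * Math.tanh(channel[i] / threshold);
  }
  return channel;
}

function processChannels(channels) {
  // Process each decoded channel in place; the result could then be
  // wrapped in an AudioBuffer for playback without copying the data.
  for (const channel of channels) {
    const avg = averageVolume(channel);
    const threshold = Math.min(1, avg * 4); // illustrative heuristic
    softLimit(channel, threshold);
  }
  return channels;
}
```

Because the channels are modified in place before any AudioBuffer is constructed, this is exactly the "modify the data before creating an AudioBuffer out of it" pattern described above.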
Received on Saturday, 27 July 2013 14:30:00 UTC