- From: Marcus Geelnard <mage@opera.com>
- Date: Tue, 23 Jul 2013 23:52:59 +0200
- To: Chris Wilson <cwilso@google.com>
- Cc: Ehsan Akhgari <ehsan.akhgari@gmail.com>, "Robert O'Callahan" <robert@ocallahan.org>, Jer Noble <jer.noble@apple.com>, Russell McClellan <russell@motu.com>, WG <public-audio@w3.org>
- Message-ID: <CAL8YEv6w3SQaowdfc=4rrpPBHha-SqqUUnpo-u+Z4TmNtCB+aA@mail.gmail.com>
On Tue, Jul 23, 2013 at 10:10 PM, Chris Wilson <cwilso@google.com> wrote:

> On Tue, Jul 23, 2013 at 11:00 AM, Marcus Geelnard <mage@opera.com> wrote:
>
>> If you're talking about pre-rendering sound into an AudioBuffer (in a way
>> that can't be done using an OfflineAudioContext), I doubt that memcpy
>> will do much harm. Again (if this is the case), could you please provide
>> an example?
>
> OK. I want to load an audio file, perform some custom analysis on it
> (e.g. determine average volume), perform some custom (offline) processing
> on the buffer based on that analysis (e.g. soft limiting), and then play
> the resulting buffer.
>
> If I understand it, under ROC's original proposal this would result in
> the entire buffer being copied one extra time (other than the initial
> AudioBuffer creation by decodeAudioData); under Jer's recent proposal I
> would have to copy it twice. "I doubt that memcpy will do much harm" is a
> bit of an odd statement in favor of - as you yourself said, I don't think
> that "it's usually not a problem" is a strong enough argument. I don't
> see the inherent raciness as a shortcoming we have to paper over; this
> isn't a design flaw, it's a memory-efficient design. The audio system
> should have efficient access to audio buffers, and it needs to function
> in a decoupled way in order to provide glitch-free audio when at all
> possible.

Ok, so here's my view of it: for audio processing there are very few situations where memcpy is a performance bottleneck.

1) The speed / time issue (I think we agree here already, but here are some raw numbers anyway if anyone is still in doubt).

I did a quick test on my desktop computer, and I can memcpy 60 seconds of 48 ksamples/s stereo audio (float32) in 1 ms. If memory serves me right, my Tegra 2 phone will take about 30 times that, so less than 50 ms in any event. I strongly doubt that it would even be noticeable for your use case.

For reference, a simple normalization loop (i.e. find the max amplitude + scale all samples with 1 / maxAmplitude) over 60 seconds of sound takes >500 ms on my desktop computer in Chromium. In other words, 2x memcpy of the buffer would amount to < 0.4% of the total processing time, and that's for a very trivial operation (for more complex processing, the memcpy would be even less noticeable). There is a small sketch of what I measured further down.

2) The doubling of the memory issue.

I'm trying to come up with a worst case scenario here, but I think the most reasonable situation is that you do something like this (a sketch of the whole sequence follows below):

a) Load & decode sound into an audio buffer (1x memory).
b) Copy the audio buffer into float32 typed arrays (2x memory).
c) Process the data (still 2x memory).
d) Copy the arrays back to the audio buffer (still 2x memory).
e) Drop the reference to the typed arrays -> GC (back to 1x memory).

...and then you repeat this process for every sound you wish to process. In other words, you'll likely have at most 1x memory plus the memory of the last/current buffer being processed, which should amount to *less* than 2x the total memory used for audio buffers. In "steady state" (after processing is done), you're back to 1x the memory, so it's a temporary memory peak.

True, this *might* be a problem, but if you're creating an app that even comes close to using 50% of your available memory for audio buffers alone, I'd say you're in deep trouble anyway (I think it will be very hard to make such an app work cross-platform).
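For anyone who wants raw numbers of their own, here is a minimal sketch of the two operations behind the figures in 1): a plain typed-array copy versus the normalization loop. It is just an illustration of what is being measured, not the exact harness I used, and the timings will of course vary between machines:

```js
// 60 seconds of 48 kHz stereo float32 audio.
var samples = 60 * 48000 * 2;
var src = new Float32Array(samples);
var dst = new Float32Array(samples);

// The "memcpy" case: a flat typed-array copy.
var t0 = performance.now();
dst.set(src);
var t1 = performance.now();

// The normalization loop: find the max amplitude, then scale all samples
// with 1 / maxAmplitude.
var max = 0;
for (var i = 0; i < samples; i++) {
  var a = Math.abs(src[i]);
  if (a > max) max = a;
}
var gain = max > 0 ? 1 / max : 1;
for (var j = 0; j < samples; j++) {
  dst[j] = src[j] * gain;
}
var t2 = performance.now();

console.log('copy: ' + (t1 - t0).toFixed(2) + ' ms, ' +
            'normalize: ' + (t2 - t1).toFixed(2) + ' ms');
```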
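And here is a rough sketch of the a)-e) sequence from 2), written against decodeAudioData / getChannelData / createBuffer as they exist today. Where exactly the extra copies end up depends on which proposal you assume, so read it as an illustration of the temporary memory peak rather than of any particular API; the normalize() helper is just the trivial processing step from 1):

```js
var ctx = new AudioContext();

// Trivial processing: find max amplitude, scale with 1 / maxAmplitude.
function normalize(data) {
  var max = 0;
  for (var i = 0; i < data.length; i++) {
    var a = Math.abs(data[i]);
    if (a > max) max = a;
  }
  if (max > 0) {
    var gain = 1 / max;
    for (var j = 0; j < data.length; j++) {
      data[j] *= gain;
    }
  }
}

function processSound(encoded, done) {
  // a) Load & decode into an AudioBuffer (1x memory).
  ctx.decodeAudioData(encoded, function (decoded) {
    var out = ctx.createBuffer(decoded.numberOfChannels, decoded.length,
                               decoded.sampleRate);
    for (var ch = 0; ch < decoded.numberOfChannels; ch++) {
      // b) Copy the channel into a separate float32 typed array (2x memory).
      var work = new Float32Array(decoded.getChannelData(ch));
      // c) Process the copy (still 2x memory).
      normalize(work);
      // d) Copy the result back into an AudioBuffer (still 2x memory).
      out.getChannelData(ch).set(work);
      // e) The reference to `work` is dropped -> GC (back to 1x memory).
    }
    done(out);
  });
}
```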
In fact, here's another thought: with Jer's proposal, an implementation is no longer forced to use flat Float32 arrays internally, meaning that it would be possible for an implementation to use various forms of compression techniques for the AudioBuffers. For instance, on a low-memory device you could use 16-bit integers instead of 32-bit floats for the audio buffers, which would *save* memory compared to the current design (which prevents these kinds of memory optimizations). There's a rough sketch of what I mean in the P.S. below.

I'm not saying that the latter is the way to go, but my experience with development for mobile devices is that it's quite nice to have the option to sacrifice quality for performance at times. E.g. using 16-bit graphics instead of 32-bit graphics can be a valid choice if it gives you 2x the rendering performance and saves you 50% of the memory, especially on a device that can't display all 16M colors anyway. The same could go for audio.

Oh, and with regards to the "it's usually not a problem" analogy - point taken. However, from my point of view we're talking about a few corner cases where we'll see a slight (temporary) increase in memory consumption (in the normal use case, there'll be no difference) vs breaking the way the Web works (as has been explained in various ways by different people on this list). I guess this is what is dividing the group into two camps right now. Personally, I'd much rather go with the potential memory increase than breaking typed arrays (i.e. I'm in the "let's keep the Web webby" camp).

/Marcus
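P.S. To make the 16-bit idea concrete, here is a rough sketch of the kind of conversion an implementation could do internally; the packChannel / unpackChannel helpers are purely hypothetical and nothing in any of the proposals requires this.

```js
// Hypothetical internal storage: keep a channel as 16-bit integers
// (2 bytes/sample instead of 4), expanding back to float32 only when the
// data is actually needed.
function packChannel(samples) {
  var packed = new Int16Array(samples.length);
  for (var i = 0; i < samples.length; i++) {
    var s = Math.max(-1, Math.min(1, samples[i]));
    packed[i] = Math.round(s * 32767);
  }
  return packed;
}

function unpackChannel(packed) {
  var samples = new Float32Array(packed.length);
  for (var i = 0; i < packed.length; i++) {
    samples[i] = packed[i] / 32767;
  }
  return samples;
}
```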
Received on Tuesday, 23 July 2013 21:53:28 UTC