- From: Marcus Geelnard <mage@opera.com>
- Date: Sat, 18 Jan 2014 12:29:06 +0100
- To: Raymond Toy <rtoy@google.com>
- Cc: Chris Wilson <cwilso@google.com>, Katelyn Gadd <kg@luminance.org>, "public-audio@w3.org" <public-audio@w3.org>
Hi Ray,

2014/1/17 Raymond Toy <rtoy@google.com>:
> Thanks Marcus and Chris. I now see what you're getting at. Yes, this is an
> option. Somehow, though, I think that if you need low memory, you also have
> low CPU, so you lose no matter what. :-)
>

Exactly, which is why I think it may be necessary to lessen the audio
quality requirements in case we don't resample, in order to enable low CPU
loads. As long as we spec that behavior, I think it would be fine - as long
as the user is given the option to select whether to go with the high
quality or the low memory code path.

> I think Blink's interpolator for an AudioBufferSourceNode is just a linear
> interpolator. There will be some artifacts if you have to upsample (or
> downsample) too much. This will have to be fixed if we go this way.
>

If we skip resampling entirely, for all samples, yes, we'd have to require
a higher quality interpolator, which would inevitably mean higher CPU loads.
Otherwise (i.e. if resampling could be done selectively), I think that it
would not be THAT much of a deal.

I believe that most consumer-level audio systems in use today don't do
offline resampling (instead they typically offer a selection between
linear, cubic or Nth-order sinc interpolation, for instance). If we're
talking sound effects in a game, I believe a fairly simple interpolator
would suffice (as long as the author is aware of the implications, and can
make an informed decision).

/Marcus

> Ray
>
>
> On Fri, Jan 17, 2014 at 2:42 PM, Marcus Geelnard <mage@opera.com> wrote:
>>
>> The AudioBufferSourceNode already has the capability to play back the
>> AudioBuffer at any sample rate - it would be transparent to the user.
>>
>> As far as I understand, the main reason for resampling up front is to
>> lessen the requirements on the reconstruction filter/interpolator: by doing
>> more work in decodeAudioData you can do less work in AudioBufferSourceNode,
>> and still achieve good quality (at least for the typical use cases when the
>> sample is played back at a pitch close to its original pitch).
>>
>> If we skip the resampling step, I think we have to choose between a more
>> costly interpolator (eats CPU cycles) or slightly reduced audio quality
>> (depending on what combination of resampler and interpolation algorithms are
>> used).
>>
>> One option here could be to let the user decide on a per sample basis
>> which matters the most: audio quality or memory footprint.
>>
>> /Marcus
>>
>>
>> On Friday, 17 January 2014, Chris Wilson <cwilso@google.com> wrote:
>>
>>> The goal would be to not resample and store at 48kHz, but still be able
>>> to play back with high quality in that case. As Marcus said, that would be
>>> harder. (Although is upsampling as costly as downsampling in this case?)
>>>
>>> On Jan 17, 2014 1:46 PM, "Raymond Toy" <rtoy@google.com> wrote:
>>>>
>>>> On Fri, Jan 17, 2014 at 1:31 PM, Marcus Geelnard <mage@opera.com> wrote:
>>>>>
>>>>> 2014/1/17, Chris Wilson <cwilso@google.com>:
>>>>> > On Fri, Jan 17, 2014 at 2:24 AM, Marcus Geelnard <mage@opera.com>
>>>>> > wrote:
>>>>> >
>>>>> >> So, when discussing Float32 vs Int16 etc, please keep in mind the
>>>>> >> use cases where an AudioBuffer is used for accessing and possibly
>>>>> >> also modifying audio data by using the getChannelData method on the
>>>>> >> AudioBuffer, such as:
>>>>> >>
>>>>> >> * ScriptProcessorNode / AudioProcessingEvent
>>>>> >>
>>>>> >
>>>>> > I believe there's already a suggestion on the table to replace
>>>>> > AudioBuffer there with Float32Array.
>>>>>
>>>>> I'm all for that.
>>>>> I think it would be natural to consider that option
>>>>> when specing the new worker-based script processor.
>>>>>
>>>>> >
>>>>> >> There has already been a suggestion brought forward by ROC (i.e.
>>>>> >> allow the use of Int16 internally), that should solve the most
>>>>> >> urgent memory issues.
>>>>> >> If that suggestion does not solve the problems at hand, please
>>>>> >> provide more information.
>>>>> >>
>>>>> >
>>>>> > +1. I'd still like to better understand the conversion impact.
>>>>> >
>>>>>
>>>>> If I can find the time I'll try and make some kind of benchmark of a
>>>>> simple int16->float32 format converter.
>>>>>
>>>>> > The open questions, to me, are 1) how does the data get EXPOSED then
>>>>> > (i.e. does getChannelData still return a float32array, and force
>>>>> > conversion),
>>>>>
>>>>> I would prefer to keep it as Float32, at least for now. I see little
>>>>> value in handing over integers to any kind of JS processing. The
>>>>> implication would probably be that if you use getChannelData, you'll
>>>>> force a conversion of the internal format to Float32.
>>>>>
>>>>> > 2) if it is exposed in int16 or similar, how far down that rabbit
>>>>> > hole do we go (int8, int24?, int32), and
>>>>>
>>>>> IMO the added value of such an addition would not justify the API
>>>>> complexity cost, plus it could easily be a slippery slope.
>>>>>
>>>>> > 3) I will point out again that the 2x bloat from converting int16
>>>>> > to float32 is potentially much less of a problem than the sample
>>>>> > rate resampling (loading a 22kHz sample into a 96kHz audio context
>>>>> > would cause a >4x bloat).
>>>>> >
>>>>>
>>>>> +1 It may be slightly trickier to drop the resampling step though,
>>>>> since it could come with a quality penalty. I suggest that we give
>>>>> that issue some attention.
>>>>>
>>>>> Do we want to make it possible to opt out from the automatic resampling
>>>>> step?
>>>>
>>>>
>>>> What would this mean? Say the audio context is 48 kHz and you have a
>>>> 22.05 kHz audio sample. So you don't want the sample to be resampled
>>>> automatically to 48 kHz? Then what happens to the audio when you connect to
>>>> a bunch of nodes?
>>>>
>>>> Ray
>>>>
>>>>>
>>>>>
>>>>> /Marcus
>>>>>
>>>>
>
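
The trade-off discussed in this thread (memory footprint versus
interpolation cost) can be illustrated with a minimal TypeScript sketch.
Everything in it is an assumption made for illustration: the function names
channelBytes, int16ToFloat32, readLinear and playAtContextRate are
hypothetical and are not part of the Web Audio API, and the interpolator is
a plain linear one, not a description of what Blink or any other engine
actually does.

```typescript
// Illustrative sketch only: all names are hypothetical, not Web Audio API
// or browser-engine code.

/** Bytes needed to store one channel holding `seconds` of audio. */
function channelBytes(seconds: number, sampleRate: number, bytesPerSample: number): number {
  return Math.round(seconds * sampleRate) * bytesPerSample;
}

/** Convert Int16 PCM samples to Float32 values in [-1, 1). */
function int16ToFloat32(src: Int16Array): Float32Array {
  const dst = new Float32Array(src.length);
  for (let i = 0; i < src.length; i++) {
    dst[i] = (src[i] ?? 0) / 32768;
  }
  return dst;
}

/** Read `src` at a fractional position using linear interpolation. */
function readLinear(src: Float32Array, position: number): number {
  const i = Math.floor(position);
  if (i < 0 || i >= src.length) {
    return 0; // outside the buffer: silence
  }
  const frac = position - i;
  const a = src[i] ?? 0;
  const b = i + 1 < src.length ? (src[i + 1] ?? 0) : 0; // treat the end as silence
  return a + (b - a) * frac;
}

/**
 * Render `frames` output samples at `contextRate` straight from a buffer
 * stored at `srcRate`, i.e. without resampling the buffer up front.
 */
function playAtContextRate(
  src: Float32Array,
  srcRate: number,
  contextRate: number,
  frames: number
): Float32Array {
  const out = new Float32Array(frames);
  const step = srcRate / contextRate; // source samples advanced per output frame
  for (let n = 0; n < frames; n++) {
    out[n] = readLinear(src, n * step);
  }
  return out;
}

// One second of mono audio, stored as-is versus resampled and widened.
const storedInt16 = channelBytes(1, 22050, 2);   // 44100 bytes
const resampledF32 = channelBytes(1, 96000, 4);  // 384000 bytes
```

With these numbers, keeping a 22.05 kHz sample as Int16 takes 44100 bytes per
second per channel, while resampling it to a 96 kHz context and storing it as
Float32 takes 384000 bytes. The price of the smaller buffer is that
interpolation such as readLinear has to run at playback time, and a linear
interpolator is the low-quality end of the options mentioned in the thread.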
Received on Saturday, 18 January 2014 11:29:34 UTC