Re: Integer PCM sample formats to Web Audio API?

On Tue, Jan 14, 2014 at 1:28 PM, K. Gadd <kg@luminance.org> wrote:

> On Fri, Jan 10, 2014 at 12:53 AM, Marcus Geelnard <mage@opera.com> wrote:
>
>> I agree with Chris here. In general, an implementation has a much better
>> chance of making an informed performance trade off decision than a Web
>> developer. In most situations I think that int16 would save memory, but add
>> some extra CPU overhead. In other words, if a client has loads and loads of
>> memory, it could for instance decide to take the memory hit and convert to
>> Float32 internally in order to lower the CPU load, either up front or
>> lazily.
>>
>> If a Web page was to make the trade off decision, it would have to know a
>> lot more about the device it's running on (such as CPU performance,
>> available memory, etc), and since that is not the case, I fear that Web
>> devs would resort to things like UA sniffing etc to make distinctions such
>> as "desktop" vs "fast-tablet" vs "slow-phone", etc.
>
>
> We're not talking about web developers that know nothing about
> performance, though. This is specifically a use case where developers need
> control over memory representation in order to achieve something
> approximating performance parity with native audio applications. int16
> samples aren't ever going to be the default (right???) so it's not as if
> the average web developer's going to shoot themselves in the foot without
> realizing it - if they opt into int16 in a way that harms them, that is
> unfortunate, but it doesn't mean that the actual use cases for int16 aren't
> justified.
>

You're trying to lump all developers into one bucket of knowledge.  Many
developers know what they are doing, and they could possibly make an
informed decision here.  I'd also point out that the parameters of that
informed decision might change on different hardware, or at a different
date - and they may well not have enough data to make an informed decision,
either.  ("How much does the user want to trade off RAM use for CPU
load?")  That's what Marcus was pointing out.  Additionally, history has
shown that it's quite possible to THINK you're doing the right thing, but
accidentally cause damage down the road.  (E.g. UA sniffing.)  All I was
advocating here is caution in providing low-level controls that flip the
memory/CPU tradeoffs, without carefully thinking through how developers at
all levels might use those controls.

If int16 buffers don't offer something approximating actual guarantees, you
> haven't fixed anything - that native port will still have to assume the
> worst (i.e. using 2x as much memory) and be rewritten to work with a tiny
> address space, making your int16 buffer optimization nearly meaningless -
> sure, the mixer might be slightly faster/slower and the process's resident
> memory use will be lower, but it won't enable any new use cases and certain
> ports will still be out of the question.
>

What's a "guarantee"?  Even if we mandated, with a MUST, that
implementations MUST use native 16-bit storage when requested,
implementations might choose not to do that as a performance/battery
optimization.  They wouldn't be conforming, but they would work.


> This is a slightly different issue, namely "What's the lowest quality I
>> can accept for this asset?". I can see a value in giving the Web dev the
>> possibility to indicate that a given asset can use a low quality internal
>> representation (the exact syntax for this has to be worked out, of course).
>> The situation is somewhat similar to how 3D APIs allow developers to use
>> compressed textures when a slight quality degradation can be accepted. For
>> audio, I think that sounds such as noisy or muddy sound effects could
>> definitely use a lower quality internal representation in many situations.
>> The same could go for emulators that mainly use 8/16-bit low-sample-rate
>> sounds.
>>
>
> 22khz isn't 'low quality internal representation'; if the signal is
> actually 22khz I don't know why you'd want to store it at higher
> resolution. Lots of actual signals are at frequencies other than 48khz for
> reasons like reproducing the sound of particular hardware or going for a
> certain effect.
>

I think that's what Marcus said.


> (Also, isn't the mixer sampling rate for web audio unspecified - i.e. it
> could be 48khz OR 44khz? given this, it makes sense to let users provide
> buffers at their actual sampling rate and be sure they will be stored that
> way.) The idea of handing Web Audio a 22khz buffer, the implementation
> upsampling it to 48khz, and then sampling it back down to 22khz for 22khz
> playback is... unfortunate.
>

The AudioContext's sampleRate is not mandated to a particular number, but
in practice the sampleRate is set to the audio output sample rate - that
is, the AudioDestinationNode's native rate - since that's where the clock
is coming from.  The point is that the entire audio context is run at a
single rate, to minimize resampling.

Having such an option in the API gives the implementation an opportunity to
>> save memory when memory is scarce, but it's not necessarily forced to do so.
>>
>
> The whole point is to force the implementation to save memory. An
> application that runs out of memory 80% of the time is not appreciably
> better than one that does so 100% of the time - end users will consider
> both unusable.
>

Given all the other factors that may change memory usage in the web
platform, I'm not sure why this one feature would solve that problem, or
even come close.  Again, I'm not saying I see no reason to look closely at
this; I'm just saying that I don't think this is as big a slam dunk as you
appear to, and I think there are notable situations when it is better NOT
to store that data in int16.


> On this whole subject it is important to realize that when talking about
> developers porting games and multimedia software from other native
> platforms, it is usually not wise to assume they are idiots that will shoot
> themselves in the foot.
>

That was not the intent, and I was certainly not making that assumption.
However, those aren't the only developers who would have this API
available - and I would venture some of them would choose to make this
decision without understanding how it may affect them on other devices or
browsers, now and in the future - mostly because that's pretty much
impossible to know.

Yes, developers make mistakes, and they ship broken software that relies on
> bugs in browser implementations - I can understand the reluctance to give
> developers more ways to make mistakes.
>

It's not "reluctance to give developers more ways to make mistakes" at
all.  It's "caution in exposing low-level platform implementation details
unless you are absolutely, positively certain it can be made a net win
overall."  Every low-level implementation detail that's exposed makes it
that much harder for the web platform to scale across devices, and puts
more onus on the developer to own that scalability; that calls for caution.


> In these scenarios, we have working applications that do interesting
> things on native platforms, and if you significantly undermine the Web
> platform's ability to deliver parity in these scenarios, you're not
> protecting native app developers from anything, all you're doing is keeping
> them off the Web and stuck in walled garden App Stores.
>

All I'm saying is "parity does not mean doing it the same way," and
pointing out that the Web platform is supposed to scale across different
hardware and devices better, I think, than previous platforms have done.

Again, I would point out that making a change that would allow developers
to force integer storage of buffers would have negative side effects, and
all I'm cautioning is that those should be carefully examined and weighed.
I would postulate that a set of developers would say "well of course, my
data is 16-bit 22kHz, of course I want to force the data to be stored that
way to save memory!" without considering that by doing so, they may be
burning battery life (a.k.a. CPU time).  That's not always the right
tradeoff.

Received on Tuesday, 14 January 2014 22:48:53 UTC