Re: Integer PCM sample formats to Web Audio API? from Chris Wilson on 2014-01-09 (public-audio@w3.org from January to March 2014)

From: Chris Wilson <cwilso@google.com>
Date: Thu, 9 Jan 2014 11:15:52 -0800
To: Jukka Jylänki <jujjyl@gmail.com>
Cc: Marcus Geelnard <mage@opera.com>, "public-audio@w3.org" <public-audio@w3.org>
Message-ID: <CAJK2wqU8U39qLtWacBi39aKXr20exKSubsUw33SQGzOA+Os8Dw@mail.gmail.com>
On Thu, Jan 9, 2014 at 4:27 AM, Jukka Jylänki <jujjyl@gmail.com> wrote:

> If there was an added AudioBuffer-from-int16 array form and the conversion
> to float was done under the hood on demand if needed, I would hope the
> following features to go along with it:
>
>   - Explicit performance guarantees in the spec for which function calls
> and situations will cause an expansion of the int16 array to float32.
>

That's easy.  The int16 array would definitely get expanded if you called
getChannelData(); it would also be expanded as it's played by the pipeline
(but obviously, not maintained that way, so you wouldn't see the memory
consumption of an entire buffer being expanded).
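
A minimal TypeScript sketch of the getChannelData() case, under stated assumptions: the class name LazyInt16Buffer, its shape, and the 1/32768 scale factor are all hypothetical and are not part of the Web Audio API or of any implementation. It keeps the samples as int16 and allocates a full float32 copy only the first time getChannelData() is called.

```typescript
// Hypothetical sketch: illustrates lazy int16 -> float32 expansion only.
// This is not how any browser is known to implement AudioBuffer.
class LazyInt16Buffer {
  private samples: Int16Array;
  private expanded: Float32Array | null = null;

  constructor(samples: Int16Array) {
    this.samples = samples;
  }

  // Full expansion: allocates one float32 per sample and keeps the copy,
  // so later calls return the same array.
  getChannelData(): Float32Array {
    if (this.expanded === null) {
      const out = new Float32Array(this.samples.length);
      for (let i = 0; i < this.samples.length; i++) {
        out[i] = this.samples[i] / 32768; // assumed scale factor
      }
      this.expanded = out;
    }
    return this.expanded;
  }
}
```

Until getChannelData() is called, such a buffer would cost two bytes per sample instead of four.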


>   - A way to query whether a given buffer is int16 or float32, to allow
> code to assert() that it is getting what it desires. Otherwise debugging
> can be a pain if the only way is for the programmer to measure this in
> indirect ways via looking at memory consumption in a system process manager.
>

I would NOT want to do this.  This should be under the hood, and some
systems (e.g. desktops with lots of memory) may choose to always expand.

>   - If the support for int16 is specced in to be optionally supportable (I
> hope this will not be the case!), a way to ask the implementation if it
> does support int16 format.
>

I don't think you'd need this.  It's going to end up being a performance
tradeoff - memory for CPU - that's likely better made by the platform
implementation than the developer.  If we did want this, I'd expect this is
a different decodeAudioData-style call.


>   - A function call that allows extracting the AudioBuffer data back as an
> int16 array.
>

Why? If we're going to do this, we would likely have to do other formats as
well.

I'll be blunt - the Web Audio API is a floating-point API.  I understand
the need to optimize by storing integer buffer data, particularly in
constrained memory situations (and most of mobile falls in this category
:), but I do not want to open the Pandora's box of trying to rework the API
to be an integer-based API.


> Weak unspecified performance guarantees with "you'll get the fast path if
> stars align" are a sign of a badly designed API (looking at you OpenGL).
>

If that were true, I would agree with you.  All we've been talking about
here is the ability to keep AudioBuffers internally as the original 16-bit
integer form. That's not a fast path - the signal path to play those
buffers is still going to be floating point, and the conversion is still
going to happen; in fact, it's going to happen more frequently if anything
(e.g. if you play a given buffer more than once).
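
A rough TypeScript sketch of why the conversion count goes up, under stated assumptions: playOnce, its parameters, and the 1/32768 scale factor are hypothetical names and values chosen for illustration, and the block size of 128 frames is likewise just an assumption for the example. Each playback converts the int16 samples block by block into a small reusable float32 scratch array, so memory stays small but every replay repeats the conversion.

```typescript
// Hypothetical sketch: per-block int16 -> float32 conversion during playback.
const BLOCK_SIZE = 128; // assumed block size for this example

// Converts `source` one block at a time into `scratch` and hands each
// converted block to `sink`. Returns the number of samples converted.
function playOnce(
  source: Int16Array,
  scratch: Float32Array,
  sink: (block: Float32Array) => void
): number {
  let conversions = 0;
  for (let start = 0; start < source.length; start += scratch.length) {
    const n = Math.min(scratch.length, source.length - start);
    for (let i = 0; i < n; i++) {
      scratch[i] = source[start + i] / 32768; // assumed scale factor
      conversions++;
    }
    sink(scratch.subarray(0, n)); // only this block is ever held as float32
  }
  return conversions;
}
```

Playing the same buffer twice with this sketch converts every sample twice, whereas a buffer stored as float32 would be converted once at load time.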


> I understand if compromises may need to be made in the case when the
> initial design was already done and only later it was realized that new
> features needed to be added in. In this case, I'd be willing to accept if
> strongly-typed AudioBuffers were too difficult to spec in anymore and that
> a conversion/lifting could happen on-demand if needed, but I'm not willing
> to accept a minimal "use this one newly added function call, and cross your
> fingers that you will get what you want" patchwork. I don't want to
> see a case where everything seems to initially work, and then three months
> after you add a seemingly unrelated functionality that does something
> "processing-like" (dynamically adjusting volume by distance to audio
> receiver, add 3D positional audio source support, etc.) and after that
> newly added call you _silently_ start having float32 buffers everywhere,
> but only on that certain OS and that certain browser with that single audio
> card that nobody else has.
>

ANY processing is going to convert the data as it is played.  But aside
from that, the original buffer data isn't really messed with unless the
user calls getChannelData on it.

> The feature needs to be well-defined enough that it's actually debuggable.
> It can't be an "all bets are off" type of thing, but there needs to be a
> way to confirm that you are getting what you expect, or if you cannot get
> what you expect, you need to be able to have a way to be explicitly aware
> of the case. Silent failure is not an option.
>

I'm not sure I agree with that.  Surely you'd rather have an implementation
play something and use extra memory, rather than just choke?


> Should I go ahead and add a bug about adding int16 support to
> https://github.com/WebAudio/web-audio-api/issues ? I understand that is
> the tracker used for spec bugs and feature requests?
>

Yes.
Received on Thursday, 9 January 2014 19:16:20 UTC