Re: Integer PCM sample formats to Web Audio API?

(Here's yet another too-long message to this thread, but I hope that it 
may clear some things up)

I feel that there are a few misunderstandings here. It seems to me that
we agree on principles (I think most of us are in favor of saving memory
when possible, for instance); it's merely a question of the technical
and formal details.

First, a short background (as I understand it) to why we have Float32 in 
the Web Audio API, and why it looks like it does:

I think that most would agree that the most important reason for having 
Float32 in the mixer and in the audio processing nodes is to make the 
graph behave correctly for complex configurations, and to guarantee high 
quality audio processing (alternate HDR sample formats such as 24-bit 
integers would just be silly on a Web platform).

Another reason for the Float32 internal format of audio samples becomes
obvious when you realize that the Web Audio API was originally designed
around the concept of mutable AudioBuffers, and AudioBuffers are used in
several places, not just for AudioBufferSourceNodes.

In fact, the original implementation shared the memory between the audio 
engine and the JS heap (I believe this is still the case in Blink 
today), meaning that it really *had* to use Float32 internally (since 
that's what's exposed in JS land). And if memory serves me correctly, 
one of the reasons for that design was to save memory, since a single 
copy of the audio buffer costs less than having both an internal copy 
and a copy on the JS heap.

I personally think this is an unfortunate design decision, but it's the
reality right now, and I don't think we'll want to depart too much from
that paradigm for the first version of the API. With Mozilla's
neutering-based solution for AudioBuffers, where the data is no longer
shared in the same way as it was before, we've started moving in the
right direction IMO, and perhaps we can drive the trend towards
immutable AudioBuffers with time, thus enabling more memory,
performance and quality optimizations (but that's mostly my personal
opinion).

So, when discussing Float32 vs Int16 etc, please keep in mind the use 
cases where an AudioBuffer is used for accessing and possibly also 
modifying audio data by using the getChannelData method on the 
AudioBuffer, such as:

* ScriptProcessorNode / AudioProcessingEvent
* createBuffer()
* OfflineAudioContext

ROC has already brought forward a suggestion (i.e. allow the use of
Int16 internally) that should solve the most urgent memory issues. If
that suggestion does not solve the problems at hand, please provide
more information.
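
To make the Int16 idea a bit more concrete, here is a minimal sketch of
one way a UA could keep decoded 16-bit data internally and only expand
it to Float32 once script asks for the samples. This is not how any
particular browser does it: `LazyChannel` and its members are made-up
names, and the 1/32768 scaling is just one common convention that I am
assuming here.

```typescript
// Hypothetical sketch only: LazyChannel is NOT part of the Web Audio API.
class LazyChannel {
  private int16: Int16Array | null;
  private float32: Float32Array | null = null;

  constructor(decoded: Int16Array) {
    this.int16 = decoded;
  }

  // Plays the role of AudioBuffer.getChannelData(): script only ever
  // sees Float32 samples, whatever the internal storage is.
  getChannelData(): Float32Array {
    if (this.float32 === null) {
      const src = this.int16 as Int16Array;
      const out = new Float32Array(src.length);
      for (let i = 0; i < src.length; i++) {
        out[i] = src[i] / 32768; // assumed scaling, maps to [-1, 1)
      }
      this.float32 = out;
      this.int16 = null; // script may now mutate the data, so drop the Int16 copy
    }
    return this.float32;
  }

  // Bytes currently held for this channel.
  get byteLength(): number {
    if (this.float32 !== null) {
      return this.float32.byteLength;
    }
    return (this.int16 as Int16Array).byteLength;
  }
}
```

As long as nobody calls getChannelData(), the channel costs two bytes
per sample; after the first call it costs four, and the returned array
stays the same object on later calls so that writes from script remain
visible.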

The debate now seems to revolve mostly around the level of control to
expose in the APIs, and there seems to be some fear that a UA might make
bad decisions about when to use Float32 internally or not.

Let me put it this way: If there is an opportunity to use Int16 
internally, without breaking spec conformance, and without hitting a 
tremendous performance penalty, the UA *will* use Int16 internally 
(saving memory, especially on mobile devices, is almost always a huge 
win for UAs).
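
For a sense of scale, the arithmetic for uncompressed PCM is simple.
The clip below (60 seconds, stereo, 44.1 kHz) is an arbitrary example
of mine, and `pcmBytes` is a hypothetical helper, not an API:

```typescript
// Size in bytes of an uncompressed PCM clip.
function pcmBytes(
  seconds: number,
  sampleRate: number,
  channels: number,
  bytesPerSample: number
): number {
  const frames = Math.round(seconds * sampleRate);
  return frames * channels * bytesPerSample;
}

// 60 s of stereo audio at 44.1 kHz (arbitrary example values).
const float32Bytes = pcmBytes(60, 44100, 2, Float32Array.BYTES_PER_ELEMENT);
const int16Bytes = pcmBytes(60, 44100, 2, Int16Array.BYTES_PER_ELEMENT);

console.log(float32Bytes); // 21168000 (about 21 MB)
console.log(int16Bytes);   // 10584000 (about 10.6 MB)
```

That is, Int16 storage halves the footprint of every buffer that
script never touches.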

If there are significant pros/cons in terms of performance or quality 
with either internal format (Int16 or Float32), I think a reasonable 
solution would be to add hints somewhere in the API that would instruct 
the UA to optimize for a certain use case, such as Quality, Memory or 
Speed (similar to the GL_FASTEST/GL_NICEST/GL_DONT_CARE hints [1] or the 
"usage" argument of glBufferData [2] in OpenGL). That should provide 
sufficient control for the Web developer, IMO.
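
Purely as an illustration of what such a hint could look like (nothing
like this exists in the Web Audio API, and all names and the decision
rules below are my own assumptions), a UA-side policy might be as small
as this:

```typescript
// Hypothetical hint values, loosely modelled on the OpenGL hints above.
type OptimizeHint = "quality" | "memory" | "speed";
type InternalFormat = "int16" | "float32";

// Hypothetical UA-side policy. The hint is advisory only: whenever
// script needs the Float32 view, Float32 wins regardless of the hint.
function chooseInternalFormat(
  hint: OptimizeHint,
  sourceIsInt16: boolean,
  scriptNeedsFloat32: boolean
): InternalFormat {
  if (scriptNeedsFloat32) {
    return "float32"; // must match what is exposed to JS
  }
  if (hint === "memory") {
    return "int16"; // accept quantization to save memory
  }
  if (hint === "speed") {
    return "float32"; // avoid converting on every render quantum
  }
  // "quality": Int16 is lossless only if the source was 16-bit already.
  return sourceIsInt16 ? "int16" : "float32";
}
```

The point of the sketch is that the hint never changes observable
behavior, only which trade-off the UA leans towards when it has a
choice.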

Still, it could be a bit tricky to get a hint interface right and
meaningful, so I would like to see clearly demonstrated cases (not just
hypothetical ones) before going down that route - it's quite possible
that it's superfluous. In the meantime, we should be able to get quite
far without it.

/Marcus


[1] http://www.opengl.org/sdk/docs/man/xhtml/glHint.xml
[2] http://www.opengl.org/sdk/docs/man/xhtml/glBufferData.xml

-- 
Marcus Geelnard
Technical Lead, Mobile Infrastructure
Opera Software

Received on Friday, 17 January 2014 10:24:58 UTC