
General Notes on Web Audio API from TPAC

From: Alistair MacDonald <al@signedon.com>
Date: Mon, 28 Nov 2011 14:39:59 -0500
Message-ID: <CAJX8r2=S24TosBmM7SVWhT+y+OAxQDax3855F_D7-JmrdKr2sA@mail.gmail.com>
To: public-audio@w3.org
Hi Group,

While working with the Web Audio API at TPAC I was taking some notes. I
sent these to Chris Rogers and we started discussing this a few weeks back.

Some of the technical discussion may be very useful / interesting to people
experimenting with the API, so I'm going to forward the thread to
public-audio here.

At the bottom of the email I have attached the original notes I sent to
Chris.

Comments/discussion always welcome,

Alistair

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Hi Al, thanks for taking the time to really look into these things at the
level of detail you have.

On Thu, Nov 3, 2011 at 11:17 PM, Alistair MacDonald <al@signedon.com> wrote:

> Thanks, I'll update the demos over the next few days.
>
> I took some notes on my way back as I was putting these together...
>
>
> =============================================
> =  W E B  -  A U D I O  -  A P I  -  NOTES  =
> =============================================
>
>
>
>
> F E A T U R E  -  R E Q U E S T S
> =================================
>
>
> BufferSources That Do Not Die
> -----------------------------
> It is a very unfamiliar paradigm for the bufferSourceNode to die and become
> unusable after it has played once. It seems like it would be better for the
> audioContext to be able to play a bufferSource more than once. For example:
> if I call:
>

I'd like to talk more with you to try to convince you that this is much
more complex than it initially appears, especially when we start getting
into concepts such as internal loop-points (loop start time and loop end
time internal to the buffer), which are required for sample-based
synthesizers (similar to SoundFonts or DLS synths), and FM synthesis where
the .playbackRate attribute is modulated by another audio source (or
automated parameter), and can even go negative.

Although it may appear counter-intuitive at first, the idea of overlapping
independent instances is actually widespread in audio applications and is
crucial for achieving the common effects people expect in a game engine
(OpenAL-style), and also in software synthesis applications.  Having a
single audio source re-trigger is not ideal since there's a high potential
for audible glitches (clicks) when playback is re-triggered on the same
source.  I'm very concerned about making an API where it is easy for people
to misinterpret these subtle points and achieve less than desirable audible
results by making API calls which appear on the surface to make sense.
I've essentially spent my whole career (>20 years) working on public APIs
used by other people for these types of applications, so I'm very tuned
into these kinds of details, and want to make sure that "note" instances are
instantiated correctly.

I think when we document this behavior more fully, and explain the
concepts, then it will be quite easy for people to get what they want.
Also, instead of silently failing, we can be more explicit and actually
throw an exception with text describing the failure and how to fix it.  I
agree that silently failing is confusing.
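
For readers following along, the pattern being described here (one short-lived
source per note, sharing a decoded buffer) can be sketched roughly as follows;
`playNote` is an illustrative helper name, not part of the API:

```javascript
// Illustrative helper (not part of the API): trigger one note by
// creating a fresh AudioBufferSourceNode per call, so overlapping
// notes never re-trigger (and click on) a single source.
function playNote(context, buffer, when) {
  var source = context.createBufferSource(); // new instance each time
  source.buffer = buffer;                    // the decoded data is shared
  source.connect(context.destination);
  source.noteOn(when);                       // schedule on the context clock
  return source;
}

// Two overlapping notes from the same buffer, one second apart:
//   playNote(context, buffer, 0);
//   playNote(context, buffer, 1);
```

The buffer itself is cheap to share; only the lightweight source node is
re-created per trigger.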

Early feedback I'm getting from the "New Game" conference:
http://www.newgameconf.com/

is that people are finding the API pretty easy to work with.  Somebody even
created an OpenAL translation layer on top of the Web Audio API :)


>    // I would expect to hear 2 sounds, firing 1 second apart.
>
>    source.noteOn(0);
>    source.noteOn(1);
>
>    // If there was an overlap I would expect the audioContext to
>    // a) copy, b) restart, or c) drop the audio source, depending on
>    // behavior specified as a parameter of the sourceNode in question.
>    //
>    // a) copy: the audioContext copies the note and both play overlapped
>    // b) restart: the audioContext rewinds the buffer pointer to its 0 index
>    // c) drop: the audioContext copies the note only while fewer than x
>    //          notes/voices are active, then drops further ones
>
>
> In a similar fashion I would expect the audio of a sound to be restarted if
> I did the following:
>
>   setTimeout(function(){
>     context.currentTime = 0;
>   }, 1000); // Restart the audio clock one second in
>

The .currentTime attribute is read-only and is a global clock.  By design,
it's not a playback position on a media timeline (like in ProTools) which
can be seeked and re-started at zero, but a higher-level concept
(building-block) which can be used to build such timeline clocks.  Once
again, it may seem strange at first, but once I show you some code
snippets you'll quickly see how to build the kind of resettable clock
value you describe above.  This idea of a global clock is a crucial
feature which is available (and necessary) at the OS level for use by
desktop audio/video applications like ProTools, Final Cut Pro, Logic
Audio, etc., so we need to have this on the web platform too.
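
As a rough illustration of the kind of snippet mentioned above, a resettable
timeline clock can be layered on the read-only global clock. All names here
are illustrative, and the clock source is passed in as a function (in a page
it would be `function () { return context.currentTime; }`):

```javascript
// A resettable transport clock built on top of a monotonically
// increasing, read-only clock such as context.currentTime.
function TransportClock(getNow) {
  this._getNow = getNow;    // underlying monotonic clock, in seconds
  this._origin = getNow();  // timeline zero
}

// Seconds elapsed on this timeline since the last reset.
TransportClock.prototype.now = function () {
  return this._getNow() - this._origin;
};

// "Seek to zero": move the timeline origin to the current moment.
TransportClock.prototype.reset = function () {
  this._origin = this._getNow();
};
```

Multiple independent timelines can share one global clock this way, which is
exactly what the building-block design allows.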



>
>   source.noteOn( 0 );
>
> (Not sure how streaming would fit into this approach.)
>
>
> AudioContext Should be Able to Play Backwards
> ---------------------------------------------
> If you can save out faster than you can listen, then you should be able to
> do some very cool rewinding tricks. Could the whole graph be rendered with
> x samples into the future and x samples available in the past, into a play
> buffer right before the final destination that could be spun through time?
> Just throwing ideas down on this more than anything.
> ...
>

I'm thinking of similar things, where we could have an AudioHistoryNode (not
a very good name, but I'll use it for now) which could be inserted anywhere
in the signal chain.  It would buffer up the last "x" seconds worth of
audio, which would be accessible for direct manipulation for some pretty
cool effects.  One example of such an effect would be if you connected live
audio input (microphone or guitar, etc.), then you could schedule bits of
sound to play at exact times which live in this "history" buffer.  Some of
these bits of sound could even be played backwards, even though you're
working on a live signal, since you're tapping into the recent past.  You
could get some great live processing granular effects this way, similar to
what you can get on a Kurzweil K2500 or in Max/MSP.



>
>
>
> source.noteOn( time )
> ---------------------
>
> I would suggest that when you call the noteOn() method without an
> argument, it should default to immediate, or '0'. EG: source.noteOn(0);
> and source.noteOn() should have the same behavior.
>

Yes, I agree.  We used to have this, but somebody on the WebKit team made
several API changes making things have "required" arguments.  We can change
this back.


>
>
> source.noteOn( time, callback? )
> --------------------------------
>
> It would be very useful to be able to pass a callback function to the
> noteOn method. This would be especially useful for creating loop-based
> applications. EG: I want an object in a game to flash every time a 4-bar
> loop starts over.
>

Yes, we need to think about a few callbacks such as loop-end reached, and
end-of-buffer reached (for non-looping).
We need to be a bit careful though about this loop-end callback, because if
the loop time is extremely small (say 0.25 milliseconds), then
this callback is going to be firing over and over again at an insanely high
rate which would kill the JS thread.  So, some kind of throttling would
have to be in place...
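
One way such throttling could work is a simple rate limiter wrapped around the
callback. Nothing here is specified anywhere; the helper name and the minimum
interval are assumptions for illustration:

```javascript
// Rate-limit a callback so a very short loop (e.g. 0.25 ms) can't
// flood the JS thread: calls arriving sooner than minInterval seconds
// after the last delivered call are silently dropped.
function throttle(callback, minInterval, getNow) {
  var last = -Infinity;           // time of the last delivered call
  return function () {
    var now = getNow();           // e.g. read context.currentTime
    if (now - last >= minInterval) {
      last = now;
      callback.apply(null, arguments);
    }
  };
}
```

A loop-end event wrapped this way still fires regularly, just never faster
than the chosen floor.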


>
>
>
> audioContext.atTime( callback )
> -------------------------------
>

Great minds think alike :)
https://bugs.webkit.org/show_bug.cgi?id=70061
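
Until something along those lines lands, an atTime()-style helper can be
approximated by polling the context clock from a JS timer. A hedged sketch,
with the clock and the scheduler passed in as functions (stand-ins for
`context.currentTime` and `setTimeout`):

```javascript
// Approximate atTime(when, callback) by polling a clock: keep
// re-scheduling a check until the clock passes `when`, then fire once.
function atTime(when, callback, getNow, schedule) {
  (function poll() {
    if (getNow() >= when) {
      callback(getNow());   // fire once, passing the actual fire time
    } else {
      schedule(poll);       // check again on the next tick
    }
  }());
}
```

In a page this might be called as
`atTime(t, cb, function () { return context.currentTime; }, function (f) { setTimeout(f, 10); })`,
trading timer granularity for simplicity; a real callback tied to the audio
clock would of course be more accurate.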



>
> Being able to set a callback based on the audioContext's internal clock
> would be very useful when visualizing music or timing events to
> voice-tracks. EG: I want to create a tutorial video, and change the text
> on the screen to present new information that relates to what is being
> spoken about in the audio track. It would be nice to rely on the
> audioContext's internal clock to time events rather than using JavaScript
> timers that are not in any way linked to the audio context.
>
>
>
> P O S S I B L E  -  B U G S
> ===========================
>
>
>
> AudioParam ...RampToValueAtTime( 0, 0 )
> ---------------------------------------
>
> Exponential ramps do not seem to work properly when starting from "zero"
> values. I would expect to hear the sound ramping in from 0 or out to 0;
> instead the transition appears to be immediate when it reaches the given
> time, much like the behavior of the setValueAtTime() method.
>

This is a consequence of the math behind exponential curves.  You can
always have an exponentially increasing curve from

value1 -> value2

where:

1) value1 < value2   (since we're increasing)
2) value1 != 0       (the starting value can't be zero or we'll have a "flat" curve)

This article shows in a little more detail what's going on:
http://en.wikipedia.org/wiki/Exponential_growth#Basic_formula

In that article "a" is the starting value and you can see that if it equals
zero, then it will be flat...

We need to include more references and technical background in the
specification about how the math works with exponential curves...
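
Concretely, the interpolation behind an exponential ramp from (value1, time1)
to (value2, time2) is v(t) = value1 * (value2 / value1)^((t - time1) / (time2 - time1)),
which is meaningless when value1 is 0. A quick sketch of the curve (the
function name is ours, not the API's):

```javascript
// Exponential-ramp interpolation between (v1, t1) and (v2, t2):
//   v(t) = v1 * (v2 / v1)^((t - t1) / (t2 - t1))
// With v1 === 0 the ratio v2 / v1 is undefined, which is why a ramp
// can't start at zero; start from a small epsilon (e.g. 0.001) instead.
function expRamp(v1, v2, t1, t2, t) {
  return v1 * Math.pow(v2 / v1, (t - t1) / (t2 - t1));
}
```

Since the curve multiplies the start value by a constant ratio per unit time,
an epsilon like 0.001 is well below audibility but still gives a smooth
fade-in.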


>
>
> BufferSource Connection Cost
> ----------------------------
>
> The cost of connecting buffer sources seems to increase with the lifetime
> of the audioContext. Every time you create a source and connect it to the
> audioContext it takes slightly longer. This is an interesting behavior I
> found when creating a test.
>

I'd be interested to get more details about your test.  There's no doubt
that more simultaneous sources (voices) playing will
increase the CPU load -- eventually to the point of overload which will
vary depending on how fast the machine is, how well
the code is optimized, etc.  That's pretty much how it works with audio
processing whether the code is running in a desktop-based
DAW, software synth, game engine...  It doesn't matter whether the code is
running directly in JavaScript, or in optimized assembly.
There's always going to be some limit.

But, I'm very interested in the details of your test, since I'd like to fix
any performance bugs which are not a consequence of the natural
performance hit I describe above.
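
For anyone wanting to reproduce the report, a micro-benchmark along these
lines could be run at different points in the context's lifetime and the
elapsed times compared (the helper name and injected timing source are
illustrative; in a page `now` would be something like `Date.now` bound to
milliseconds):

```javascript
// Time n create+connect cycles against a context. If connection cost
// really grows with context lifetime, repeated calls to this helper
// should return steadily increasing values.
function timeConnections(context, buffer, n, now) {
  var t0 = now();
  for (var i = 0; i < n; i++) {
    var source = context.createBufferSource();
    source.buffer = buffer;
    source.connect(context.destination);
  }
  return now() - t0;   // elapsed time for n cycles
}
```

Note that the sources created here are never started, so the measurement
isolates creation/connection cost from the per-voice rendering cost described
above.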

By the way, Raymond Toy and I updated the BiquadFilterNode section with
some more details:
https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/specification.html#BiquadFilterNode-section

Very soon, Ray is going to be adding some detailed technical information
about the exact frequency response curves with
a graphic UI where the filter curves may be adjusted by tweaking the filter
params (like cutoff frequency, etc.).




________________________________________________________

Original notes I sent to Chris are below.
________________________________________________________





=============================================
=  W E B  -  A U D I O  -  A P I  -  NOTES  =
=============================================




F E A T U R E  -  R E Q U E S T S
=================================


BufferSources That Do Not Die
-----------------------------
It is a very unfamiliar paradigm for the bufferSourceNode to die and become
unusable after it has played once. It seems like it would be better for the
audioContext to be able to play a bufferSource more than once. For example:
if I call:

   // I would expect to hear 2 sounds, firing 1 second apart.

   source.noteOn(0);
   source.noteOn(1);

   // If there was an overlap I would expect the audioContext to
   // a) copy, b) restart, or c) drop the audio source, depending on
   // behavior specified as a parameter of the sourceNode in question.
   //
   // a) copy: the audioContext copies the note and both play overlapped
   // b) restart: the audioContext rewinds the buffer pointer to its 0 index
   // c) drop: the audioContext copies the note only while fewer than x
   //          notes/voices are active, then drops further ones


In a similar fashion I would expect the audio of a sound to be restarted if
I did the following:

  setTimeout(function(){
    context.currentTime = 0;
  }, 1000); // Restart the audio clock one second in

  source.noteOn( 0 );

(Not sure how streaming would fit into this approach.)


AudioContext Should be Able to Play Backwards
---------------------------------------------
If you can save out faster than you can listen, then you should be able to
do some very cool rewinding tricks. Could the whole graph be rendered with
x samples into the future and x samples available in the past, into a play
buffer right before the final destination that could be spun through time?
Just throwing ideas down on this more than anything.
...



source.noteOn( time )
---------------------

I would suggest that when you call the noteOn() method without an
argument, it should default to immediate, or '0'. EG: source.noteOn(0);
and source.noteOn() should have the same behavior.



source.noteOn( time, callback? )
--------------------------------

It would be very useful to be able to pass a callback function to the noteOn
method. This would be especially useful for creating loop-based applications.
EG: I want an object in a game to flash every time a 4-bar loop starts over.



audioContext.atTime( callback )
-------------------------------

Being able to set a callback based on the audioContext's internal clock
would be very useful when visualizing music or timing events to
voice-tracks. EG: I want to create a tutorial video, and change the text on
the screen to present new information that relates to what is being spoken
about in the audio track. It would be nice to rely on the audioContext's
internal clock to time events rather than using JavaScript timers that are
not in any way linked to the audio context.



P O S S I B L E  -  B U G S
===========================



AudioParam ...RampToValueAtTime( 0, 0 )
---------------------------------------

Exponential ramps do not seem to work properly when starting from "zero"
values. I would expect to hear the sound ramping in from 0 or out to 0;
instead the transition appears to be immediate when it reaches the given
time, much like the behavior of the setValueAtTime() method.



BufferSource Connection Cost
----------------------------

The cost of connecting buffer sources seems to increase with the lifetime
of the audioContext. Every time you create a source and connect it to the
audioContext it takes slightly longer. This is an interesting behavior I
found when creating a test.
Received on Monday, 28 November 2011 19:40:32 GMT
