Re: AudioNode API Review Part 2 (Response to spec)

On Wed, Oct 6, 2010 at 10:48 AM, Joseph Berkovitz <joe@noteflight.com> wrote:

> Hi all,
>
> This is the 2nd installment of my review of the Web API.  This is not a
> comparison, but more of a response to the API on its own merits.
>
> ------------------------
>
> *AudioNode*
>
> Will there be an API convention whereby input/output indices are symbolic
> constants in the corresponding Node interface?
>

I think this could make sense in some of the AudioNodes, for example
AudioChannelSplitter and AudioChannelMerger.
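
For illustration, here's a rough sketch of how that routing could read.  The
index arguments assume connect() takes output and input indices as in
connect(destination, output, input); the factory-method names follow current
usage, and the symbolic constants at the end are purely hypothetical:

var context = new AudioContext();
var source = context.createBufferSource();
var leftGain = context.createGain();
var rightGain = context.createGain();
var splitter = context.createChannelSplitter(2);
var merger = context.createChannelMerger(2);

// Today the routing is expressed with bare numeric indices:
source.connect(splitter);
splitter.connect(leftGain, 0);        // splitter output 0 = left channel
splitter.connect(rightGain, 1);       // splitter output 1 = right channel
leftGain.connect(merger, 0, 0);       // into merger input 0
rightGain.connect(merger, 0, 1);      // into merger input 1
merger.connect(context.destination);

// With symbolic constants on the node interfaces (hypothetical), the same
// graph would be self-documenting:
// splitter.connect(leftGain, AudioChannelSplitter.OUTPUT_LEFT);
// rightGain.connect(merger, 0, AudioChannelMerger.INPUT_RIGHT);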


>
> *AudioParam*
>
> This appears to be at least partly metadata.  So perhaps there can be some
> attribute in AudioNode (*audioParams?*) that returns an array of the
> AudioParams associated with the concrete Node instance, to make the class's
> metadata discoverable in a regular way.
>

I included exactly such a discovery mechanism when I designed the parameter
API for Audio Units:
http://developer.apple.com/audio/audiocommunity.html

This is very important for a generic plugin architecture, where a host audio
application needs to load an arbitrary audio plugin and present a generic UI
for editing its parameters.  In the case of the AudioNode API, all of the
possible types of AudioNodes are already known in the specification, so it's
not as important to be able to discover the parameters in this way.  If we
get to the point where web developers can create and deliver their own
custom AudioNodes, as is possible with Audio Units, then we may need this
feature.  But doing that requires lots and lots of work dealing with
security issues (loading untrusted code) and is not yet possible.
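
Just as a thought experiment, a generic host UI over such an "audioParams"
array might look something like the sketch below.  The audioParams attribute
itself is hypothetical; the min/max/default metadata is the kind of thing
AudioParam already carries:

// Hypothetical: iterate a node's parameters without knowing its concrete type.
function buildGenericParamUI(node, container) {
  node.audioParams.forEach(function(param) {      // "audioParams" is not in the spec
    var slider = document.createElement("input");
    slider.type = "range";
    slider.min = param.minValue;                  // metadata carried by AudioParam
    slider.max = param.maxValue;
    slider.step = (param.maxValue - param.minValue) / 100;
    slider.value = param.value;
    slider.oninput = function() { param.value = Number(slider.value); };
    container.appendChild(slider);
  });
}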


>
> *DelayNode*
>
> It feels important to understand the philosophy and the resource
> consumption of a DelayNode.  Does it have the "auto-magical" effect of
> causing an isolated subgraph upstream from it to be scheduled to render
> itself later than normal (which would use minimal memory), or does it
> actually buffer the delayed samples from its input regardless?  If the
> former, then that's important to know, and it makes DelayNode a valuable
> building block in audio scheduling (a la the Performance object I described
> in previous posts).  If the latter, then it feels important to communicate
> its likely resource usage in the spec.
>

It is the latter.  It's a normal digital delay building block where the
delay time can be smoothly changed.  It is not intended for scheduling.
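
For example, a minimal echo built from it could look like the sketch below
(factory names follow current usage; the draft's gain node is AudioGainNode).
Because the node really does buffer its input, its memory cost is roughly
sampleRate * maxDelayTime samples per channel:

var context = new AudioContext();
var input = context.createGain();        // whatever feeds the effect
var delay = context.createDelay(1.0);    // maxDelayTime = 1 second of buffering
var feedback = context.createGain();

delay.delayTime.value = 0.3;             // 300 ms echo; can be changed smoothly
feedback.gain.value = 0.4;               // each repeat at 40% of the previous level

input.connect(delay);
delay.connect(feedback);
feedback.connect(delay);                 // feedback loop
input.connect(context.destination);      // dry path
delay.connect(context.destination);      // wet path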




> *AudioBufferSourceNode*
>
> Needs an integer r/w attribute *loopStart* at a minimum.  *loopEnd* would
> be useful too, since many wavetable samples are supplied by vendors with
> both a start and end frame (even though the portion after the end frame is
> useless -- go figure!).
>

Yes, I need to work on getting looping support into AudioBufferSourceNode as
you suggest.  Actually, the portion after the end frame is *useful*: when the
release phase of the envelope is entered, playback can exit the loop and
continue through those samples past the loop end.
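
A sketch of what those attributes could look like on AudioBufferSourceNode.
The attribute names and their units (seconds) are only a suggestion here, and
loopStartFrame/loopEndFrame stand for vendor-supplied loop points in frames:

var source = context.createBufferSource();
source.buffer = sampleBuffer;            // assumed: a pre-decoded AudioBuffer
source.loop = true;
source.loopStart = loopStartFrame / sampleBuffer.sampleRate;
source.loopEnd = loopEndFrame / sampleBuffer.sampleRate;
source.connect(context.destination);
source.noteOn(0);

// At release time, clearing the loop flag lets playback run past loopEnd
// and through the trailing samples mentioned above:
// source.loop = false;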


>
> What happens when noteOn() is invoked prior to the node's connection to a
> destination?  If the connection occurs before the note plays, is the play
> request honored on a delayed basis?  If the connection follows playback,
> does the note start playing in the middle?
>

It should be fine to invoke noteOn() before the node is connected.  When it
is connected, the note will start playing immediately if the scheduled time
has already passed; otherwise it will play at its scheduled time.  This is
exactly the same as if noteOn() were called after the node is connected.
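
In other words, both of these orderings are meant to produce the same result
(sampleBuffer stands for a pre-decoded AudioBuffer):

var source = context.createBufferSource();
source.buffer = sampleBuffer;

// Schedule first, connect afterwards:
source.noteOn(context.currentTime + 0.5);
source.connect(context.destination);

// ...behaves the same as connecting first:
// source.connect(context.destination);
// source.noteOn(context.currentTime + 0.5);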


>
> I don't like the names noteOn()/noteOff(), they feel a little too musical
> (this coming from a music guy, but, hey...)  I prefer startAt() and
> stopAt(), which also transfer to the "automation/modulation" realm without
> requiring a name change.
>

Yes, these might be better names.



>
> What happens when playbackRate is dynamically changed?  If a note is
> playing, is this done in a way that preserves the current sample frame
> counter to avoid generating artifacts?
>

Yes, it definitely preserves the current sample frame and doesn't generate
artifacts.
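
Since playbackRate is an AudioParam, it can also be ramped while the note is
playing; a small sketch (the automation method names follow the current
AudioParam interface):

var source = context.createBufferSource();
source.buffer = sampleBuffer;            // assumed: a pre-decoded AudioBuffer
source.connect(context.destination);
source.noteOn(0);

// Glide smoothly from normal speed up a whole step over two seconds;
// the source keeps its current read position, so there are no clicks.
var t = context.currentTime;
source.playbackRate.setValueAtTime(1.0, t);
source.playbackRate.linearRampToValueAtTime(1.122, t + 2.0);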


>
> noteGrainOn() seems too clever.  What if one doesn't want a "grain window"
> with a smooth fadeout?  I would prefer a clear way to play a slice of some
> sample with no accompanying smarts; windowing can be accomplished by a
> downstream node.  I think the "grain" idea is overspecification, I want
> noteSegmentOn() or something like that.
>

I think that once we're able to attach amplitude envelope curves to
AudioBufferSourceNode, a "grain window" will just be an envelope.  I'm
assuming here that envelopes can have more complex shapes than a traditional
ADSR.  I think an envelope can be an AudioCurve with the extra smarts to deal
with "gate-on" and "gate-off".
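
To make that concrete, here is a sketch of a grain assembled from plain
pieces: play a slice of the buffer and shape it with a gain envelope.  The
start(when, offset, duration) call stands in for the un-windowed
"noteSegmentOn" behavior requested above, and the ramp calls stand in for an
attached AudioCurve; any window shape (or none) could be used:

function playGrain(context, buffer, when, offset, duration) {
  var source = context.createBufferSource();
  var envelope = context.createGain();

  source.buffer = buffer;
  source.connect(envelope);
  envelope.connect(context.destination);

  // One possible window: a simple linear fade-in / fade-out.
  envelope.gain.setValueAtTime(0, when);
  envelope.gain.linearRampToValueAtTime(1, when + duration / 2);
  envelope.gain.linearRampToValueAtTime(0, when + duration);

  source.start(when, offset, duration);
}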


>
> I think the envelope aspects of note playback need to be cleared up.  Is
> there going to be some additional attribute representing an envelope, or
> will envelopes be supplied by downstream nodes?
>

That's a good question.  My initial inclination is to suggest that it should
be a simple attribute in AudioBufferSourceNode.  But on further reflection,
an amplitude envelope could be supplied by an AudioGainNode with an attached
AudioCurve (on its "gain" attribute).  Then it could be inserted at
arbitrary points in the graph, not just directly after an
AudioBufferSourceNode.
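
A sketch of that arrangement, with the gain node shaping a submix of several
sources rather than a single AudioBufferSourceNode.  The createGain() factory
and the ramp calls follow current usage and stand in for an AudioGainNode
with an attached AudioCurve; sourceA and sourceB are assumed existing nodes:

var envelope = context.createGain();
envelope.connect(context.destination);

sourceA.connect(envelope);
sourceB.connect(envelope);

// A rough ADSR-like shape expressed as gain automation:
var t = context.currentTime;
envelope.gain.setValueAtTime(0, t);
envelope.gain.linearRampToValueAtTime(1.0, t + 0.01);   // attack
envelope.gain.linearRampToValueAtTime(0.7, t + 0.2);    // decay to sustain level
envelope.gain.setValueAtTime(0.7, t + 1.0);             // hold until "gate-off"
envelope.gain.linearRampToValueAtTime(0, t + 1.3);      // release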



> *Dynamic Lifetime*
>
> This is a little unclear on the all-important question of when nodes are
> added to the graph -- it speaks more in terms of tear-down.
>
> The Digital Sheet Music use case (which perhaps is already addressed by the
> current implementation) encourages the creation of a very large number
> (often in the tens of thousands) of "note" style nodes which represent the
> events in a score, and which are all connected to a single destination.
> Ideally this is not a problem, since at any given time only a small number
> (often 10 or less) are actually being played.  I can imagine the
> implementation of Dynamic Lifetime taking care of this just fine by
> "protecting" the destination node from seeing notes that are not currently
> active.
>
> If this is true, then it should be made clear.  This aspect of the Web API
> is one of its most important features, and programmers will really need to
> understand its capabilities in order to take advantage of it properly.
>
> If this is not true, then that also needs clarification, since developers
> will have to play annoying tricks and games to avoid scheduling too many
> nodes and overwhelming the engine.  Hope this isn't the case.


I believe that in the vast majority of use cases, the "notes" which are
scheduled will either be scheduled to play "now" or in the very near future.
 Scheduling very large numbers of notes many minutes into the future isn't
something which I would recommend.  In your main use case (playing back a
MIDI-like sequence), it's really not that difficult to schedule notes as
time goes by.  But, I do agree that as many optimizations as possible should
be made in the engine to avoid processing overhead when notes have been
scheduled to play, but have not yet begun.  So to phrase this another way:

* Document best practices for API use: suggest not scheduling notes many
minutes into the future (a sketch of this pattern follows below)
* Underlying engine implementation: include as many optimizations as possible
to handle large processing graphs efficiently
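
A rough sketch of that first point: a timer wakes up several times a second
and only schedules the notes that fall inside a short look-ahead window
(score and playNote are application-level stand-ins, not part of the API):

var LOOKAHEAD = 0.5;      // seconds of audio scheduled ahead of currentTime
var INTERVAL = 100;       // milliseconds between scheduler wake-ups
var nextIndex = 0;

function scheduler() {
  var horizon = context.currentTime + LOOKAHEAD;
  // Only create and schedule the handful of notes inside the window,
  // rather than building tens of thousands of nodes up front.
  while (nextIndex < score.length && score[nextIndex].time < horizon) {
    playNote(score[nextIndex]);       // creates a source node and calls noteOn()
    nextIndex++;
  }
  if (nextIndex < score.length) {
    setTimeout(scheduler, INTERVAL);
  }
}
scheduler();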

Best Regards,
Chris

Received on Thursday, 7 October 2010 20:46:44 UTC