Re: Web Audio API Proposal

Hi Jer, thanks for your comments.  I'll try to address the points you bring
up:



> Hi Chris,
>
> I'm in the midst of reviewing your spec, and I have a few comments and
> suggestions:
>
>
>    - *Ownership*
>
>
> Building the concept of lifetime-management into the API for AudioNodes
> seems unnecessary.  Each language has its own lifetime-management concepts,
> and in at least one case (JavaScript) the language lifetime-management will
> directly contradict the "ownership" model defined here.  For example, in
> JavaScript the direction of the "owner" reference implies that the AudioNode
> owns its "owner", not vice versa.
>
>
I think the idea of ownership is important, and I'll try to explain why.
 There's a difference between the JavaScript object (the AudioNode) and the
underlying C++ object which implements its behavior.  The lifetime of the
JavaScript object itself behaves exactly the same as other JavaScript
objects, with reference counting and garbage collection.  However, the
underlying/backing C++ object may (in some cases) persist after the
JavaScript object no longer exists.  For example, consider the simple case
of triggering a sound to play with the following JavaScript:


            function playSound() {
                var source = context.createBufferSource();
                source.buffer = dogBarkingBuffer;
                source.connect(context.output);
                source.noteOn(0); // start playback now; no reference to source is kept
            }

The JavaScript object *source* may be garbage collected immediately after
playSound() returns, but the underlying C++ object representing it may
very well still be connected to the rendering graph, generating the sound of
the barking dog.  At some later time, when the sound has finished playing, it
will automatically be removed from the rendering graph in the realtime
thread (which runs asynchronously from the JavaScript thread).  So,
strictly speaking, the idea of *ownership* comes into play more at the level
of the underlying C++ objects than the JavaScript objects themselves.  If
you keep these ideas in mind while looking at my dynamic lifetime example in
the specification, maybe things will make a bit more sense.

> Additionally, it seems that it's currently impossible to change the "owner"
> of an AudioNode after that node has been created. Was the "owner" attribute
> left out of the AudioNode API purposefully?
>
>
*owner* could be added as a read-only attribute, but I don't think it's the
kind of thing which should change after the object has been created.

>
>    - *Multiple Outputs*
>
>
> While there is an explicit AudioMixerNode, there's no equivalent
> AudioSplitterNode, and thus no explicit way to mux the output of one node to
> multiple inputs.
>
>
It isn't necessary to have an AudioSplitterNode because it's possible to
connect an output to multiple inputs directly (this is called *fanout*).
 You may be thinking in terms of AudioUnits, which require an explicit
splitter.  I remember when we made that design decision with AudioUnits, but
it is not a problem here.

So *fanout* from an output to multiple inputs is supported without fuss or
muss.
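
To make the fanout wiring concrete, here's a minimal sketch.  The makeNode()
helper below is a hypothetical stand-in for real AudioNodes (so the example is
self-contained); with the actual API you would simply call connect() on the
same source several times, as in the Section 17 sample code:

```javascript
// Hypothetical stand-in for an AudioNode, used only to illustrate wiring;
// in the real API these objects would come from an AudioContext.
function makeNode(name) {
    return {
        name: name,
        inputs: [],                  // nodes currently feeding this node
        connect: function (dest) {   // mirrors AudioNode.connect(destination)
            dest.inputs.push(this);
        }
    };
}

var source    = makeNode("source");
var mainMixer = makeNode("mainMixer");
var send1     = makeNode("send1");
var send2     = makeNode("send2");

// Fanout: the same output feeds three inputs via repeated connect() calls;
// no explicit splitter node is needed.
source.connect(mainMixer);
source.connect(send1);
source.connect(send2);
```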

> In the sample code attached to Section 17, a single source (e.g. source1)
> is connected to multiple inputs, merely by calling "connect()" multiple
> times with different input nodes.  This doesn't match the AudioNode API,
> where the "connect()" function takes three parameters, the input node, the
> output index, and  the input index.  However, I find the sample code to be a
> much more natural and easier to use API.  Perhaps you may want to consider
> adopting this model for multiple inputs and outputs everywhere.
>
>
Maybe I should make the API description more explicit here, but the
sample code *does* match the API: the *output* and *input* parameters
are optional and default to 0.
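
As a sketch of what "optional and default to 0" means in practice, here is a
hypothetical connect() with the three-parameter signature from the spec; the
makeNode() helper is an illustrative stand-in, not the real AudioNode:

```javascript
// Illustrative stand-in; the real connect(destination, output, input)
// lives on AudioNode, with output and input defaulting to 0.
function makeNode() {
    return {
        connections: [],
        connect: function (dest, output, input) {
            // Both indices are optional and default to 0.
            if (output === undefined) output = 0;
            if (input === undefined) input = 0;
            this.connections.push({ dest: dest, output: output, input: input });
        }
    };
}

var source = makeNode();
var mixer  = makeNode();

source.connect(mixer);       // shorthand, as in the Section 17 sample code...
source.connect(mixer, 0, 0); // ...for exactly this call
```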

>
>
>    - *Multiple Inputs*
>
>
> This same design could be applied to multiple inputs, as in the case with
> the mixers.  Instead of manually creating inputs, they could also be created
> dynamically, per-connection.
>
> There is an explicit class, AudioMixerNode, which creates
> AudioMixerInputNodes,  demuxes their outputs together, and adjusts the final
> output gain.  It's somewhat strange that the AudioMixerNode can create
> AudioMixerInputNodes; that seems to be the responsibility of the
> AudioContext.  And it seems that this section could be greatly simplified by
> dynamically creating inputs.
>
> Let me throw out another idea.  AudioMixerNode and AudioMixerInputNode
> would be replaced by an AudioGainNode.  Every AudioNode would be capable of
> becoming an audio mixer by virtue of dynamically-created demuxing inputs.
>  The API would build upon the revised AudioNode above:
>
>
> interface AudioGainNode : AudioNode
>
> {
>
>         AudioGain gain;
>
>         void addGainContribution(in AudioGain);
>
> }
>
>
> The sample code in Section 17 would then go from:
>
>
>     mainMixer = context.createMixer();
>     send1Mixer = context.createMixer();
>     send2Mixer = context.createMixer();
>
>     g1_1 = mainMixer.createInput(source1);
>     g2_1 = send1Mixer.createInput(source1);
>     g3_1 = send2Mixer.createInput(source1);
>     source1.connect(g1_1);
>     source1.connect(g2_1);
>     source1.connect(g3_1);
>
>
> to:
>
>
>     mainMixer = context.createGain();
>     send1Mixer = context.createGain();
>     send2Mixer = context.createGain();
>
>     source2.connect(mainMixer);
>     source2.connect(send1Mixer);
>     source2.connect(send2Mixer);
>
> Per-input gain could be achieved by adding an inline AudioGainNode between
> a source output and its demuxing input node:
>
>
> var g1_1 = context.createGain();
>
> source2.connect(g1_1);
>
> g1_1.connect(mainMixer);
>
> g1_1.gain.value = 0.5;
>
>
> If the default constructor for AudioNodes is changed from "in AudioNode
> owner" to "in AudioNode input", then a lot of these examples can be cleaned
> up and shortened.  That's just syntactic sugar, however. :)
>
>
It doesn't look to me like it actually shortens the code.  And I'm not sure
we can get rid of the idea of *owner*, due to the dynamic lifetime issues I
tried to describe above.  But maybe you can explain some more.

>
>
>    - *Constructors*
>
>
> Except for the AudioContext.output node, every other created AudioNode
> needs to be connected to a downstream AudioNode input.  For this reason, it
> seems that the constructor functions should be changed to take an "in
> AudioNode destination = 0" parameter (instead of an "owner" parameter).
>  This would significantly reduce the amount of code needed to  write an
> audio graph.  In addition, anonymous AudioNodes could be created and
> connected without having to specify local variables:
>
>     compressor = context.createCompressor(context.output);
>
>     mainMixer = context.createGain(compressor);
>
>
> or:
>
>     mainMixer = context.createGain(
> context.createCompressor(context.output));
>
>
I like the idea, but it may not always be desirable to connect the AudioNode
immediately upon construction.  For example, there may be cases where an
AudioNode is created and then later passed to some other function, where it
is finally known where it needs to be connected.  I'm sure we can come up
with variants of the constructors to handle the various cases.
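
For example, the create-now, connect-later pattern might look like the sketch
below.  The routeTo() function and the makeNode() stand-in are hypothetical;
in the real API the nodes would be AudioNodes created from an AudioContext:

```javascript
// Hypothetical stand-in node so the wiring is self-contained and checkable.
function makeNode(name) {
    return {
        name: name,
        inputs: [],
        connect: function (dest) { dest.inputs.push(this); }
    };
}

// The node is created here, with no destination known yet...
var source = makeNode("source");

// ...and handed to some other function that decides the routing later.
function routeTo(node, destination) {
    node.connect(destination);
}

var output = makeNode("output");
routeTo(source, output);
```

A destination parameter on the constructors could coexist with this pattern
by simply being optional.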

Jer, thanks again for your comments.  I appreciate it...
Chris

Received on Tuesday, 15 June 2010 22:52:04 UTC