- From: Chris Rogers <crogers@google.com>
- Date: Wed, 16 Jun 2010 12:21:36 -0700
- To: Jer Noble <jer.noble@apple.com>
- Cc: public-xg-audio@w3.org
- Message-ID: <AANLkTinwCh8TX0ouQ2AJ3mcosM31NQMvNWtxDA8cLC0h@mail.gmail.com>
Hi Jer, it might be easiest to discuss these ideas offline. It'd be great
to look at it together at a whiteboard, then we can get back to the group
with our results.
On Tue, Jun 15, 2010 at 5:20 PM, Jer Noble <jer.noble@apple.com> wrote:
>
> On Jun 15, 2010, at 3:51 PM, Chris Rogers wrote:
>
> Hi Jer, thanks for your comments. I'll try to address the points you bring
> up:
>
>> Hi Chris,
>>
>> I'm in the midst of reviewing your spec, and I have a few comments and
>> suggestions:
>>
>>
>> - *Ownership*
>>
>>
>> Building the concept of lifetime-management into the API for AudioNodes
>> seems unnecessary. Each language has its own lifetime-management concepts,
>> and in at least one case (JavaScript) the language lifetime-management will
>> directly contradict the "ownership" model defined here. For example, in
>> JavaScript the direction of the "owner" reference implies that the AudioNode
>> owns its "owner", not vice versa.
>>
>>
> I think the idea of ownership is important, and I'll try to explain why.
> There's a difference between the JavaScript object (the AudioNode) and the
> underlying C++ object which implements its behavior. The JavaScript object
> itself behaves exactly like any other JavaScript object with respect to
> reference counting and garbage collection. However, the underlying/backing
> C++ object may (in some cases) persist after the JavaScript object no
> longer exists. For example, consider the simple case of triggering a sound
> to play with the following JavaScript:
>
>
> function playSound() {
>     var source = context.createBufferSource();
>     source.buffer = dogBarkingBuffer;
>     source.connect(context.output);
>     source.noteOn(0);
> }
>
> The JavaScript object *source* may be garbage collected immediately after
> playSound() is called, but the underlying C++ object representing it may
> very well still be connected to the rendering graph, generating the sound
> of the barking dog. At some later time, when the sound has finished
> playing, it will automatically be removed from the rendering graph in the
> realtime thread (which runs asynchronously from the JavaScript thread).
> So, strictly speaking, the idea of *ownership* comes into play more at the
> level of the underlying C++ objects than of the JavaScript objects
> themselves. If you keep these ideas in mind while looking at my dynamic
> lifetime example in the specification, maybe things will make a bit more
> sense.
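>
> If it helps, here is a rough sketch (written in JavaScript purely for
> illustration -- the engine-side names are made up and not part of the
> spec) of how the rendering side keeps a playing note alive:
>
> // Hypothetical engine internals, sketched in JavaScript.
> var activeSources = [];  // the rendering graph's own references
>
> function engineNoteOn(backingObject) {
>     // The graph now references the backing object, so it stays alive
>     // even after script drops every reference to the JS wrapper.
>     activeSources.push(backingObject);
> }
>
> function engineNoteFinished(backingObject) {
>     // Playback is done: drop the graph's reference, and the backing
>     // object can be destroyed.
>     activeSources.splice(activeSources.indexOf(backingObject), 1);
> }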
>
>
> Even in that case, the "ownership" seems like an underlying implementation
> detail. If the C++ object can live on after the JavaScript GC has "deleted"
> the JS object, then is the lifetime management concept of the "owner" (as
> exposed in JavaScript) really necessary?
>
> It seems that in JavaScript you already have the ability to create
> one-shot, self-destructing AudioNodes, and the concept of "ownership" is
> as easy to implement as adding a global "var" pointing to an AudioNode. In
> fact, without the "owner" concept, your dynamic lifetime example would
> work exactly the same:
>
> function playSound() {
>     var oneShotSound = context.createBufferSource();
>     oneShotSound.buffer = dogBarkingBuffer;
>
>     // No "owner" params needed: the filter, panner, and mixer input
>     // will still go away when the sound is done.
>     var lowpass = context.createLowPass2Filter(); // no owner
>     var panner = context.createPanner();          // no owner
>     var mixerInput2 = mixer.createInput();        // no owner
>
>     // Make connections
>     oneShotSound.connect(lowpass);
>     lowpass.connect(panner);
>     panner.connect(mixerInput2); // this used to read: panner.connect(mixer)
>
>     panner.listener = listener;
>
>     oneShotSound.noteOn(0.75);
> }
>
>
> I've modified the example to remove the "owner" params to the constructor
> functions. At the point where "oneShotSound.noteOn(0.75)" is called, there
> are references in scope to *oneShotSound*, *lowpass*, *panner*, *mixer*,
> and *mixerInput2*. Once playSound() returns, the local references
> disappear. *oneShotSound* could be immediately GC'd, but it seems to make
> more sense that *oneShotSound* holds a reference to itself as long as it
> is playing (or is scheduled to play).
>
> At some time in the future, the scheduled noteOn() finishes. It releases
> the reference to itself, and thus no one has a reference to *oneShotSound*
> any longer, so it is GC'd. *oneShotSound* was the only holder of a
> reference to *lowpass*, so *lowpass* is GC'd. *lowpass* was the only
> holder of a reference to *panner*, so *panner* is GC'd. And so on.
>
> The end result is that all the filters and sources created inside
> playSound() are removed from the graph as soon as *oneShotSound* finishes
> playing, which is exactly the same behavior as when the filters have an
> explicit owner. So, I don't see that exposing an "owner" property adds any
> functionality.
>
>
> Additionally, it seems that it's currently impossible to change the "owner"
>> of an AudioNode after that node has been created. Was the "owner" attribute
>> left out of the AudioNode API purposefully?
>>
>>
> *owner* could be added as a read-only attribute, but I don't think it's
> the kind of thing that should change after the object has been created.
>
>
>> - *Multiple Outputs*
>>
>>
>> While there is an explicit AudioMixerNode, there's no equivalent
>> AudioSplitterNode, and thus no explicit way to mux the output of one node to
>> multiple inputs.
>>
>>
> It isn't necessary to have an AudioSplitterNode because it's possible to
> connect an output to multiple inputs directly (this is called *fanout*).
> You may be thinking in terms of AudioUnits, which require an explicit
> splitter. I remember when we made that design decision with AudioUnits,
> but it is not a problem here.
>
> So *fanout* from an output to multiple inputs is supported without fuss or
> muss.
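>
> For example (a sketch -- *source*, *reverb*, and *chorus* are placeholder
> nodes as in the other examples):
>
> // One output fanning out to three inputs -- no splitter node needed:
> source.connect(reverb);
> source.connect(chorus);
> source.connect(context.output);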
>
>
> That seems reasonable. The spec should be updated to specifically call
> that out, since it confused the heck out of me. However, see the next
> comment:
>
>> In the sample code attached to Section 17, a single source (e.g. source1)
>> is connected to multiple inputs merely by calling "connect()" multiple
>> times with different input nodes. This doesn't match the AudioNode API,
>> where the "connect()" function takes three parameters: the input node,
>> the output index, and the input index. However, I find the sample code to
>> be a much more natural and easier-to-use API. Perhaps you may want to
>> consider adopting this model for multiple inputs and outputs everywhere.
>>
>>
> Maybe I should change the API description to be more explicit here, but
> the sample code *does* match the API, because the *output* and *input*
> parameters are optional and default to 0.
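>
> So, for example, these two calls mean the same thing:
>
> source1.connect(g1_1);        // output and input indices default to 0
> source1.connect(g1_1, 0, 0);  // connect(destination, output, input)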
>
>
> Okay then, but if every output is capable of connecting to multiple inputs,
> why would you need multiple outputs? Will any AudioNode ever have a
> "numberOfOutputs" > 1, and if so, what functionality does that provide above
> and beyond a single, fanout output?
>
>>
>> - *Multiple Inputs*
>>
>>
>> This same design could be applied to multiple inputs, as in the case with
>> the mixers. Instead of manually creating inputs, they could also be created
>> dynamically, per-connection.
>>
>> There is an explicit class, AudioMixerNode, which creates
>> AudioMixerInputNodes, mixes their outputs together, and adjusts the final
>> output gain. It's somewhat strange that the AudioMixerNode can create
>> AudioMixerInputNodes; that seems to be the responsibility of the
>> AudioContext. And it seems that this section could be greatly simplified
>> by dynamically creating inputs.
>>
>> Let me throw out another idea. AudioMixerNode and AudioMixerInputNode
>> would be replaced by an AudioGainNode. Every AudioNode would be capable
>> of becoming an audio mixer by virtue of dynamically created summing
>> inputs. The API would build upon the revised AudioNode above:
>>
>>
>> interface AudioGainNode : AudioNode {
>>     AudioGain gain;
>>     void addGainContribution(in AudioGain);
>> }
>>
>>
>> The sample code in Section 17 would then go from:
>>
>>
>> mainMixer = context.createMixer();
>> send1Mixer = context.createMixer();
>> send2Mixer = context.createMixer();
>>
>> g1_1 = mainMixer.createInput(source1);
>> g2_1 = send1Mixer.createInput(source1);
>> g3_1 = send2Mixer.createInput(source1);
>> source1.connect(g1_1);
>> source1.connect(g2_1);
>> source1.connect(g3_1);
>>
>>
>> to:
>>
>>
>> mainMixer = context.createGain();
>> send1Mixer = context.createGain();
>> send2Mixer = context.createGain();
>>
>> source2.connect(mainMixer);
>> source2.connect(send1Mixer);
>> source2.connect(send2Mixer);
>>
>> Per-input gain could be achieved by adding an inline AudioGainNode
>> between a source output and its summing input node:
>>
>>
>> var g1_1 = context.createGain();
>> source2.connect(g1_1);
>> g1_1.connect(mainMixer);
>> g1_1.gain.value = 0.5;
>>
>>
>> If the default constructor for AudioNodes is changed from "in AudioNode
>> owner" to "in AudioNode input", then a lot of these examples can be cleaned
>> up and shortened. That's just syntactic sugar, however. :)
>>
>>
> It doesn't look like it actually shortens the code to me. And I'm not sure
> we can get rid of the idea of *owner* due to the dynamic lifetime issues I
> tried to describe above. But maybe you can explain some more.
>
>
> Sure thing.
>
> The code above isn't much shorter, granted. However, in your code example,
> *send1Mixer* and *send2Mixer* could then be removed, and each of the
> AudioGainNodes could be connected directly to *reverb* and *chorus*,
> eliminating the need for those mixer nodes. Additionally, any of the
> *g#_#* filters that is extraneous (in that it will never have a gain !=
> 1.0) can be left out. This has the potential to make the audio graph much,
> much simpler.
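>
> For example, the send into *reverb* could become just this (a sketch,
> using the proposed createGain(); the gain value is arbitrary):
>
> // Per-send gain feeds the effect directly; send1Mixer goes away.
> var g2_1 = context.createGain();
> source1.connect(g2_1);
> g2_1.connect(reverb);
> g2_1.gain.value = 0.25;  // send level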
>
> Also, in your "playNote()" example above, you have to create an
> AudioMixerNode and an AudioMixerInputNode just to add a gain effect to a
> simple one-shot note. With the above change to the API, those two nodes
> would be replaced by a single AudioGainNode.
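>
> That is, something like (again a sketch with the proposed createGain()):
>
> var noteGain = context.createGain();  // replaces mixer + mixer input
> oneShotSound.connect(noteGain);
> noteGain.connect(context.output);
> noteGain.gain.value = 0.5;            // the per-note gain effect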
>
> Also, eliminating the AudioMixerNode interface removes one class from the
> IDL, and eliminates the single piece of API where an AudioNode is created by
> something other than the AudioContext, all without removing any
> functionality.
>
> Let me give some sample code which demonstrates how much shorter the
> client's code could be. From:
>
> function playSound() {
>     var source = context.createBufferSource();
>     source.buffer = dogBarkingBuffer;
>     var reverb = context.createReverb();
>     source.connect(reverb);
>     var chorus = context.createChorus();
>     source.connect(chorus);
>     var mainMixer = context.createMixer();
>     var gain1 = mainMixer.createInput();
>     reverb.connect(gain1);
>     var gain2 = mainMixer.createInput();
>     chorus.connect(gain2);
>     mainMixer.connect(context.output);
>     source.noteOn(0);
> }
>
>
> to:
>
> function playSound() {
>     var source = context.createBufferSource();
>     source.buffer = dogBarkingBuffer;
>     var reverb = context.createReverb();
>     source.connect(reverb);
>     var chorus = context.createChorus();
>     source.connect(chorus);
>     reverb.connect(context.output);
>     chorus.connect(context.output);
>     source.noteOn(0);
> }
>
>
> Or the same code with the Constructors addition below:
>
> function playSound() {
>     var reverb = context.createReverb(context.output);
>     var chorus = context.createChorus(context.output);
>     var source = context.createBufferSource([reverb, chorus]);
>     source.buffer = dogBarkingBuffer;
>     source.noteOn(0);
> }
>
>
>
> Okay, so I kind of cheated and passed an Array to the
> "createBufferSource()" constructor. But that seems like a simple addition
> which could come in very handy, especially given the "fanout" nature of
> inputs. Taken together, this brings the body of the function from thirteen
> statements down to five.
>
> Of course, not all permutations will be as amenable to simplification as
> the function above. But I believe that even the worst case scenario is
> still an improvement.
>
>
>> - *Constructors*
>>
>>
>> Except for the AudioContext.output node, every other created AudioNode
>> needs to be connected to a downstream AudioNode input. For this reason, it
>> seems that the constructor functions should be changed to take an "in
>> AudioNode destination = 0" parameter (instead of an "owner" parameter).
>> This would significantly reduce the amount of code needed to write an
>> audio graph. In addition, anonymous AudioNodes could be created and
>> connected without having to specify local variables:
>>
>> compressor = context.createCompressor(context.output);
>> mainMixer = context.createGain(compressor);
>>
>> or:
>>
>> mainMixer = context.createGain(
>>     context.createCompressor(context.output));
>>
>>
> I like the idea, but it may not always be desirable to connect the
> AudioNode immediately upon construction. For example, there may be cases
> where an AudioNode is created and only later passed to some other
> function, where it finally becomes known where it needs to be connected.
> I'm sure we can come up with variants on the constructors to handle the
> various cases.
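>
> For example (a sketch -- *attachToSubmix* and *someSubmixInput* are just
> stand-ins for application code):
>
> var panner = context.createPanner();  // created now, destination unknown
>
> function attachToSubmix(node, submixInput) {
>     // the connection finally happens here, once the routing is known
>     node.connect(submixInput);
> }
>
> // ...later:
> attachToSubmix(panner, someSubmixInput);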
>
>
> Oh, I'm not suggesting that constructors replace the connect() function.
> That's why I called this change "syntactic sugar". The same effect could
> be had by returning "this" from the connect() function, allowing such
> constructions as:
>
> source2.connect(g1_2).connect(g2_2).connect(g3_2);
>
> and:
>
> mainMixer = context.createGain().connect(
>     context.createCompressor().connect(context.output));
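>
> To spell out what returning *this* implies: the first chain above is
> fanout from *source2*, and the second builds the serial chain from the
> inside out. That is:
>
> // source2.connect(g1_2).connect(g2_2).connect(g3_2) is equivalent to:
> source2.connect(g1_2);
> source2.connect(g2_2);
> source2.connect(g3_2);
>
> // ...and the mainMixer line is equivalent to:
> var compressor = context.createCompressor().connect(context.output);
> mainMixer = context.createGain().connect(compressor);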
>
>
> But I think the connect-in-the-constructor alternative is less confusing.
>
> Thanks again!
>
> -Jer
>
Received on Wednesday, 16 June 2010 19:22:08 UTC