Re: Channel up-mixing / down-mixing, how to compute num channels of connection? from Ehsan Akhgari on 2013-07-17 (public-audio@w3.org from July to September 2013)

From: Ehsan Akhgari <ehsan.akhgari@gmail.com>
Date: Wed, 17 Jul 2013 16:04:48 -0400
To: s p <sebpiq@gmail.com>
Cc: "public-audio@w3.org" <public-audio@w3.org>
Message-ID: <CANTur_48B7xBnZKvB2RFWrUpg-RM0zSHkjm50uhYyTz5z38GBw@mail.gmail.com>
On Wed, Jul 17, 2013 at 3:05 PM, s p <sebpiq@gmail.com> wrote:

> > Hmm out of curiosity, how can you do this lazily?
>
> About this, you can check out my code (keep in mind this is a
> work-in-progress, not released yet):
> https://github.com/sebpiq/node-web-audio-api/blob/master/lib/audioports.js#L84
>

That's some serious amount of JS.  :-)  Hope I'm not misunderstanding this.


> Basically, the way I am thinking of doing it right now is pulling the audio
> from the bottom to the top of the graph:
>
> ... -> sink -> sink's inputs -> source's outputs -> source -> ...
>
> So - only if 'computedNumberOfChannels' hasn't been computed yet - each
> input when it receives the blocks of audio pulled from the outputs
> upstream, inspects them to see how many channels they have, and then
> computes 'computedNumberOfChannels'. This is quite straightforward in fact.
> But if you see a problem with this approach please tell me! I haven't yet
> dived into implementing AudioNodes.
>

Hmm, do you perform this calculation on each block?  Note that the number
of input/output channels for a given AudioNode might change as you're
running the graph from one block to the next one, so you should make sure
to clear 'computedNumberOfChannels' on each iteration.  Also, looking at
your code, two problems caught my eye.  When up-mixing in discrete mode,
you should zero-fill the channels that do not appear on the input(s) (JS
might do this for you automatically -- just something to check for.)  Also,
on line 113 I think you meant to check for "speaker".
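
As a rough sketch of the discrete-mode point above (not taken from the linked code; the `mixDiscrete` name and the one-Float32Array-per-channel block representation are assumptions made for illustration), up-mixing copies the channels that exist and leaves the rest as zeros, while down-mixing drops the extra channels:

```typescript
// Hypothetical representation: a block is one Float32Array of samples per channel.
type Block = Float32Array[];

// Up-mix or down-mix `input` to `targetChannels` using "discrete" rules.
function mixDiscrete(input: Block, targetChannels: number, blockSize: number): Block {
  const output: Block = [];
  for (let ch = 0; ch < targetChannels; ch++) {
    // new Float32Array(n) is zero-initialised, so channels that are missing
    // from the input stay silent without an explicit fill.
    const samples = new Float32Array(blockSize);
    if (ch < input.length) {
      samples.set(input[ch].subarray(0, blockSize));
    }
    output.push(samples);
  }
  return output;
}

// Up-mix: channel 0 is copied, channels 1 and 2 are zero-filled.
const mono: Block = [Float32Array.of(0.5, -0.5)];
const upMixed = mixDiscrete(mono, 3, 2);

// Down-mix: channel 0 is kept, channel 1 is dropped.
const stereo: Block = [Float32Array.of(1, 1), Float32Array.of(2, 2)];
const downMixed = mixDiscrete(stereo, 1, 2);
```

Because the target channel count can change from one block to the next, a helper like this would be called with a freshly computed count for every block rather than a cached one.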

If you'd like to test your implementation, I suggest taking a look at
<http://mxr.mozilla.org/mozilla-central/source/content/media/webaudio/test/test_mixingRules.html?force=1>.
I borrowed this test from Blink and modified it to test a number of cases
that the Blink test did not cover.  Gecko currently passes this test (Blink
doesn't last I checked), and I believe that the test is fairly exhaustive
in covering all of the edge cases.

If you'd like more help, please feel free to contact me offline.  I think
we're getting sort of off-topic here.  :-)

Cheers,
--
Ehsan
<http://ehsanakhgari.org/>



>
> 2013/7/17 s p <sebpiq@gmail.com>
>
>> > Can you suggest how we can reword the current prose to make this more
>> evident?
>>
>> Ok ... Let's try :)
>>
>> I tried to rewrite the whole paragraph, mostly shuffling around the info
>> and adding some bits here and there ... and - to me - it would be very
>> clear written like that :
>>
>>  An AudioNode input uses three basic pieces of information to determine
>> how to mix all the outputs connected to it. As part of this process, the
>> AudioNode computes an internal value computedNumberOfChannels representing
>> the actual number of channels of the input at any given time. The AudioNode
>> attributes involved in channel up-mixing and down-mixing rules are defined
>> above.
>>
>> For each input of an AudioNode, an implementation must:
>>
>>     (i) Compute computedNumberOfChannels.
>>     (ii) For each connection to the input:
>>         up-mix or down-mix the connection to computedNumberOfChannels.
>>         Mix it together with all of the other mixed streams (from other
>> connections). This is a straight-forward mixing together of each of the
>> corresponding channels from each connection.
>>
>>  (i) The algorithm to determine computedNumberOfChannels uses
>> channelCount and channelCountMode. It also needs the number of channels
>> of each output involved in the connection, so any implementation
>> should be able to determine those values. For example, some of the nodes'
>> outputs will have a pre-determined number of channels.
>>  channelCountMode determines how computedNumberOfChannels will be
>> computed. For most nodes, the default value of channelCountMode is "max".
>>
>>         “max”: computedNumberOfChannels is computed as the maximum of the
>> number of channels of the outputs involved in the connection. In this mode
>> channelCount is ignored.
>>         “clamped-max”: same as “max” up to a limit of the channelCount
>>         “explicit”: computedNumberOfChannels is the exact value as
>> specified in channelCount
>>
>> (ii) channelInterpretation determines how the individual channels will be
>> treated. For example, will they be treated as speakers having a specific
>> layout, or will they be treated as simple discrete channels? This value
>> influences exactly how the up and down mixing is performed. The default
>> value is "speakers".
>>
>>         “speakers”: use up-down-mix equations for mono/stereo/quad/5.1.
>> In cases where the number of channels does not match any of these basic
>> speaker layouts, revert to "discrete".
>>         “discrete”: up-mix by filling channels until they run out, then
>> zero out the remaining channels. Down-mix by filling as many channels as
>> possible, then dropping the remaining channels.
>>
>>
>>
>>
>> 2013/7/17 Ehsan Akhgari <ehsan.akhgari@gmail.com>
>>
>>>
>>> On Wed, Jul 17, 2013 at 11:55 AM, s p <sebpiq@gmail.com> wrote:
>>>
>>>> Yes ... this makes sense I think! But then the spec lacks info about
>>>> that.
>>>>
>>>
>>> Can you suggest how we can reword the current prose to make this more
>>> evident?
>>>
>>>
>>>> My current implementation simply does it lazily. The number of channels
>>>> is not computed until a block of audio is pulled. When that happens, the
>>>> node simply looks at the audio buffer from each output, and checks how many
>>>> channels it has, using this information to finally calculate the
>>>> computedNumberOfChannels.
>>>> So I guess this is equivalent to what you wrote (or at least it has the
>>>> same result).
>>>>
>>>
>>> Hmm out of curiosity, how can you do this lazily?  I don't think you can
>>> avoid processing nodes which are either directly or indirectly connected to
>>> the destination node, except for perhaps non-source nodes which do not have
>>> any inputs (I believe most if not all of them should just produce silence
>>> in that case...)
>>>
>>>
>>>> Thanks for the explanation :)
>>>>
>>>
>>> Happy to help!
>>>
>>> --
>>> Ehsan
>>> <http://ehsanakhgari.org/>
>>>
>>>
>>>
>>>> 2013/7/17 Ehsan Akhgari <ehsan.akhgari@gmail.com>
>>>>
>>>>> Let me try to clarify.  Some of the nodes in the graph have a
>>>>> pre-determined number of channels, such as AudioBufferSourceNode or
>>>>> PannerNode.  Some other types follow the mixing rules.  A pseudo-code
>>>>> algorithm like below will give you the number of channels that currentNode
>>>>> sees on its *input*:
>>>>>
>>>>> function GetInputChannelCount(array<AudioNode> inputNodes, AudioNode currentNode) {
>>>>>   if (currentNode is AudioBufferSourceNode or PannerNode etc) {
>>>>>     skip the computation; // because the number of input channels doesn't matter
>>>>>   } else {
>>>>>     if (currentNode.channelCountMode == "explicit") {
>>>>>       return currentNode.channelCount;
>>>>>     }
>>>>>     var numberOfChannels = 1;
>>>>>     for each (node in inputNodes) {
>>>>>       numberOfChannels = max(numberOfChannels, node.channelsProducedByNode);
>>>>>     }
>>>>>     if (currentNode.channelCountMode == "clamped-max") {
>>>>>       numberOfChannels = min(numberOfChannels, currentNode.channelCount);
>>>>>     }
>>>>>     return numberOfChannels;
>>>>>   }
>>>>> }
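
A runnable rendering of the pseudo-code above might look like the following sketch. The `NodeLike` shape and the `outputChannelCounts` parameter are assumptions made for illustration and are not part of the Web Audio API. Nodes with a pre-determined channel count are left out, since the computation is skipped for them.

```typescript
type ChannelCountMode = "max" | "clamped-max" | "explicit";

// Hypothetical minimal view of a node: only the two attributes the algorithm reads.
interface NodeLike {
  channelCount: number;
  channelCountMode: ChannelCountMode;
}

// outputChannelCounts: number of channels produced by each connected output.
function getInputChannelCount(outputChannelCounts: number[], currentNode: NodeLike): number {
  if (currentNode.channelCountMode === "explicit") {
    return currentNode.channelCount;
  }
  // Same starting value as the pseudo-code: with no connections the result is 1.
  let numberOfChannels = 1;
  for (const count of outputChannelCounts) {
    numberOfChannels = Math.max(numberOfChannels, count);
  }
  if (currentNode.channelCountMode === "clamped-max") {
    numberOfChannels = Math.min(numberOfChannels, currentNode.channelCount);
  }
  return numberOfChannels;
}

// Three connected outputs producing 1, 6 and 2 channels, evaluated in each mode.
const asMax = getInputChannelCount([1, 6, 2], { channelCount: 2, channelCountMode: "max" });
const asClamped = getInputChannelCount([1, 6, 2], { channelCount: 2, channelCountMode: "clamped-max" });
const asExplicit = getInputChannelCount([1, 6, 2], { channelCount: 4, channelCountMode: "explicit" });
```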
>>>>>
>>>>> Once you have the result of GetInputChannelCount, you look at the
>>>>> buffer produced by each input node, up/down-mix it to the correct channel
>>>>> count according to the mixing rules, and then mix all of the input buffers
>>>>> together, and pass that as the input buffer to currentNode.  The same thing
>>>>> happens recursively starting from the source nodes in your graph until you
>>>>> get to the destination node.
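
The final mixing step described here is a per-channel sum. A minimal sketch, assuming every connection has already been up-mixed or down-mixed to the same channel count and block size (the `sumBlocks` name and the one-Float32Array-per-channel representation are illustrative assumptions):

```typescript
// Hypothetical representation: one Float32Array of samples per channel.
type ChannelBlock = Float32Array[];

// Sum corresponding channels of all blocks into a single input block.
function sumBlocks(blocks: ChannelBlock[], channels: number, blockSize: number): ChannelBlock {
  const mixed: ChannelBlock = [];
  for (let ch = 0; ch < channels; ch++) {
    const out = new Float32Array(blockSize); // starts as silence
    for (const block of blocks) {
      const src = block[ch];
      for (let i = 0; i < blockSize; i++) {
        out[i] += src[i];
      }
    }
    mixed.push(out);
  }
  return mixed;
}

// Two stereo connections, two samples per block.
const first: ChannelBlock = [Float32Array.of(0.25, 0.5), Float32Array.of(0, 1)];
const second: ChannelBlock = [Float32Array.of(0.25, -0.5), Float32Array.of(1, 1)];
const mixedInput = sumBlocks([first, second], 2, 2);
```

With no connections the result is a silent block of the requested size, which then becomes the input buffer of the current node.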
>>>>>
>>>>> Does this make sense?
>>>>>
>>>>> --
>>>>> Ehsan
>>>>> <http://ehsanakhgari.org/>
>>>>>
>>>>>
>>>>> On Tue, Jul 16, 2013 at 6:46 PM, s p <sebpiq@gmail.com> wrote:
>>>>>
>>>>>> This still isn't clear - at least for me.
>>>>>>
>>>>>> """
>>>>>>
>>>>>> An AudioNode input use three basic pieces of information to determine
>>>>>> how to mix all the outputs connected to it. As part of this process it
>>>>>> computes an internal value computedNumberOfChannels representing the
>>>>>> actual number of channels of the input at any given time:
>>>>>>
>>>>>> """
>>>>>>
>>>>>> The above explicitly says that the algorithm only applies to inputs
>>>>>> and not to outputs.
>>>>>>
>>>>>> """
>>>>>>
>>>>>> “max”: computedNumberOfChannels is computed as the maximum of the
>>>>>> number of channels of all connections. In this mode channelCount is
>>>>>> ignored.
>>>>>>
>>>>>> """
>>>>>>
>>>>>> The above says that if "channelCountMode" is "max" you need to take
>>>>>> the max number of channels of all connections (i.e., the max number of
>>>>>> channels of outputs), which implies that you should be able to compute
>>>>>> outputs' channel count somehow. But how to do that isn't specified anywhere.
>>>>>>
>>>>>>
>>>>>> 2013/7/17 Ehsan Akhgari <ehsan.akhgari@gmail.com>
>>>>>>
>>>>>>> The number of channels for each node's output is determined by this
>>>>>>> algorithm.  There are some nodes which force this value to a certain value
>>>>>>> (for example, PannerNode) but most node types follow this algorithm.
>>>>>>>
>>>>>>> Note that the channelCount for a given node cannot be trusted unless
>>>>>>> channelCountMode is "explicit".
>>>>>>>
>>>>>>> --
>>>>>>> Ehsan
>>>>>>> <http://ehsanakhgari.org/>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Jul 15, 2013 at 1:44 PM, s p <sebpiq@gmail.com> wrote:
>>>>>>>
>>>>>>>> Or is it that the max number of channels from outputs is
>>>>>>>> inferred from the number of channels in the buffers received from upstream?
>>>>>>>>
>>>>>>>>
>>>>>>>> 2013/7/14 s p <sebpiq@gmail.com>
>>>>>>>>
>>>>>>>>> Hi!
>>>>>>>>>
>>>>>>>>> Reading the chapter "9 - channel up-mixing / down-mixing", from
>>>>>>>>> what I understand the number of channels should be computed for each input.
>>>>>>>>> However it says :
>>>>>>>>>
>>>>>>>>> > “max”: computedNumberOfChannels is computed as *[the maximum of
>>>>>>>>> the number of channels of all connections]*.
>>>>>>>>>
>>>>>>>>> So the way I understand it, you actually need the number of
>>>>>>>>> channels of each output for computing "computedNumberOfChannels".
>>>>>>>>> But I couldn't quite figure out how to compute the number of channels of an
>>>>>>>>> output. Is it the node's raw channelCount?
>>>>>>>>>
>>>>>>>>> Or did I get it all wrong?
>>>>>>>>>
>>>>>>>>> Sebastien Piquemal
>>>>>>>>>
>>>>>>>>> PS : for info, I am in the process of implementing the Web Audio
>>>>>>>>> API spec for Node.js (https://github.com/sebpiq/node-web-audio-api).
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
Received on Wednesday, 17 July 2013 20:06:04 UTC