Re: Channel up-mixing / down-mixing, how to compute num channels of connection? from s p on 2013-07-17 (public-audio@w3.org from July to September 2013)

From: s p <sebpiq@gmail.com>
Date: Thu, 18 Jul 2013 01:24:07 +0400
To: Ehsan Akhgari <ehsan.akhgari@gmail.com>
Cc: "public-audio@w3.org" <public-audio@w3.org>
Message-ID: <CAGKuoCUo8Yk1KcJh1EoAkiotCX7koLi7=ji4RyMkzdXT-m_Bpw@mail.gmail.com>
Quick answer, to not be too much off topic :) and then I'll definitely
contact you offline if I have questions.

> Note that the number of input/output channels for a given AudioNode might
> change

The computedNumberOfChannels is only recalculated each time the input gains
or loses a connection:
https://github.com/sebpiq/node-web-audio-api/blob/master/lib/audioports.js#L57
This means that if the number of channels of a node's output varies over
time, this won't work (but why would it vary?).
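
To make the trade-off concrete, here is a minimal sketch of that caching
strategy. It is not the code from audioports.js: the class name
AudioInputSketch and the numberOfChannels property on outputs are
hypothetical, chosen only for illustration. The value is recomputed on
connect/disconnect and nowhere else.

```javascript
// Hypothetical sketch, NOT the actual audioports.js implementation.
// `AudioInputSketch` and `output.numberOfChannels` are assumed names.
class AudioInputSketch {
  constructor(channelCount, channelCountMode) {
    this.channelCount = channelCount;
    this.channelCountMode = channelCountMode; // 'max' | 'clamped-max' | 'explicit'
    this.sources = [];
    this.computedNumberOfChannels = null;
    this._recompute();
  }

  connect(output) {
    this.sources.push(output);
    this._recompute();
  }

  disconnect(output) {
    const index = this.sources.indexOf(output);
    if (index !== -1) {
      this.sources.splice(index, 1);
      this._recompute();
    }
  }

  // Only called when the set of connections changes.
  _recompute() {
    if (this.channelCountMode === 'explicit') {
      this.computedNumberOfChannels = this.channelCount;
      return;
    }
    let max = 1;
    for (const output of this.sources) {
      max = Math.max(max, output.numberOfChannels);
    }
    if (this.channelCountMode === 'clamped-max') {
      max = Math.min(max, this.channelCount);
    }
    this.computedNumberOfChannels = max;
  }
}

// The cached value goes stale if an output changes without reconnecting.
const demoInput = new AudioInputSketch(2, 'max');
const demoOutput = { numberOfChannels: 1 };
demoInput.connect(demoOutput);
demoOutput.numberOfChannels = 6;
console.log(demoInput.computedNumberOfChannels); // still the value cached at connect time
```

With this design, an output whose channel count changes between blocks is
only noticed at the next connect or disconnect. Clearing the cached value on
every block would avoid that, at the cost of recomputing each time.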

> When up-mixing in discrete mode, you should zero-fill the channels that
> do not appear on the input(s)

This is done automatically when creating a Float32Array with the right
size: a newly created Float32Array is filled with zeros.
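
A small sketch of what that looks like for a discrete up-mix or down-mix.
The helper name mixDiscrete and the block layout (an array of Float32Array,
one per channel) are assumptions made for this example, not names from the
library.

```javascript
// Hypothetical helper: discrete up-/down-mix of one block of audio.
// `channels` is assumed to be an array of Float32Array, one per channel.
function mixDiscrete(channels, targetCount, blockSize) {
  const out = [];
  for (let ch = 0; ch < targetCount; ch++) {
    // A freshly allocated Float32Array is zero-initialised, so channels
    // that do not exist on the input stay silent without extra work.
    const data = new Float32Array(blockSize);
    if (ch < channels.length) {
      data.set(channels[ch].subarray(0, blockSize));
    }
    out.push(data);
  }
  // Input channels beyond targetCount are simply dropped (down-mix).
  return out;
}

const mono = [Float32Array.of(0.5, 0.25)];
const upMixed = mixDiscrete(mono, 3, 2);
console.log(upMixed.length, upMixed[1][0], upMixed[2][1]); // 3 0 0
```

The zero-fill only comes for free with newly allocated arrays. If buffers are
reused from one block to the next, they would need an explicit fill(0) first.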

> Also, on line 113 I think you meant to check for "speaker"

Yes ... that's right :)

Thanks for the code review! I'll definitely check out the test suite.


2013/7/18 Ehsan Akhgari <ehsan.akhgari@gmail.com>

>
> On Wed, Jul 17, 2013 at 3:05 PM, s p <sebpiq@gmail.com> wrote:
>
>> > Hmm out of curiosity, how can you do this lazily?
>>
>> About this, you can check out my code (keep in mind this is a
>> work-in-progress, not released yet):
>> https://github.com/sebpiq/node-web-audio-api/blob/master/lib/audioports.js#L84
>>
>
> That's some serious amount of JS.  :-)  Hope I'm not misunderstanding this.
>
>
>> Basically, the way I am thinking of doing it right now is to pull the audio
>> from bottom to top of the graph:
>>
>> ... -> sink -> sink's inputs -> source's outputs -> source -> ...
>>
>> So - only if 'computedNumberOfChannels' hasn't been computed yet - each
>> input when it receives the blocks of audio pulled from the outputs
>> upstream, inspects them to see how many channels they have, and then
>> computes 'computedNumberOfChannels'. This is quite straightforward in fact.
>> But if you see a problem with this approach please tell me! I haven't yet
>> dived into implementing AudioNodes.
>>
>
> Hmm, do you perform this calculation on each block?  Note that the number
> of input/output channels for a given AudioNode might change as you're
> running the graph from one block to the next one, so you should make sure
> to clear 'computedNumberOfChannels' on each iteration.  Also, looking at
> your code, two problems caught my eye.  When up-mixing in discrete mode,
> you should zero-fill the channels that do not appear on the input(s) (JS
> might do this for you automatically -- just something to check for.)  Also,
> on line 113 I think you meant to check for "speaker".
>
> If you'd like to test your implementation, I suggest taking a look at
> <http://mxr.mozilla.org/mozilla-central/source/content/media/webaudio/test/test_mixingRules.html?force=1>.
> I borrowed this test from Blink and modified it to test a number of cases
> that the Blink test did not cover.  Gecko currently passes this test (Blink
> doesn't last I checked), and I believe that the test is fairly exhaustive
> in covering all of the edge cases.
>
> If you'd like more help, please feel free to contact me offline.  I think
> we're getting sort of off-topic here.  :-)
>
> Cheers,
> --
> Ehsan
> <http://ehsanakhgari.org/>
>
>
>
>>
>> 2013/7/17 s p <sebpiq@gmail.com>
>>
>>> > Can you suggest how we can reword the current prose to make this more
>>> evident?
>>>
>>> Ok ... Let's try :)
>>>
>>> I tried to rewrite the whole paragraph, mostly shuffling around the info
>>> and adding some bits here and there ... and - to me - it would be much
>>> clearer written like this:
>>>
>>>  An AudioNode input uses three basic pieces of information to determine
>>> how to mix all the outputs connected to it. As part of this process, the
>>> AudioNode computes an internal value computedNumberOfChannels representing
>>> the actual number of channels of the input at any given time. The AudioNode
>>> attributes involved in channel up-mixing and down-mixing rules are defined
>>> above.
>>>
>>> For each input of an AudioNode, an implementation must:
>>>
>>>     (i) Compute computedNumberOfChannels.
>>>     (ii) For each connection to the input:
>>>         up-mix or down-mix the connection to computedNumberOfChannels.
>>>         Mix it together with all of the other mixed streams (from other
>>> connections). This is a straight-forward mixing together of each of the
>>> corresponding channels from each connection.
>>>
>>>  (i) The algorithm to determine computedNumberOfChannels uses
>>> channelCount and channelCountMode. It also requires knowing the number of
>>> channels of each output involved in the connection, so any implementation
>>> should be able to determine those values. For example, some of the nodes'
>>> outputs will have a pre-determined number of channels.
>>>  channelCountMode determines how computedNumberOfChannels will be
>>> computed. For most nodes, the default value of channelCountMode is "max".
>>>
>>>         “max”: computedNumberOfChannels is computed as the maximum of
>>> the number of channels of the outputs involved in the connection. In this
>>> mode channelCount is ignored.
>>>         “clamped-max”: same as “max” up to a limit of the channelCount
>>>         “explicit”: computedNumberOfChannels is the exact value as
>>> specified in channelCount
>>>
>>> (ii) channelInterpretation determines how the individual channels will
>>> be treated. For example, will they be treated as speakers having a specific
>>> layout, or will they be treated as simple discrete channels? This value
>>> influences exactly how the up and down mixing is performed. The default
>>> value is "speakers".
>>>
>>>         “speakers”: use up-down-mix equations for mono/stereo/quad/5.1.
>>> In cases where the number of channels does not match any of these basic
>>> speaker layouts, revert to "discrete".
>>>         “discrete”: up-mix by filling channels until they run out, then
>>> zero out the remaining channels. Down-mix by filling as many channels as
>>> possible, then dropping the remaining channels.
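
Step (ii) above could be sketched roughly as follows. This is only an
illustration under simplifying assumptions: the function names are made up,
a block is assumed to be an array of Float32Array (one per channel), and only
the mono/stereo "speakers" cases are handled. Quad and 5.1 are left out and
fall back to discrete handling here, which a real implementation would not do.

```javascript
// Hypothetical sketch of step (ii); names and block layout are assumptions.
// Simplification: only mono <-> stereo is handled in "speakers" mode.
function mixToChannelCount(block, count, interpretation) {
  const size = block[0].length;
  if (interpretation === 'speakers' && block.length === 1 && count === 2) {
    // mono -> stereo: the mono signal goes to both left and right
    return [block[0].slice(), block[0].slice()];
  }
  if (interpretation === 'speakers' && block.length === 2 && count === 1) {
    // stereo -> mono: average of left and right
    return [block[0].map((left, i) => 0.5 * (left + block[1][i]))];
  }
  // discrete: copy what exists, zero-fill the rest, drop the surplus
  const out = [];
  for (let ch = 0; ch < count; ch++) {
    out.push(ch < block.length ? block[ch].slice() : new Float32Array(size));
  }
  return out;
}

// Mix every connection to `count` channels, then sum channel by channel.
// Assumes at least one connection, each with at least one channel.
function sumConnections(blocks, count, interpretation) {
  const size = blocks[0][0].length;
  const out = [];
  for (let ch = 0; ch < count; ch++) out.push(new Float32Array(size));
  for (const block of blocks) {
    const mixed = mixToChannelCount(block, count, interpretation);
    for (let ch = 0; ch < count; ch++) {
      for (let i = 0; i < size; i++) out[ch][i] += mixed[ch][i];
    }
  }
  return out;
}
```

For instance, a mono connection and a stereo connection feeding an input
whose computedNumberOfChannels is 2 would be combined with
sumConnections([monoBlock, stereoBlock], 2, 'speakers').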
>>>
>>>
>>>
>>>
>>> 2013/7/17 Ehsan Akhgari <ehsan.akhgari@gmail.com>
>>>
>>>>
>>>> On Wed, Jul 17, 2013 at 11:55 AM, s p <sebpiq@gmail.com> wrote:
>>>>
>>>>> Yes ... this makes sense I think! But then the spec lacks info about
>>>>> that.
>>>>>
>>>>
>>>> Can you suggest how we can reword the current prose to make this more
>>>> evident?
>>>>
>>>>
>>>>> My current implementation simply does it lazily. The number of
>>>>> channels is not computed until a block of audio is pulled. When that
>>>>> happens, the node simply looks at the audio buffer from each output, and
>>>>> checks how many channels it has, using this information to finally
>>>>> calculate the computedNumberOfChannels.
>>>>> So I guess this is equivalent to what you wrote (or at least it has
>>>>> the same result).
>>>>>
>>>>
>>>> Hmm out of curiosity, how can you do this lazily?  I don't think you
>>>> can avoid processing nodes which are either directly or indirectly
>>>> connected to the destination node, except for perhaps non-source nodes
>>>> which do not have any inputs (I believe most if not all of them should just
>>>> produce silence in that case...)
>>>>
>>>>
>>>>> Thanks for the explanation :)
>>>>>
>>>>
>>>> Happy to help!
>>>>
>>>> --
>>>> Ehsan
>>>> <http://ehsanakhgari.org/>
>>>>
>>>>
>>>>
>>>>> 2013/7/17 Ehsan Akhgari <ehsan.akhgari@gmail.com>
>>>>>
>>>>>> Let me try to clarify.  Some of the nodes in the graph have a
>>>>>> pre-determined number of channels, such as AudioBufferSourceNode or
>>>>>> PannerNode.  Some other types follow the mixing rules.  A pseudo-code
>>>>>> algorithm like below will give you the number of channels that currentNode
>>>>>> sees on its *input*:
>>>>>>
>>>>>> function GetInputChannelCount(inputNodes, currentNode) {
>>>>>>   // Nodes such as AudioBufferSourceNode or PannerNode (etc.) have a
>>>>>>   // pre-determined channel count, so the number of input channels
>>>>>>   // doesn't matter.
>>>>>>   if (currentNode instanceof AudioBufferSourceNode ||
>>>>>>       currentNode instanceof PannerNode) {
>>>>>>     return undefined; // skip the computation
>>>>>>   }
>>>>>>   if (currentNode.channelCountMode == "explicit") {
>>>>>>     return currentNode.channelCount;
>>>>>>   }
>>>>>>   var numberOfChannels = 1;
>>>>>>   inputNodes.forEach(function (node) {
>>>>>>     // channelsProducedByNode: channel count of that node's output
>>>>>>     numberOfChannels = Math.max(numberOfChannels,
>>>>>>                                 node.channelsProducedByNode);
>>>>>>   });
>>>>>>   if (currentNode.channelCountMode == "clamped-max") {
>>>>>>     numberOfChannels = Math.min(numberOfChannels,
>>>>>>                                 currentNode.channelCount);
>>>>>>   }
>>>>>>   return numberOfChannels;
>>>>>> }
>>>>>>
>>>>>> Once you have the result of GetInputChannelCount, you look at the
>>>>>> buffer produced by each input node, up/down-mix it to the correct channel
>>>>>> count according to the mixing rules, and then mix all of the input buffers
>>>>>> together, and pass that as the input buffer to currentNode.  The same thing
>>>>>> happens recursively starting from the source nodes in your graph until you
>>>>>> get to the destination node.
>>>>>>
>>>>>> Does this make sense?
>>>>>>
>>>>>> --
>>>>>> Ehsan
>>>>>> <http://ehsanakhgari.org/>
>>>>>>
>>>>>>
>>>>>> On Tue, Jul 16, 2013 at 6:46 PM, s p <sebpiq@gmail.com> wrote:
>>>>>>
>>>>>>> This still isn't clear - at least for me.
>>>>>>>
>>>>>>> """
>>>>>>>
>>>>>>> An AudioNode input uses three basic pieces of information to
>>>>>>> determine how to mix all the outputs connected to it. As part of this
>>>>>>> process it computes an internal value computedNumberOfChannels
>>>>>>> representing the actual number of channels of the input at any given time:
>>>>>>>
>>>>>>> """
>>>>>>>
>>>>>>> The above explicitly says that the algorithm only applies to inputs
>>>>>>> and not to outputs.
>>>>>>>
>>>>>>> """
>>>>>>>
>>>>>>> “max”: computedNumberOfChannels is computed as the maximum of the
>>>>>>> number of channels of all connections. In this mode channelCount is
>>>>>>> ignored.
>>>>>>>
>>>>>>> """
>>>>>>>
>>>>>>> The above says that if "channelCountMode" is "max" you need to take
>>>>>>> the max number of channels of all connections (i.e., the max number of
>>>>>>> channels of outputs)  which implies that you should be able to compute
>>>>>>> outputs' channel count somehow. But how to do that isn't specified anywhere.
>>>>>>>
>>>>>>>
>>>>>>> 2013/7/17 Ehsan Akhgari <ehsan.akhgari@gmail.com>
>>>>>>>
>>>>>>>> The number of channels for each node's output are determined by
>>>>>>>> this algorithm.  There are some nodes which force this value to a certain
>>>>>>>> value (for example, PannerNode) but most node types follow this algorithm.
>>>>>>>>
>>>>>>>> Note that the channelCount for a given node cannot be trusted
>>>>>>>> unless channelCountMode is "explicit".
>>>>>>>>
>>>>>>>> --
>>>>>>>> Ehsan
>>>>>>>> <http://ehsanakhgari.org/>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Jul 15, 2013 at 1:44 PM, s p <sebpiq@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Or is it so that the max number of channels from outputs is
>>>>>>>>> inferred by the number of channels in the buffers received from upstream?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2013/7/14 s p <sebpiq@gmail.com>
>>>>>>>>>
>>>>>>>>>> Hi!
>>>>>>>>>>
>>>>>>>>>> Reading the chapter "9 - channel up-mixing / down-mixing", from
>>>>>>>>>> what I understand the number of channels should be computed for each input.
>>>>>>>>>> However it says :
>>>>>>>>>>
>>>>>>>>>> > “max”: computedNumberOfChannels is computed as *[the maximum
>>>>>>>>>> of the number of channels of all connections]*.
>>>>>>>>>>
>>>>>>>>>> So the way I understand it, you actually need the number of
>>>>>>>>>> channels of each output to compute "computedNumberOfChannels".
>>>>>>>>>> But I couldn't quite figure out how to compute the number of channels of an
>>>>>>>>>> output. Is it the node's raw channelCount?
>>>>>>>>>>
>>>>>>>>>> Or did I get it all wrong?
>>>>>>>>>>
>>>>>>>>>> Sebastien Piquemal
>>>>>>>>>>
>>>>>>>>>> PS : for info, I am in the process of implementing the Web Audio
>>>>>>>>>> API spec for Node.js (
>>>>>>>>>> https://github.com/sebpiq/node-web-audio-api).
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
Received on Wednesday, 17 July 2013 21:24:35 UTC