Re: Reconciling ConvolverNode's output channel dependencies with the mixing rules in the spec

On Thu, May 16, 2013 at 1:44 PM, Ehsan Akhgari <ehsan.akhgari@gmail.com> wrote:

> Currently in the spec most nodes have a notion of input channel count,
> which means that the processing code can make assumptions about the number
> of input channels.  This is not the case for the number of output
> channels, though.
>
> According to <
> https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/specification.html#UpMix>,
> the number of channels that an AudioNode produces on its output doesn't
> have anything to do with the number of channels that the nodes connected to
> it will see in their output, as the up/down-mixing happens at the input to
> each node.  In other words, the implementation of a node is free to
> choose the number of output channels that it wants without needing to worry
> about what other nodes expect.  (Of course, assuming that the chosen number
> of output channels makes sense, but let's grant that assumption for now.)
>
> ConvolverNode, however, deviates from this.  In <
> https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/specification.html#Convolution-reverb-effect>,
> the spec denotes a matrix of different types of processing that needs to
> happen based on the number of input channels, number of channels in the
> impulse response buffer, and the number of output channels.  I find that
> incompatible with the mixing rules for AudioNodes in the spec.  An
> AudioNode cannot make assumptions about the number of output channels in
> any meaningful way.  For example, we can connect a ConvolverNode to two
> AudioNodes, one with channelCount=2 and channelCountMode="explicit" and one
> with channelCount=1 and channelCountMode="explicit".  In this case, it's
> not clear what number should be used as the number of output channels for
> ConvolverNode.
>
> I think we instead need to specify the number of output channels as a
> function of the number of input channels and the number of channels in
> the impulse response buffer.  Here's my proposal:
>
> Given K, the number of channels in the impulse response buffer, the
> number of output channels M is defined as follows:
>
> M = K if K = 1 or K = 2
> M = 2 if K = 4
> M = 0 otherwise
>
> This formula is compatible with all of the existing modes in <
> https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/specification.html#Convolution-reverb-effect>,
> perhaps with the exception of true stereo, where the current spec does not
> clearly specify the down-mixing rules at the output.
>
> Does this change make sense?
>

Actually, the way I'd describe it (and the way WebKit/Blink implements it)
is that the output is hard-coded to 2 channels (stereo), in much the same
way that PannerNode's output is.  We should add the text:

"The output of this node is hard-coded to stereo (2 channels) and currently
cannot be configured."

That means that the "Mono" case in the diagram currently never happens, and
that "Mono to copied Stereo" is used when N=1, K=1.
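
To make the difference concrete, here is a rough sketch (not taken from any
implementation) comparing the output channel count M under the quoted
proposal with the hard-coded-stereo description.  The function and constant
names are hypothetical and chosen only for illustration:

```python
def proposed_output_channels(k: int) -> int:
    """M as a function of K, following the formula in the quoted proposal."""
    if k in (1, 2):
        return k
    if k == 4:
        return 2
    return 0


# Under the hard-coded-stereo description, M does not depend on K at all.
HARDCODED_OUTPUT_CHANNELS = 2

for k in (1, 2, 4):
    print(f"K={k}: proposal M={proposed_output_channels(k)}, "
          f"hard-coded M={HARDCODED_OUTPUT_CHANNELS}")
```

For the impulse response channel counts listed in the proposal (K = 1, 2, 4),
the two descriptions differ only at K=1, which is exactly the "Mono" case.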

There's also a missing case we should probably support (which WebKit/Blink
does not): N=2, K=1, M=2, i.e. processing a stereo input with a mono
impulse response to generate stereo output.
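
As a minimal sketch of what that missing case could look like, assuming the
mono impulse response is simply applied to each input channel independently
(that interpretation is an assumption here, not something the spec states),
with hypothetical helper names and a naive direct-form convolution:

```python
def convolve(signal, impulse):
    """Direct-form linear convolution of two sample sequences."""
    if not signal or not impulse:
        return []
    out = [0.0] * (len(signal) + len(impulse) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse):
            out[i + j] += s * h
    return out


def stereo_with_mono_impulse(left, right, impulse):
    """N=2, K=1, M=2: assumed to mean the same mono impulse response is
    convolved with the left and right input channels separately."""
    return convolve(left, impulse), convolve(right, impulse)


out_left, out_right = stereo_with_mono_impulse([1.0, 2.0], [0.0, 1.0], [1.0, 1.0])
print(out_left, out_right)
```

A real implementation would presumably use block-based FFT convolution
instead of the quadratic loop above; the sketch only shows the channel
routing.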



>
> Thanks!
> --
> Ehsan
> <http://ehsanakhgari.org/>
>

Received on Thursday, 16 May 2013 21:38:13 UTC