Re: Reconciling ConvolverNode's output channel dependencies with the mixing rules in the spec

On Thu, May 16, 2013 at 5:37 PM, Chris Rogers <crogers@google.com> wrote:

>
> On Thu, May 16, 2013 at 1:44 PM, Ehsan Akhgari <ehsan.akhgari@gmail.com> wrote:
>
>> Currently in the spec most nodes have a notion of input channel count,
>> which means that the processing code can make assumptions about the number
>> of input channels.  This is not the case for the number of output
>> channels, though.
>>
>> According to <
>> https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/specification.html#UpMix>,
>> the number of channels that an AudioNode produces on its output doesn't
>> have anything to do with the number of channels that the nodes connected to
>> it will see at their input, as the up/down-mixing happens at the input to
>> each node.  In other words, the implementation of a node is free to
>> choose the number of output channels that it wants without needing to worry
>> about what other nodes expect.  (Of course, this assumes that the chosen
>> number of output channels makes sense, but let's grant that for now.)
>>
>> ConvolverNode, however, deviates from this.  In <
>> https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/specification.html#Convolution-reverb-effect>,
>> the spec describes a matrix of different types of processing that needs to
>> happen based on the number of input channels, number of channels in the
>> impulse response buffer, and the number of output channels.  I find that
>> incompatible with the mixing rules for AudioNodes in the spec.  An
>> AudioNode cannot make assumptions about the number of output channels in
>> any meaningful way.  For example, we can connect a ConvolverNode to two
>> AudioNodes, one with channelCount=2 and channelCountMode="explicit" and one
>> with channelCount=1 and channelCountMode="explicit".  In this case, it's
>> not clear what number should be used as the number of output channels for
>> ConvolverNode.
>>
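
To make the fan-out example above concrete, here is a minimal TypeScript
sketch of how the channel count seen at a node's input follows from
channelCount and channelCountMode alone.  The helper name
computedNumberOfChannels and the ChannelCountMode type are made up for
illustration and are not spec API; the fallback to one channel when nothing
is connected is also an assumption of this sketch.

```typescript
type ChannelCountMode = "max" | "clamped-max" | "explicit";

// Hypothetical helper: the number of channels a node's input is mixed to,
// given the channel counts of all outputs connected to that input.
function computedNumberOfChannels(
  mode: ChannelCountMode,
  channelCount: number,
  connectedOutputChannels: number[],
): number {
  // Widest incoming connection; assumed to be 1 (mono) with no connections.
  const widest = Math.max(1, ...connectedOutputChannels);
  if (mode === "max") {
    return widest;
  }
  if (mode === "clamped-max") {
    return Math.min(widest, channelCount);
  }
  // "explicit": the upstream channel counts are ignored entirely.
  return channelCount;
}

// Whatever the convolver emits, the two downstream nodes from the example
// (explicit/2 and explicit/1) always see 2 and 1 channels respectively.
for (const convolverOutputChannels of [1, 2]) {
  console.log(
    convolverOutputChannels,
    computedNumberOfChannels("explicit", 2, [convolverOutputChannels]),
    computedNumberOfChannels("explicit", 1, [convolverOutputChannels]),
  );
}
```

Since neither downstream value depends on the upstream count, there is no
single "number of output channels" the ConvolverNode could read back from
its connections.
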
>> I think instead we need to specify the number of output channels as a
>> function of the number of input channels and the number of channels in the
>> impulse response buffer.  Here's my proposal:
>>
>> Given K, the number of channels in the impulse response buffer, the number
>> of output channels M is defined as follows:
>>
>> M = K if K = 1 or K = 2
>> M = 2 if K = 4
>> M = 0 otherwise
>>
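
The rule above can be written as a small function.  This is only a sketch
of the proposal as quoted; the name proposedOutputChannels is made up for
illustration.

```typescript
// Sketch of the proposed rule: the output channel count M depends only on
// K, the number of channels in the impulse response buffer.
function proposedOutputChannels(impulseResponseChannels: number): number {
  if (impulseResponseChannels === 1 || impulseResponseChannels === 2) {
    return impulseResponseChannels; // M = K
  }
  if (impulseResponseChannels === 4) {
    return 2; // M = 2 for a four-channel impulse response
  }
  return 0; // M = 0 for any other K, as in the proposal
}

for (const k of [1, 2, 3, 4]) {
  console.log(`K=${k} -> M=${proposedOutputChannels(k)}`);
}
```
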
>> This formula is compatible with all of the existing modes in <
>> https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/specification.html#Convolution-reverb-effect>,
>> perhaps with the exception of true stereo, where the current spec does not
>> clearly specify the down-mixing rules at the output.
>>
>> Does this change make sense?
>>
>
> Actually, the way I'd describe it (and the way WebKit/Blink implements it)
> is that the output is hard-coded to 2-channels (stereo) very much in the
> same way that PannerNode is.  We should add the text:
>
> "The output of this node is hard-coded to stereo (2 channels) and
> currently cannot be configured"
>
> That means that the "Mono" case in the diagram currently never happens and
> that "Mono to copied Stereo" is used when N=1, K=1.
>

That sounds good to me.  (The Mono case should be removed from the diagram as
well.)


> There's also a missing case we should probably support (and WebKit/Blink
> does not) which is N=2, K=1, M=2, which means processing a stereo input
> with a mono impulse response, generating stereo output.
>

Can you please spec that as well?  I'd like to implement that case by
up-mixing the mono impulse response buffer to stereo.
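
As a rough sketch of what I have in mind, assuming the impulse response is
available as one Float32Array per channel (the helper name
upmixMonoImpulseResponse is made up for illustration): the mono channel is
copied into both left and right, so the existing stereo path can be reused
and each input channel is convolved with the same response.

```typescript
// Hypothetical helper: up-mix a mono impulse response to stereo by copying
// the single channel into both left and right.  Any other channel layout
// is returned unchanged.
function upmixMonoImpulseResponse(channels: Float32Array[]): Float32Array[] {
  if (channels.length !== 1) {
    return channels;
  }
  const mono = channels[0];
  // new Float32Array(typedArray) copies the samples, so the two output
  // channels share no storage with each other or with the input.
  return [new Float32Array(mono), new Float32Array(mono)];
}

const exampleStereo = upmixMonoImpulseResponse([new Float32Array([1, 0.5, 0.25])]);
console.log(exampleStereo.length, exampleStereo[0].length);
```
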

Thanks!
--
Ehsan
<http://ehsanakhgari.org/>

Received on Thursday, 16 May 2013 23:06:52 UTC