Re: Reconciling ConvolverNode's output channel dependencies with the mixing rules in the spec from Chris Rogers on 2013-07-03 (public-audio@w3.org from July to September 2013)

From: Chris Rogers <crogers@google.com>
Date: Wed, 3 Jul 2013 13:34:01 -0700
To: Ehsan Akhgari <ehsan.akhgari@gmail.com>
Cc: "public-audio@w3.org" <public-audio@w3.org>
Message-ID: <CA+EzO0m6sFr3uP0fW6kbgcN6uCw7PhDACbyNc-W23p8=Wm33_A@mail.gmail.com>

On Wed, Jul 3, 2013 at 1:05 PM, Ehsan Akhgari <ehsan.akhgari@gmail.com> wrote:

> I don't believe that this was ever spec'ed.  Chris, do you mind editing
> the spec with the prose discussed in this thread?
>
> Thanks!
>

Sure, sorry I missed that one.


>
> --
> Ehsan
> <http://ehsanakhgari.org/>
>
>
> On Thu, May 16, 2013 at 7:05 PM, Ehsan Akhgari <ehsan.akhgari@gmail.com> wrote:
>
>> On Thu, May 16, 2013 at 5:37 PM, Chris Rogers <crogers@google.com> wrote:
>>
>>>
>>>
>>>
>>> On Thu, May 16, 2013 at 1:44 PM, Ehsan Akhgari <ehsan.akhgari@gmail.com> wrote:
>>>
>>>> Currently in the spec most nodes have a notion of input channel count,
>>>> which means that the processing code can make assumptions about the number
>>>> of input channels.  This is not the case for the number of output
>>>> channels, though.
>>>>
>>>> According to <
>>>> https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/specification.html#UpMix>,
>>>> the number of channels that an AudioNode produces on its output doesn't
>>>> have anything to do with the number of channels that the nodes connected to
>>>> it will see in their output, as the up/down-mixing happens at the input to
>>>> each node.  In other words, the implementation of a node is free to
>>>> choose the number of output channels that it wants without needing to worry
>>>> about what other nodes expect.  (Of course, assuming that the chosen number
>>>> of output channels makes sense, but let's grant that assumption for now.)
>>>>
>>>> ConvolverNode, however, deviates from this.  In <
>>>> https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/specification.html#Convolution-reverb-effect>,
>>>> the spec denotes a matrix of different types of processing that needs to
>>>> happen based on the number of input channels, number of channels in the
>>>> impulse response buffer, and the number of output channels.  I find that
>>>> incompatible with the mixing rules for AudioNodes in the spec.  An
>>>> AudioNode cannot make assumptions about the number of output channels in
>>>> any meaningful way.  For example, we can connect a ConvolverNode to two
>>>> AudioNodes, one with channelCount=2 and channelCountMode="explicit" and one
>>>> with channelCount=1 and channelCountMode="explicit".  In this case, it's
>>>> not clear what number should be used as the number of output channels for
>>>> ConvolverNode.
>>>>
>>>> I think instead we need to specify the number of output channels as a
>>>> function of the number of input channels and the number of channels in
>>>> the impulse response buffer.  Here's my proposal:
>>>>
>>>> Given K, the number of channels in the impulse response buffer, the
>>>> number of output channels M will be defined as below:
>>>>
>>>> M = K if K = 1 or K = 2
>>>> M = 2 if K = 4
>>>> M = 0 otherwise
>>>>
>>>> This formula is compatible with all of the existing modes in <
>>>> https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/specification.html#Convolution-reverb-effect>,
>>>> perhaps with the exception of true stereo, where the current spec does not
>>>> clearly specify the down-mixing rules at the output.
>>>>
>>>> Does this change make sense?
>>>>
>>>
>>> Actually, the way I'd describe it (and the way WebKit/Blink implements
>>> it) is that the output is hard-coded to 2 channels (stereo), very much in
>>> the same way that PannerNode is.  We should add the text:
>>>
>>> "The output of this node is hard-coded to stereo (2 channels) and
>>> currently cannot be configured"
>>>
>>> That means that the "Mono" case in the diagram currently never happens
>>> and that "Mono to copied Stereo" is used when N=1, K=1.
>>>
>>
>> That sounds good to me (the Mono case should be removed from the diagram
>> as well.)
>>
>>
>>> There's also a missing case we should probably support (which WebKit/Blink
>>> does not): N=2, K=1, M=2, which means processing a stereo input
>>> with a mono impulse response, generating stereo output.
>>>
>>
>> Can you please spec that as well?  I'd like to implement that case by
>> up-mixing the mono impulse response buffer to stereo.
>>
>> Thanks!
>>  --
>> Ehsan
>> <http://ehsanakhgari.org/>
>>
>
>
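
For reference, here is a minimal sketch contrasting the two output-channel rules discussed in the quoted thread: the proposed mapping from K to M, and the hard-coded stereo output. The function names are hypothetical and are not part of the Web Audio API or of any implementation; the code only restates the arithmetic from the messages above.

```typescript
// Hypothetical helper: the proposed mapping from the number of impulse
// response channels K to the number of output channels M.
function proposedOutputChannels(k: number): number {
  if (k === 1 || k === 2) {
    return k; // M = K if K = 1 or K = 2
  }
  if (k === 4) {
    return 2; // M = 2 if K = 4
  }
  return 0; // M = 0 otherwise
}

// Hypothetical helper: the hard-coded rule, where the output is always
// stereo. K is accepted only so the two rules can be compared side by side.
function hardCodedOutputChannels(_k: number): number {
  return 2;
}

// Compare the two rules for the impulse response channel counts named in
// the thread.
for (const k of [1, 2, 4]) {
  console.log(
    `K=${k}: proposed M=${proposedOutputChannels(k)}, ` +
      `hard-coded M=${hardCodedOutputChannels(k)}`
  );
}
```

For the channel counts the thread names (K = 1, 2, 4), the two rules differ only at K=1, which matches the remark that the "Mono" case never happens when the output is hard-coded to stereo. The thread does not say what the hard-coded rule does for other values of K, so the comparison is limited to those three.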

Received on Wednesday, 3 July 2013 20:34:29 UTC