Re: Reconciling ConvolverNode's output channel dependencies with the mixing rules in the spec from Chris Rogers on 2013-05-18 (public-audio@w3.org from April to June 2013)

From: Chris Rogers <crogers@google.com>
Date: Fri, 17 May 2013 22:33:30 -0700
To: Frederick Umminger <frederick.umminger@gmail.com>
Cc: Ehsan Akhgari <ehsan.akhgari@gmail.com>, "public-audio@w3.org" <public-audio@w3.org>
Message-ID: <CA+EzO0mqXc7nq3F_SE=xcsdRMONtww-TEHmUwe-UQP1Hj-0sGw@mail.gmail.com>
On Fri, May 17, 2013 at 10:15 PM, Frederick Umminger <frederick.umminger@gmail.com> wrote:

>
> Hi Chris,
>
> On Fri, May 17, 2013 at 12:01 PM, Chris Rogers <crogers@google.com> wrote:
>>
>>
>> Most applications for convolution are for reverb which is most commonly
>> stereo.  That's why this node "comes out of the box" as stereo.  We can
>> certainly handle the general cases too.
>>
>
> I guess this depends on what area of audio you are working in. For film or
> triple-A games I am not confident that stereo is the common case. It
> bothers me a great deal that the API is currently very stereo-centric.
>

For the web in general, the vast majority of playback systems are stereo.
But, as I said, it's pretty easy to get multi-channel convolution today.
You can try it if you like, since we now support multi-channel output
devices in Chrome.


>
>
>> This is a bit limiting and inefficient in the common case of N=2, M=2
>> where we wish K=2.  The cases "Normal Stereo" and "True Stereo" are both
>> valid.
>>
>>
> I disagree that the case N=M=K=2 is either common or valid. In most real
> stereo reverbs there is cross-talk between the 2 channels ("true stereo") -
> a typical stereo reverb is not just two mono reverbs in parallel.  In any
> case, two mono reverbs in parallel can be handled as two parallel
> ConvolverNodes, which is clear and should be easy to do with the API.
>

I agree that the "true stereo" mode is better, but there are tons of free
stereo impulse responses available for download.  We certainly have to
support that case, since it's a common one.  That's why the ConvolverNode
is set up to work with both 2-channel and 4-channel impulse responses, and
does so automatically with no special configuration.
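
To make the two modes concrete, here is a small illustrative table of which
convolution paths a stereo-in, stereo-out convolver would run for a given
impulse response channel count.  The function name is hypothetical, and the
4-channel ordering (L->L, L->R, R->L, R->R) is an assumption that should be
checked against the diagram in the spec; this is a sketch, not the node's
implementation.

```javascript
// Illustrative only: lists the (input channel, IR channel, output channel)
// paths for a stereo input and a K-channel impulse response.
// ASSUMPTION: a 4-channel IR is ordered L->L, L->R, R->L, R->R.
function stereoConvolutionPaths(irChannels) {
  const L = 0;
  const R = 1;
  switch (irChannels) {
    case 1: // mono IR: the same response is applied to each input channel
      return [
        { input: L, ir: 0, output: L },
        { input: R, ir: 0, output: R },
      ];
    case 2: // "normal stereo": two independent paths, no cross-talk
      return [
        { input: L, ir: 0, output: L },
        { input: R, ir: 1, output: R },
      ];
    case 4: // "true stereo": every input feeds every output
      return [
        { input: L, ir: 0, output: L },
        { input: L, ir: 1, output: R },
        { input: R, ir: 2, output: L },
        { input: R, ir: 3, output: R },
      ];
    default:
      throw new RangeError("unsupported impulse response channel count: " + irChannels);
  }
}
```

The point of the table is only that the 2-channel case has no cross-talk
paths while the 4-channel case has all four.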


>
> Anyway, as I understand the spec as written, the behavior of the
> ConvolverNode is specified and required for N,M =1,2, K = 1,2,4, but it is
> left open as a possibility that other values may be supported ("In the
> general case the source has N input channels, the impulse response has K
> channels, and the playback system has M output channels."). However, the
> behavior that occurs with other values of N,M and K is not specified. That
> is asking for trouble. If other values of N,M and K are allowed, then the
> behavior needs to be precisely specified.
>

I'm going to try to clean up the diagram and the text a bit, but I do say
"The subset of N, M, K below must be implemented (note that the first image
in the diagram is just illustrating the general case and is not normative,
while the following images are normative)."
https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/specification.html#Convolution-reverb-effect

In other words, I'm very clear that the general case is *not* expected to
be implemented, and I then go on to add that the general case can be
handled with a merger node.



>
> There is a pretty large literature on multidimensional signal processing
> that always treats the transfer function from N inputs to M outputs as an
> N*M dimensional matrix (of functions of z). Doing anything else is a
> neologism inconsistent with the signal-processing literature. It makes the
> API harder to understand because prior experience and the wealth of
> pre-existing documentation does not apply.
>
> If N,M and K are arbitrary, then for most values with K != N*M there is no
> reasonable behavior. It may be convenient to have a special override for
> N=M=K=2, which falls outside of a general rule and requires special
> documentation, but what should happen when N=5, M=3, K=11?
>
> If N,M,K are not arbitrary, but are restricted to N,M =1,2, K = 1,2,4 or N
> =1,2, M=2, K = 1,2,4, then that is a wasted opportunity to generalize
> something that is easily generalizable, very useful and has a large
> literature. It is absolutely the case that people doing surround are going
> to want N=1,M=K=5,6,7,8.
>
> It looks to me that in order to implement a fully general N to M channel
> convolution I would need a ChannelSplitterNode to split the N channel input
> to N mono channels, N*M mono ConvolverNodes (or N*M hardcoded stereo
> ConvolverNodes and downmixed back to mono), M summing junctions to mono
> channels, and then a ChannelMergerNode to combine back to an M channel
> output. It sure would be nice to just use a single ConvolverNode instead.
>

You make it sound so complicated, but it's really just a few lines of
JavaScript.  The "summing junctions" don't even require nodes, since the
summing happens automatically by simply making multiple connections to an
input:
https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/specification.html#UpMix-section

It's just a few lines of code, but if the presentation of the API still
bothers you, then it's trivial to wrap this up in a JavaScript object that
hides all of these details and gives exactly the API you want.
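
As a rough sketch of such a wrapper (an illustration, not code from this
thread or from the spec): the hypothetical helper createMatrixConvolver
below takes an N-by-M array of mono impulse response buffers and wires up a
ChannelSplitterNode, N*M ConvolverNodes, and a ChannelMergerNode.  The
helper name, the argument layout, and the choice to disable `normalize` so
that relative levels between paths are kept are all assumptions.  It relies
on the fact that several connections into the same merger input are summed.

```javascript
// Hypothetical helper, not part of the Web Audio API.
// impulseResponses[i][j] is assumed to be a mono AudioBuffer describing the
// path from input channel i to output channel j.
function createMatrixConvolver(context, impulseResponses) {
  const numInputs = impulseResponses.length;
  const numOutputs = impulseResponses[0].length;

  const splitter = context.createChannelSplitter(numInputs);
  const merger = context.createChannelMerger(numOutputs);
  const convolvers = [];

  for (let i = 0; i < numInputs; i++) {
    for (let j = 0; j < numOutputs; j++) {
      const convolver = context.createConvolver();
      // Assumption: keep the responses' own gains rather than letting each
      // path be scaled independently. Set before assigning the buffer.
      convolver.normalize = false;
      convolver.buffer = impulseResponses[i][j];

      // Output i of the splitter carries input channel i.
      splitter.connect(convolver, i, 0);
      // All convolvers for output j connect to merger input j; the
      // connections are summed there, so no explicit summing node is needed.
      convolver.connect(merger, 0, j);

      convolvers.push(convolver);
    }
  }

  // Connect a source to `input` and connect `output` onward.
  return { input: splitter, output: merger, convolvers };
}
```

Usage would look something like `source.connect(m.input)` followed by
`m.output.connect(context.destination)`, where `m` is the returned object.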

I've designed the base-level of the API to work well, with little fuss or
muss "out of the box" for the vast majority of applications.  And for the
advanced applications, it's pretty easy to work it out.

Chris



>
> Sincerely,
>    Frederick
>
>
>
>
>
Received on Saturday, 18 May 2013 05:33:59 UTC