- From: Karl Tomlinson <karlt+public-audio@karlt.net>
- Date: Mon, 26 Aug 2013 07:57:22 +1200
- To: public-audio@w3.org
I'd like to raise discussion of the desired behaviour for DelayNodes when the input channel count changes. There is a bug on file at [1].

It would be simple for implementations if they didn't have to worry too much about this situation, and could forget an existing delay buffer and start afresh when the channel count changes. However, channel count changes distant in the graph may, perhaps unexpectedly, change the channel count at a delay node, so I think we may have to make an effort to handle this. Consider a graph with mono-only sources: if any stereo source is added, then most downstream nodes switch to stereo input.

Is it expected that samples already received by a delay node continue to be played after the channel count changes? Assuming that is the expected behaviour, taking the current wording literally, "The number of channels of the output always equals the number of channels of the input", could lead to glitches as buffered streams are suddenly down-mixed when the input channel count drops. I assume the up-mixing formulas ensure we don't get glitches when they are switched on, but there may not be much point in up-mixing buffered samples until they need blending with a larger number of channels. I think we need to allow the DelayNode to continue to produce a larger number of channels than its input, for at least some period.

Is it necessary to specify exactly when a DelayNode should change its number of output channels, or can we leave this to the implementation? Exactly when this should happen is unclear because the delay value is variable.

If the y(t) = x(t - d(t)) delay model is used (see [2]), and rates of change in delay of < -1 are permitted, then any part of the buffer may be output at a future time, and so the output channel count shouldn't drop until maxDelayTime has elapsed after the input channel count changes. If rates of change in delay are limited by the implementation to be >= -1, then the output channel count can be changed when the read pointer passes the position the write pointer had when the channel count changed. We can't be precise to the particular sample, as the one output block spanning the change may require some up-mixing to the maximum channel count of its buffered components.

As pointed out in [1], if a delay node keeps only one buffer and the channel count changes, then there may be too much processing required to up-mix the entire buffer at once. A stereo delay buffer of the maximum three-minute length, for a 48 kHz context, may be 66 MB in size.

An alternative approach is to keep old buffers after channel count changes until they are no longer required, and mix them together for the output. A downside of this approach is that we could theoretically end up with as many buffers as the maximum number of channels, 32 or more. That is 32 * 31 / 2 channels across the retained buffers, which is about 16 GB if they are 3-minute uncompressed buffers.

Another approach is to keep pointers to the positions in the buffer where the channel count changed, and add channels only as required. Then a 3-minute 32-channel uncompressed buffer would require only 1 GB ;). A rough sketch of this approach follows after the references.

[1] https://www.w3.org/Bugs/Public/show_bug.cgi?id=21426
[2] http://lists.w3.org/Archives/Public/public-audio/2013JulSep/0568.html
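Purely to illustrate that last approach, here is a minimal C++ sketch, not taken from the spec or from any implementation. It assumes planar storage and, for brevity, a buffer that grows rather than wraps; DelaySketch, Segment, writeBlock, channelCountAt and outputChannelCount are hypothetical names. The idea is to record where the input channel count changed, so buffered audio keeps its original channel count, any up-mixing (per the spec's up-mix rules, not shown) happens lazily per output block, and the output channel count can drop once no still-reachable segment has more channels than the input.

// Sketch only: a delay line that records where the input channel count
// changed, so buffered audio is up-mixed lazily at read time instead of
// re-mixing the whole (possibly very large) buffer at once.
// Names are illustrative; ring-buffer wrapping and the actual up-mix
// rules are omitted.

#include <algorithm>
#include <cstddef>
#include <vector>

struct DelaySketch {
  // Planar storage: channels[c][frame].  Channels beyond a segment's
  // recorded count simply hold no data for that region of the buffer.
  std::vector<std::vector<float>> channels;
  struct Segment { size_t startFrame; unsigned channelCount; };
  std::vector<Segment> segments;   // ordered by startFrame
  size_t writeFrame = 0;           // frames written so far

  void writeBlock(const float* const* input, unsigned inChannels,
                  size_t frames) {
    // Mark the position of any channel count change.
    if (segments.empty() || segments.back().channelCount != inChannels)
      segments.push_back({writeFrame, inChannels});
    if (channels.size() < inChannels)
      channels.resize(inChannels);
    for (unsigned c = 0; c < inChannels; ++c) {
      channels[c].resize(writeFrame + frames, 0.0f);
      std::copy(input[c], input[c] + frames,
                channels[c].begin() + writeFrame);
    }
    writeFrame += frames;
  }

  // Channel count of the segment containing a given read position; the
  // reader up-mixes from this count to the current output count.
  unsigned channelCountAt(size_t readFrame) const {
    unsigned count = 1;
    for (const Segment& s : segments)
      if (s.startFrame <= readFrame) count = s.channelCount;
    return count;
  }

  // Output channel count: the maximum count of any segment the read
  // pointer may still visit.  With d'(t) >= -1 the read pointer never
  // moves backwards, so oldestReachableFrame is its current position;
  // without that bound, everything younger than maxDelayTime is reachable.
  unsigned outputChannelCount(size_t oldestReachableFrame) const {
    unsigned count = 1;
    for (size_t i = 0; i < segments.size(); ++i) {
      size_t end = (i + 1 < segments.size()) ? segments[i + 1].startFrame
                                             : writeFrame;
      if (end > oldestReachableFrame)        // segment still reachable
        count = std::max(count, segments[i].channelCount);
    }
    return count;
  }
};

With per-channel storage of roughly 180 s * 48000 * 4 bytes, about 33 MB, the worst case in this sketch is the current 32 channels, i.e. roughly 1 GB, rather than retaining or up-mixing every older buffer.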
Received on Sunday, 25 August 2013 19:58:32 UTC