- From: Chris Rogers <crogers@google.com>
- Date: Mon, 6 May 2013 18:11:27 -0700
- To: Ehsan Akhgari <ehsan.akhgari@gmail.com>
- Cc: Joseph Berkovitz <joe@noteflight.com>, Stuart Memo <stuartmemo@gmail.com>, "public-audio@w3.org" <public-audio@w3.org>
- Message-ID: <CA+EzO0ndM_MjR0GoEtMbbC_ybPZ3PX+ZQgFc9E-hA7zLM5a=AA@mail.gmail.com>
On Mon, May 6, 2013 at 5:52 PM, Ehsan Akhgari <ehsan.akhgari@gmail.com> wrote:

> Sorry for the delay in my response here!
>
> On Tue, Apr 23, 2013 at 5:42 PM, Joseph Berkovitz <joe@noteflight.com> wrote:
>
>> Hi Ehsan,
>>
>> Please take a look at my response and pseudocode below regarding this point...
>>
>>> "The time when the audio will be played in the same time coordinate system as AudioContext.currentTime. playbackTime allows for very tight synchronization between processing directly in JavaScript with the other events in the context's rendering graph."
>>>
>>> I believe that this leaves no room for playbackTime to be inaccurate. The value of playbackTime in an AudioProcessEvent must exactly equal the time T at which a sound scheduled with node.start(T) would be played simultaneously with the first frame of the AudioProcessEvent's sample block.
>>>
>>> I have not experimented with playbackTime in Gecko yet, but I originally proposed the feature for inclusion in the spec, and the above definition is how it needs to work if it's to be useful for synchronization.
>>
>> You're right about the current text in the spec, but we should probably change it, since what you're asking for is pretty much impossible to implement. Imagine this scenario: let's say that the ScriptProcessorNode wants to dispatch an event with a properly calculated playbackTime, and that the event handler looks like this:
>>
>> function handleEvent(event) {
>>   // assume that AudioContext.currentTime can change its value without
>>   // hitting the event loop
>>   while (event.playbackTime < event.target.context.currentTime);
>> }
>>
>> Such an event handler would just wait until playbackTime has passed and then return, and it would therefore make it impossible for the ScriptProcessorNode to operate without latency.
>>
>> That is not the way that one would make use of event.playbackTime in a ScriptProcessorNode.
>> As you say, looping inside an event handler like this makes no sense and will wreck the operation of the system.
>>
>> The sole purpose of event.playbackTime is to let the code inside the event handler know at what time the samples that it generates will be played. Not only is this not impossible to implement, it's quite practical, since it's what any "schedulable" source like Oscillators and AudioBufferSourceNodes must do under the hood.
>>
>> Here's how it's intended to be used. Going back to pseudocode, let's say you want to start both an Oscillator and some noise at some time T... in mono:
>>
>> var oscillator = context.createOscillator();
>> // ...also configure the oscillator...
>> oscillator.connect(context.destination);
>> oscillator.start(T);
>>
>> var processor = context.createScriptProcessor();
>> processor.connect(context.destination);
>> processor.onaudioprocess = function(event) {
>>   for (var i = 0; i < processor.bufferSize; i++) {
>>     // offset of sample i from the start of the block, in seconds
>>     var sampleTime = event.playbackTime + (i / event.outputBuffer.sampleRate);
>>     if (sampleTime >= T)
>>       event.outputBuffer.getChannelData(0)[i] = Math.random();
>>     else
>>       event.outputBuffer.getChannelData(0)[i] = 0;
>>   }
>> };
>>
>> There is in fact no other reliable mechanism in the API for script nodes to synchronize their output with "schedulable" sources, which is why this got into the spec in the first place.
>
> I hope that there is now less confusion about this after last week's teleconf, but allow me to clarify things a bit.
>
> ScriptProcessorNode buffers its input and only dispatches the audioprocess event when a buffer of bufferSize samples has been filled up, so in the best case each ScriptProcessorNode in the graph adds bufferSize/sampleRate seconds of delay. Now, when the implementation wants to dispatch the audioprocess event, it needs to calculate the playbackTime value.
> Note that at this point the implementation doesn't know how long it's going to take for the event to be handled, so roughly speaking it calculates playbackTime to be equal to currentTime + bufferSize/sampleRate. This is in practice a guess on the part of the implementation that the event handling will be finished very soon, with a negligible delay. Now, let's for the sake of this example say that the web page takes 100ms to handle the event. Once the event dispatch is complete, we're now 100ms late to play back the outputBuffer, which means that the buffer will be played back at currentTime + bufferSize/sampleRate + 0.1 *at best*. Now, a good implementation can remember this delay and the next time calculate playbackTime to be currentTime + bufferSize/sampleRate + 0.1, basically accumulating all of the delays seen in dispatching the previous events and adjusting its estimate of playbackTime every time it fires an audioprocess event. But unless the implementation can know how long the event handling phase will take, it can never calculate an accurate playbackTime, simply because it cannot foresee the future!

Actually, I'm quite sure it can calculate this value exactly, but I'd rather discuss it in the meeting, since I fear it might be too complicated to explain quickly right now.

> Now, let's talk about what this means in practice. Take this test case, which simply generates a sine wave using ScriptProcessorNode: <https://bugzilla.mozilla.org/attachment.cgi?id=738313>. Currently WebKit/Blink use a double-buffering approach and Gecko uses a buffer queue, which means that the WebKit/Blink implementation will suffer more from delays incurred when handling the audioprocess event. If you try this test case in Chrome, you'll see that the playback consistently glitches.
> The glitching behavior should be a lot better in Firefox, since we simply buffer more input data to be able to recover from delays sooner, but there are limits to how good we can be, and I believe that the current Firefox implementation is quite close to how well ScriptProcessorNode can be implemented. With this fundamental problem, I'm worried that ScriptProcessorNode as currently specified is not really usable for audio generation (it can of course be used to inspect incoming frames, but that's a different use case), so in a way, the whole problem of how to implement playbackTime is the least of my worries.
>
> Cheers,
> --
> Ehsan
> <http://ehsanakhgari.org/>
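[Editor's note: Joe's pseudocode can be exercised outside a browser. The sketch below models only the per-sample arithmetic, with no Web Audio dependency; the sample rate, buffer size, block start time, scheduled time T, and the fillBlock helper are all illustrative, not part of any API.]

```javascript
// Illustrative values, not spec constants.
const sampleRate = 44100;
const bufferSize = 256;

// Fill `output` with noise for every sample whose absolute time is >= T.
// This is the same per-sample test as in the pseudocode above: the time of
// sample i is the block's start time plus i / sampleRate seconds.
function fillBlock(output, playbackTime, T) {
  for (let i = 0; i < output.length; i++) {
    const sampleTime = playbackTime + i / sampleRate; // seconds, not samples
    output[i] = sampleTime >= T ? Math.random() : 0;
  }
}

const block = new Float32Array(bufferSize);
// Suppose this block starts at t = 1.0 s and the noise is scheduled to
// begin 128 samples later, i.e. halfway through the block.
fillBlock(block, 1.0, 1.0 + 128 / sampleRate);
// The first 128 samples stay silent; noise starts at sample 128.
```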
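[Editor's note: the delay-accumulation strategy Ehsan outlines can likewise be sketched in isolation. Everything here, including estimatePlaybackTime, onDispatchFinished, and the numbers, is an illustrative model of the bookkeeping, not an actual implementation.]

```javascript
const sampleRate = 44100;
const bufferSize = 1024;
const blockDuration = bufferSize / sampleRate; // seconds per block

let accumulatedDelay = 0; // total extra latency observed so far, in seconds

// Initial guess: the block plays one buffer-length from now, plus any
// delay already accumulated from earlier slow dispatches.
function estimatePlaybackTime(currentTime) {
  return currentTime + blockDuration + accumulatedDelay;
}

// After a dispatch finishes late, fold the observed delay into the
// estimate for every subsequent block.
function onDispatchFinished(handlerDelay) {
  accumulatedDelay += handlerDelay;
}

const t1 = estimatePlaybackTime(0);             // no delay seen yet
onDispatchFinished(0.1);                        // page took 100 ms to handle it
const t2 = estimatePlaybackTime(blockDuration); // now includes the 100 ms
```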
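[Editor's note: the difference between WebKit/Blink's double buffering and Gecko's buffer queue comes down to how much already-queued audio a handler stall can consume before an underrun. A toy model, with all numbers illustrative and `glitches` a hypothetical helper:]

```javascript
const sampleRate = 44100;
const bufferSize = 1024;
const blockDuration = bufferSize / sampleRate; // roughly 23 ms per block

// A glitch occurs when producing the next block takes longer than the
// audio already queued on the output side can cover.
function glitches(queuedBlocks, handlerDelay) {
  return handlerDelay > queuedBlocks * blockDuration;
}

// A one-off 100 ms stall in the audioprocess handler:
const doubleBuffered = glitches(1, 0.1); // one spare block: underrun
const deepQueue = glitches(5, 0.1);      // five spare blocks: survives
```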
Received on Tuesday, 7 May 2013 01:11:54 UTC