Re: Determining Output Latency

I think we need to know the *absolute* system time to which currentTime corresponds, *including* any latency. Here's why -

An application that visualizes audio events and does its drawing in a requestAnimationFrame() callback needs to know the relationship between AudioContext.currentTime and the time stamp passed to the rAF callback. We do not have a way to determine the system time at which the audio clock started, and the system clock's rate may differ from the audio clock's rate on some devices anyway. So this relationship needs to be estimated, using at least a linear model.
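
To make the "linear model" concrete, here is a minimal sketch of one way to estimate it: collect (currentTime, rAF time stamp) pairs and fit a straight line by ordinary least squares. The names ClockSample, ClockModel, fitClockModel and audioToSystemMs are mine, purely for illustration, and are not part of any API. Note that this only relates the *reported* currentTime to the system clock; it cannot recover the output latency by itself.

```typescript
// Hypothetical types and helpers, for illustration only.
// One observation: the value of AudioContext.currentTime (seconds) read
// inside a rAF callback, and the time stamp (ms) passed to that callback.
interface ClockSample { audioSec: number; systemMs: number; }

// Linear model: systemMs ≈ slope * audioSec + intercept.
interface ClockModel { slope: number; intercept: number; }

// Ordinary least-squares fit over the collected samples.
function fitClockModel(samples: ClockSample[]): ClockModel {
  const n = samples.length;
  if (n < 2) throw new Error("need at least two samples");
  let meanA = 0;
  let meanS = 0;
  for (const s of samples) { meanA += s.audioSec; meanS += s.systemMs; }
  meanA /= n;
  meanS /= n;
  let cov = 0;
  let varA = 0;
  for (const s of samples) {
    const da = s.audioSec - meanA;
    cov += da * (s.systemMs - meanS);
    varA += da * da;
  }
  if (varA === 0) throw new Error("audio times must not all be equal");
  const slope = cov / varA; // ms of system time per second of audio time
  return { slope, intercept: meanS - slope * meanA };
}

// Map an audio-clock time (seconds) to an estimated system time (ms).
function audioToSystemMs(model: ClockModel, audioSec: number): number {
  return model.slope * audioSec + model.intercept;
}
```

In a real page the samples would be pushed from the rAF callback, e.g. { audioSec: context.currentTime, systemMs: timestamp }. Since currentTime advances in buffer-sized steps, individual samples are noisy, which is why a fit over many of them is needed.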

The absence of a known relationship in this context can be compensated for if the audio system uses a small enough buffer size. Given an audio sampling rate of 44100 Hz and a display refresh rate of 60 Hz, we expect 44100 / 60 = 735 samples to pass during a visual frame. So if the audio buffer size is <= 256 samples, the drawing code can get away with using currentTime as its draw time, perhaps shifting it by one buffer duration into the future. If the buffer size is any larger, you'd get serious temporal aliasing effects if you use currentTime for this purpose.[1] A buffer size of 512 samples is therefore not good enough for this purpose, and low-power devices may need even longer buffers, such as 2048 samples, to serve continuous audio.
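
The arithmetic behind those numbers, as a small sketch. The function names are hypothetical, and the 44100 Hz / 60 Hz figures are just the ones assumed above. The point is to compare how far currentTime jumps per buffer with how long one visual frame lasts.

```typescript
// Hypothetical helpers, for illustration only.

// Number of audio samples that elapse during one display frame.
function samplesPerFrame(sampleRateHz: number, refreshHz: number): number {
  return sampleRateHz / refreshHz;
}

// Duration of one audio buffer in milliseconds, i.e. the step size by
// which currentTime advances if it is updated once per buffer.
function bufferDurationMs(bufferSize: number, sampleRateHz: number): number {
  return (1000 * bufferSize) / sampleRateHz;
}

// Buffer length expressed as a fraction of one display frame.
function bufferToFrameRatio(bufferSize: number, sampleRateHz: number, refreshHz: number): number {
  return bufferSize / samplesPerFrame(sampleRateHz, refreshHz);
}

const perFrame = samplesPerFrame(44100, 60); // 735 samples per frame
for (const size of [256, 512, 2048]) {
  console.log(
    size,
    bufferDurationMs(size, 44100).toFixed(2) + " ms",
    bufferToFrameRatio(size, 44100, 60).toFixed(2) + " frames"
  );
}
```

At 256 samples, currentTime moves in steps of roughly a third of a frame; at 512 the step is already more than two thirds of a frame, and at 2048 it spans several frames, so consecutive rAF callbacks can read the same currentTime.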

Only providing a "latency" value would not solve this "visual sampling" problem, and the original T+L problem presented in this thread would exist even if the client is using a very small buffer size. Both these problems can be solved if we know the absolute system time at which samples time stamped with currentTime will go out.
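
As a sketch of how such an absolute time could be used, suppose the context exposed a pair of values: a context time and the system time at which samples stamped with that context time reach the output. No such API exists today; OutputAnchor and both functions below are hypothetical, and I'm assuming the two clocks run at the same rate over the short interval between the anchor and the frame.

```typescript
// Hypothetical shape of the information being asked for: samples time
// stamped contextTimeSec are expected to leave the output at systemTimeMs
// (same time base as the rAF time stamp / performance.now()).
interface OutputAnchor { contextTimeSec: number; systemTimeMs: number; }

// Context time of the audio that is audible at a given rAF time stamp.
// Assumes negligible clock drift between the anchor and the frame.
function audibleContextTime(anchor: OutputAnchor, frameTimeMs: number): number {
  return anchor.contextTimeSec + (frameTimeMs - anchor.systemTimeMs) / 1000;
}

// System time at which an event scheduled at eventTimeSec will be heard.
function systemTimeForEvent(anchor: OutputAnchor, eventTimeSec: number): number {
  return anchor.systemTimeMs + (eventTimeSec - anchor.contextTimeSec) * 1000;
}
```

The first function addresses the visual sampling problem (what to draw in this frame); the second addresses the T+L problem (when a scheduled sound will actually be heard). Both fall out of the same anchor.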

Thoughts?

Best,
-Kumar

[1] You'd still get some temporal jitter with 256, but it may be tolerable.

On 17 Jan, 2013, at 1:51 AM, Chris Rogers <crogers@google.com> wrote:

> 
> 
> On Wed, Jan 16, 2013 at 12:16 PM, Joseph Berkovitz <joe@noteflight.com> wrote:
> Thanks, Chris. This is certainly one way to handily solve the problem. I also wonder whether an offset, rather than an absolute time, might be cleaner. I don't have a strong preference.
> 
> The offset, I think, is what I'm suggesting with the AudioContext.presentationLatency attribute.  It would be in seconds (hopefully small in most cases :)
> 
> Chris
>  
> 
> I neglected to give a use case but for completeness, here it is. If one needs to display a visual cursor in relationship to some onscreen representation of an audio timeline (e.g. a cursor on top of music notation or DAW clips) then knowing the performance.now() coordinates for what is coming out of the speakers is essential. Otherwise the display tells a lie.
> 
> (This assumes that there is essentially zero latency in the visual display but, that seems reasonable.)
> 
> …Joe
> 
> On Jan 16, 2013, at 2:05 PM, Chris Wilson <cwilso@google.com> wrote:
> 
>> Chris and I were talking about just this problem the other day, as I was exploring synchronizing Web Audio events with other events (notably, of course, MIDI) that are on the performance.now() clock.  After writing the Web Audio scheduling article for HTML5Rocks (http://www.html5rocks.com/en/tutorials/audio/scheduling/), I realized that it's problematic to incorporate scheduling other real-time events (even knowing precisely "what time it is" from the drawing function) without a better understanding of the latency.
>> 
>> The idea we reached (I think Chris proposed it, but I can't honestly remember) was to have a performance.now()-reference clock time on AudioContext that would tell you when the AudioContext.currentTime was taken (or when that time will occur, if it's in the future); that would allow you to synchronize the two clocks.  The more I've thought about it, the more I quite like this approach - having something like AudioContext.currentSystemTime in window.performance.now()-reference.
>> 
>> On Wed, Jan 16, 2013 at 9:51 AM, Joseph Berkovitz <joe@noteflight.com> wrote:
>> Hi Chris,
>> 
>> It's become apparent that on some devices and Web Audio implementations, an AudioContext's currentTime reports a time that is somewhat ahead of the time of the actual audio signal emerging from the device, by a fixed amount.  To be more specific, if a sound is scheduled (even very far in advance) to be played at time T, the sound will actually be played when AudioContext.currentTime = T + L where L is a fixed number which for the purposes of this email I'll call "output latency".
>> 
>> I think the only reason this hasn't been noticed before is that until recently, the output latency on the implementations that I've been exposed to has been too small to notice. But in some implementations it can be substantial and noticeable.
>> 
>> When this occurs, is this 1) a problem with the implementation of the spec, or 2) an anticipated phenomenon that may vary from one implementation to another?
>> 
>> If the answer is 1), then at a minimum the spec needs to clarify the meaning of context.currentTime with respect to physical audio playback so that implementors realize they must add L back into the reported value of currentTime to make it correct.  But if the answer is 2), then we have a different problem: there doesn't appear to be any way to interrogate the API to determine the value of L on any particular platform.
>> 
>> Can you or others on the list provide any guidance on this point? Should I file a bug and, if so, what for?
>> 
>> Best,
>> 
>> ... .  .    .       Joe
>> 
>> Joe Berkovitz
>> President
>> 
>> Noteflight LLC
>> Boston, Mass. phone: +1 978 314 6271
>> www.noteflight.com
>> "Your music, everywhere"
>> 
>> 
> 
> 
> ... .  .    .       Joe
> 
> Joe Berkovitz
> President
> 
> Noteflight LLC
> Boston, Mass. phone: +1 978 314 6271
> www.noteflight.com
> "Your music, everywhere"
> 
> 

Received on Wednesday, 23 January 2013 00:57:11 UTC