Re: Exposing a playbackPosition property on AudioBufferSourceNode from K. Gadd on 2014-03-01 (public-audio@w3.org from January to March 2014)

From: K. Gadd <kg@luminance.org>
Date: Fri, 28 Feb 2014 20:48:07 -0800
To: Chris Wilson <cwilso@google.com>
Cc: Chinmay Pendharkar <chinmay.pendharkar@gmail.com>, "public-audio@w3.org" <public-audio@w3.org>
Message-ID: <CAPJwq3XbUHgMJSb1DJxqHP3izvBLgViXsT_e-Pp0f=p_dNqOeQ@mail.gmail.com>

Technically I believe a JS GC implementation can collect at any
time, not just when you return control to the event loop. Typical
implementations decide whether to GC any time you allocate, I believe; I would
expect this to be the case for both SpiderMonkey and V8. Since Web
Audio is a garbage-heavy API (not that it's *expensive* garbage), any
operations in your audio code could potentially trigger a small
young-generation GC, or worse, trigger an old-generation GC that
might last for dozens or hundreds of milliseconds. Those kinds of
pause times are not unheard of in desktop apps, and are common in GC
metrics from Firefox the last time I looked at them.

I had overlooked the possibility of putting currentTime in the future;
if it's far enough ahead that gives you a nice big buffer to work
with. Maybe that is enough, if browsers do it consistently - though I
wonder if you can get away with setting it that far in the future,
since that would add dramatic latency to realtime playback scenarios.
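
Whatever the browser does with currentTime, a similar buffer can be built
from the application side by scheduling a little ahead of the clock. Here is
a rough sketch of that idea, not a definitive implementation. The names
AudioClock, Startable, blockSeconds and scheduleAhead are made up for
illustration, and the 128-frame block size is the figure cited from the spec
in the quoted reply below.

```typescript
// Minimal structural types so the sketch type-checks outside a browser.
// A real AudioContext and AudioBufferSourceNode should satisfy both shapes.
interface AudioClock {
  readonly currentTime: number; // seconds, on the audio clock
  readonly sampleRate: number;  // frames per second
}
interface Startable {
  start(when?: number): void;
}

// Block size of 128 frames, as cited from the spec in the quoted reply.
const BLOCK_FRAMES = 128;

// Duration of one processing block in seconds (about 2.9 ms at 44.1 kHz).
function blockSeconds(sampleRate: number): number {
  return BLOCK_FRAMES / sampleRate;
}

// Hypothetical helper: start `source` at least one block ahead of the
// clock value read right now, and return the time that was scheduled.
function scheduleAhead(
  ctx: AudioClock,
  source: Startable,
  lookaheadSeconds: number
): number {
  const margin = Math.max(lookaheadSeconds, blockSeconds(ctx.sampleRate));
  const when = ctx.currentTime + margin; // read the clock as late as possible
  source.start(when);
  return when;
}
```

The lookahead is a tradeoff: a larger value tolerates longer main-thread
stalls but adds the same amount of latency to anything triggered in real time.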

I think the 'start when you can' behavior is probably fine for certain
use cases, but in practice most scenarios I can think of care about
precise scheduling, so they probably prefer a couple dropped samples
to having everything out of alignment. Synthesized/tracked music
playback, games that sync sound effects with background music,
synchronized background tracks, etc. In those scenarios, a misaligned
piece of audio might play for multiple seconds, which will sound a lot
worse than a few clipped samples at the beginning. Arguably clipped
samples are a more debuggable side effect than desynchronized
playback, but that's pretty subjective. Ideally users won't see
either. Exposing information on the latency might let users take care
of this for good, regardless of the semantics of start(), by letting
them make sure they schedule outside the current mixer window.
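
To make the "clip a few samples but stay aligned" preference concrete, here
is a minimal sketch that can be written with the existing start(when, offset)
signature. It is only an illustration under assumptions: startAligned,
PlaybackClock, OffsetStartable and StartResult are hypothetical names,
playbackRate is assumed to be 1, and marginSeconds is a guess at how far
ahead of currentTime is safe, which is exactly the information the API does
not expose today.

```typescript
// Minimal structural types so the sketch type-checks outside a browser.
interface PlaybackClock {
  readonly currentTime: number; // seconds, on the audio clock
}
interface OffsetStartable {
  start(when?: number, offset?: number): void;
}
interface StartResult {
  when: number;   // time actually passed to start()
  offset: number; // seconds skipped at the beginning of the buffer
}

// Hypothetical helper: if the intended start time can no longer be met,
// skip into the buffer by the amount we are late, so the rest of the
// sound stays aligned with whatever it was meant to sync to.
// Assumes playbackRate === 1, so late seconds map 1:1 to buffer seconds.
function startAligned(
  ctx: PlaybackClock,
  source: OffsetStartable,
  intendedWhen: number,
  bufferDuration: number,
  marginSeconds: number
): StartResult | null {
  const earliest = ctx.currentTime + marginSeconds;
  if (intendedWhen >= earliest) {
    // Still enough time: schedule exactly as intended.
    source.start(intendedWhen, 0);
    return { when: intendedWhen, offset: 0 };
  }
  const offset = earliest - intendedWhen;
  if (offset >= bufferDuration) {
    return null; // the whole sound would already have finished
  }
  source.start(earliest, offset);
  return { when: earliest, offset };
}
```

The cost is an audible clipped onset when the call arrives late, which is the
tradeoff argued for above; the alignment of the remainder is preserved.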

On Fri, Feb 28, 2014 at 1:06 PM, Chris Wilson <cwilso@google.com> wrote:
> On Wed, Feb 26, 2014 at 2:46 PM, K. Gadd <kg@luminance.org> wrote:
>>
>> The problem I'm describing is that since all time measurement and
>> arithmetic occurs on the main JS thread, JS can end up stalling due
>> to something like a GC after it records currentTime or after...
>
>
> Well, sort of.  If memory serves, GC won't fire in the middle of your
> statement, so you don't need to worry about arbitrary GC firing (unless
> you're caching the audioContext.currentTime in a global variable or some
> other such craziness); you'd only have to worry about grabbing the
> audioContext.currentTime into a local var if you did something
> computationally quite heavy before using it.  "Don't do that" is my flip
> answer, of course :), but your point is taken.
>
>> currentTime is synchronized with the processing thread, at which point
>> any attempt to synchronize two playing tracks, or record the current
>> playback position in order to pause playback, will fail because your
>> clocks are out of sync. Resuming that paused track will play a snippet
>> of audio the user already heard, and the synchronized tracks will be
>> out of sync because of that clock drift (probably doubly so, because
>> it's possible that the processing thread will get further ahead by the
>> time it gets the command asking it to start playing the new
>> synchronized track).
>
>
> Yes, it's true, trying to manage playback with no latency while doing
> computationally heavy things is going to potentially have issues.
>
>>
>> It's unfortunate to hear that start() starts 'when it can' at its
>> initial start position, because that thoroughly breaks any attempt at
>> synchronizing a currently playing track, unless you start it way in
>> the future to be sure that the processing thread won't beat you to
>> that time. I think the 'start when you can' behavior is great if no
>> 'when' is specified for starting playback. I guess this means that any
>> scenarios involving precise timing need to have commands issued to the
>> AudioContext something like 250ms in advance, to ensure that the
>> processing thread never gets ahead of you?
>
>
> No, I think that's overestimating the problem.  You don't need to account
> for all the latency possibly in the pipeline - the audioContext.currentTime
> should already be set "ahead" for that - you only need to account for the
> potential for the web audio thread to fire up and process a block while
> you're doing something computationally heavy.  That's a much smaller
> potential slippage - according to the defined block size of 128 in the spec,
> that's around 3ms at 44.1kHz.  So, you could work ahead here simply by
> scheduling everything >3ms in the future; that should mean that even if you
> miss "this" web audio processing thread block, you'll catch the next one.
> You should not need 250ms additional; at most, if you're shooting to have a
> 60fps web app you'd need your main UI thread "processing blocks" (aka
> function calls) to all complete in under 16.7ms, so at absolute worst you
> could schedule ahead by that...  (of course, now's when Raymond and the
> Mozilla gang jump in and tell me how blocks are batched, and the potential
> latency is larger.... my point is it's a fixed, relatively small amount, not
> "entire audio pipeline latency from microphone to speaker" amount like
> 250ms).
>
> I take your point about the "start when you can" behavior vs scheduling as
> close to the wire as possible - would we be better served here by cutting
> off the first bit of a scheduled-in-the-past sound, or giving more info
> about the latency/currentTime in order to let devs work around it?
>
>>
>> Both cases I describe are fairly simple race conditions, though, so I
>> think they could be fixed - I just don't know exactly how you would do
>> that with the API at present. Because we can't stall the processing
>> thread until the JS thread is ready, it's inevitable that the
>> AudioContext.currentTime value will get a bit behind the processing
>> thread, even if it's somewhat rare. In scenarios with more GC pressure
>> it seems likely that users might experience this with time gaps big
>> enough to be noticeable.
>
>
> I don't think GC pressure will cause any unpredictability here - GC only
> happens when you've yielded on the main thread, so you'd have to be caching
> that currentTime across yields (function calls fired from timeouts or
> something) - which would be a bad idea.  Your function code is what could
> cause this.  But again, if your code is running for a quarter-second at a
> time, I think you are likely to have other problems.
>
> -Chris
Received on Saturday, 1 March 2014 04:49:20 UTC