Re: Audio Workers - please review

Having had a little longer to think about your idea of multiple nodes per worker, I do see a couple of problems with it that I didn’t notice before.

- It seems to allow all nodes that share the same worker to share data freely with each other, since they all share the same global scope. This encourages direct inter-node communication, which can lead developers into dangerous assumptions, although in a spotty kind of way that is limited to nodes which happen to share a worker. (A sketch of the hazard follows this list.)

- The above data sharing would in turn force synchronous invocation of the nodes’ onaudioprocess callbacks to avoid race conditions. We do not want such a synchronous model, since it would prevent parallelization of subgraphs in the future. [I just saw that Ehsan posted roughly the same observation.]

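To make the first concern concrete, here is a rough sketch of the kind of coupling a shared global scope invites. It uses the interface names from your proposal below; the two node types and the DSP are invented purely for illustration:

// hypothetical worker script hosting two cooperating node types
var sharedState = { lastPeak: 0 }; // visible to every node in this worker

onaudionodecreated = function (e) {
    var node = e.node;
    if (e.data.type === "analyser") {
        node.onaudioprocess = function (evt) {
            var input = evt.inputBuffers[0];
            var peak = 0;
            for (var i = 0; i < input.length; i++)
                peak = Math.max(peak, Math.abs(input[i]));
            sharedState.lastPeak = peak; // one node writes...
        };
    } else if (e.data.type === "gate") {
        node.onaudioprocess = function (evt) {
            var output = evt.outputBuffers[0];
            // ...and another node reads, silently assuming the analyser
            // ran first in the same callback batch, an ordering the graph
            // never promises
            if (sharedState.lastPeak < 0.01)
                for (var i = 0; i < output.length; i++) output[i] = 0;
        };
    }
};

Nothing prevents this, and the moment a UA tries to run those two callbacks in parallel, sharedState becomes a data race.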
I still think your point about minimizing node startup costs is valid and needs to be addressed. These are just my concerns about this one particular way to go.

…Joe


On Sep 11, 2014, at 2:01 PM, Jussi Kalliokoski <jussi.kalliokoski@gmail.com> wrote:

> A small amendment to the previous proposal: no postMessage/onmessage/terminate on the AudioWorkerGlobalScope nor the AudioWorker. Having those prevents numerous optimizations, such as:
> 
> - parallelizing the graph (if the UA can instantiate new workers of the same scriptURL at will, where would a posted message go?);
> - not starting an AudioWorker that isn't used anywhere;
> - killing off an AudioWorker at will when it hasn't been needed for a while (similar to how ServiceWorkers are spun up and down when the UA deems it necessary);
> - reusing the same instance between AudioContext instances running in the same thread;
> - and probably other things as well.
> 
> Not to mention there's no use case for the global versions of those methods.
> 
> On Thu, Sep 11, 2014 at 8:50 PM, Jussi Kalliokoski <jussi.kalliokoski@gmail.com> wrote:
> On Thu, Sep 11, 2014 at 7:11 PM, Chris Wilson <cwilso@google.com> wrote:
> Actually, I believe I completely misspoke.  I believe postMessages are only dispatched from a thread when the originating thread "has time" - e.g. "The window.postMessage method, when called, causes a MessageEvent to be dispatched at the target window when any pending script that must be executed completes (e.g., remaining event handlers if window.postMessage is called from an event handler, previously-set pending timeouts, etc.)" (from https://developer.mozilla.org/en-US/docs/Web/API/Window.postMessage).  So these would process in order, and be dispatched at the same time.
> 
> The important question is whether they fire before or after onaudioprocess. Currently that's undefined behavior, and because of that it will likely be nondeterministic.
> 
> Actually, thinking about the mention of importScripts on this thread made me wonder about the usability of the currently specced model. Let's say there's a JS audio library that contains a comprehensive set of DSP tools: oscillators, FFT, window functions, filters, time stretching, resampling. A library like this could easily weigh about the same as jQuery. Now, you make different kinds of custom nodes using this library, and use them in the same fire-and-forget way you generally do with the native nodes. Every time you create a new instance of a node like this, you fetch the library (from cache or not), parse it and execute it. This adds up to a huge amount of wasted resources as well as creation delays (I'm not sure how importScripts could even work in the WorkerNode). The effect is amplified further when these nodes are compiled from another language to asm.js, which at the moment tends to have a rather heavy footprint. And on top of that, you have to create a new VM context, which can be both memory and CPU intensive.
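> 
> To illustrate (the script URLs are invented, the factory shape is approximated, and this supposes importScripts even works in that scope, which I questioned above):
> 
> // wavetable-osc.js, the worker script for one custom node type:
> importScripts("dsp-library.js"); // fetched, parsed and executed here
> 
> // main thread: under the one-worker-per-node model, the line above
> // runs again for every single node you create
> var osc1 = context.createAudioWorker("wavetable-osc.js");
> var osc2 = context.createAudioWorker("wavetable-osc.js"); // ...and again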
> 
> This brings me back to my earlier suggestion of allowing one worker to manage multiple nodes - this doesn't actually require very radical changes, although it does steer us further away from compliance with normal Workers. Here's one proposal that is a bit more radical, but I think it provides the necessary features along with some small nitpick-level comprehensibility fixes to the API design.
> 
> interface AudioNodeHandle {
>     attribute EventHandler onaudioprocess;
>     attribute EventHandler onmessage;
>     void postMessage (any message, optional sequence<Transferable> transfer);
>     void terminate();
> };
> 
> interface AudioWorkerGlobalScope {
>     attribute EventHandler onaudionodecreated;
>     attribute EventHandler onmessage;
> };
> 
> interface AudioProcessEvent : Event {
>     readonly attribute double playbackTime;
>     readonly attribute Float32Array[] inputBuffers;
>     readonly attribute Float32Array[] outputBuffers;
>     readonly attribute object parameters;
>     readonly attribute float sampleRate;
> };
> 
> interface AudioNodeCreatedEvent : Event {
>     readonly attribute AudioNodeHandle node;
>     readonly attribute object data;
> };
> 
> partial interface AudioContext {
>     AudioWorker createAudioWorker(DOMString scriptURL);
>     AudioWorkerNode createAudioWorkerNode(AudioWorker audioWorker, optional object options);
> };
> 
> interface AudioWorker {
>     attribute EventHandler onmessage;
>     void postMessage (any message, optional sequence<Transferable> transfer);
> };
> 
> interface AudioWorkerNode {
>     attribute EventHandler onmessage;
>     readonly attribute object parameters; // a mapping of names to AudioParam instances. Ideally frozen. Could be a Map-like as well with readonly semantics.
>     void postMessage (any message, optional sequence<Transferable> transfer);
> };
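> 
> To sketch how the worker side would look under this proposal (the type string, the DSP and the parameter semantics are all placeholders of my own; I'm assuming parameter values arrive as plain numbers per block):
> 
> // worker script: expensive setup runs once, shared by every node
> var sineTable = new Float32Array(4096);
> for (var i = 0; i < sineTable.length; i++)
>     sineTable[i] = Math.sin(2 * Math.PI * i / sineTable.length);
> 
> onaudionodecreated = function (e) {
>     if (e.data.type !== "BlackHoleGenerator") return;
>     var node = e.node;
>     var phase = 0; // per-node state lives in this closure
>     node.onaudioprocess = function (evt) {
>         var output = evt.outputBuffers[0];
>         var step = sineTable.length * evt.parameters.angle / evt.sampleRate;
>         for (var i = 0; i < output.length; i++) {
>             output[i] = sineTable[phase | 0];
>             phase = (phase + step) % sineTable.length;
>         }
>     };
> };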
> 
> (I also moved sampleRate to the AudioProcessEvent, as I think this will be more future-proof if we later figure out a way to allow different parts of the graph to run at different sample rates.)
> 
> Now, with this model, you could do the setup once and then just spawn node instances at a massively smaller startup cost.
> 
> In case UAs decide to implement parallelization, they can store the scriptURL of the AudioWorker and fork a new worker when necessary. This makes the parallelization observable but I don't see any new issues with that.
> 
> The nit-picky API "improvement" I made with createAudioWorkerNode is that it takes an options object, which contains optional values for numberOfInputChannels and numberOfOutputChannels (named parameters are easier to understand at a glance than bare numbers), as well as a `parameters` object that has a name -> initialValue mapping, and an arbitrary data object for sending additional initialization information to the worker, such as what kind of a node it is (one worker could host multiple types of nodes). This would also prevent manipulating the list of AudioParams after creation, just like native nodes don't add or remove parameters on themselves after creation. A code example to clarify the usage:
> 
> var customNode = context.createAudioWorkerNode(audioWorker, {
>     numberOfInputChannels: 1,
>     numberOfOutputChannels: 1,
>     parameters: {
>         angle: 1,
>         density: 5.2,
>     },
>     data: {
>         type: "BlackHoleGenerator",
>     },
> });
> 
> I think since the whole point of this worker thing is performance, we shouldn't ignore startup performance; otherwise, in a lot of cases it will probably be more efficient to have just one AudioWorker do all the processing and not take advantage of the graph at all, due to the high cost of creating new nodes. We probably all agree that leading developers to that conclusion would be counterproductive.
>  
> On Thu, Sep 11, 2014 at 5:55 PM, Jussi Kalliokoski <jussi.kalliokoski@gmail.com> wrote:
> On Thu, Sep 11, 2014 at 5:51 PM, Chris Wilson <cwilso@google.com> wrote:
> I don't know how it is possible to do this, unless all WA changes are batched up into a single postMessage.
> 
> I think that would be beneficial, yes. The same applies to native nodes - in most web platform features (in fact, I can't think of a single exception) the things you do in a single "job" get *observably* applied at the same time; e.g. with WebGL you don't get half the scene rendered in one frame and the rest in the next. This is the point argued in earlier discussions as well: the state of things shouldn't change on its own during a job.
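> 
> For instance, two assignments made within the same job (gain1 and gain2 being any two GainNodes here) should become audible in the same render quantum, never split across two:
> 
> gain1.gain.value = 0.0;
> gain2.gain.value = 1.0;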
> 
> As for the creation of the audio context, I think the easiest solution is that we specify that the context starts playback only after the job that created it has yielded, batching up all the creation-time instructions before starting playback.
>  
> 
> On Thu, Sep 11, 2014 at 4:41 PM, Norbert Schnell <Norbert.Schnell@ircam.fr> wrote:
> On 11 sept. 2014, at 15:41, Chris Wilson <cwilso@google.com> wrote:
> > I think this is actually indefinite in the spec today - and needs to be.  "start(0)" (in fact, any "start(n)" where n is < audiocontext.currentTime) is catch as catch can; thread context switch may happen, and that needs to be okay.  Do we guarantee that:
> >
> > node1.start(0);
> > ...some really time-expensive processing steps...
> > node2.start(0);
> > will have synchronized start times?
> 
> IMHO, it would be rather important that these two really go off at the same time:
> 
> var now = audioContext.currentTime;
> node1.start(now);
> ...some really time-expensive
> node2.start(now);
> 
> ... unless we can well define what "really time-expensive" means and provide the ability to avoid it.
> Is that actually the case? I was never sure about this...
> 
> Evidently it would be appealing if everything < audioContext.currentTime could just be clipped and behave accordingly. That would make things pretty clear, and make 0 synonymous with "now", which feels right.
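> 
> Concretely, that clipping rule would make these two calls guaranteed equivalents:
> 
> node.start(0);                        // anything below currentTime clips to "now"
> node.start(audioContext.currentTime); // explicit "now"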
> 
> Norbert
> 



.            .       .    .  . ...Joe

Joe Berkovitz
President

Noteflight LLC
Boston, Mass.
phone: +1 978 314 6271
www.noteflight.com
"Your music, everywhere"

Received on Thursday, 11 September 2014 19:36:34 UTC