W3C home > Mailing lists > Public > public-audio@w3.org > July to September 2014

Re: Audio Workers - please review

From: Jussi Kalliokoski <jussi.kalliokoski@gmail.com>
Date: Fri, 12 Sep 2014 00:44:12 +0300
Message-ID: <CAJhzemVky7avXRXYZMCY5BA5aCCv2LxtdTnpBAHzFPTbL_+oDQ@mail.gmail.com>
To: Joseph Berkovitz <joe@noteflight.com>
Cc: Chris Wilson <cwilso@google.com>, Norbert Schnell <Norbert.Schnell@ircam.fr>, Ehsan Akhgari <ehsan@mozilla.com>, "public-audio@w3.org" <public-audio@w3.org>
On Thu, Sep 11, 2014 at 10:35 PM, Joseph Berkovitz <joe@noteflight.com>

> Having had a little longer to think about your idea of multiple nodes per
> worker, I do see a couple of problems with it that I didn’t notice before.
> - It seems to allow all nodes that share the same worker to share data
> freely with each other as they all share the same global scope. This
> encourages direct inter-node communication which can lead to dangerous
> assumptions by devs, although in a spotty kind of way that is limited to
> nodes which share a worker.

True, except for the dangerous assumptions by devs. If you have examples of
such assumptions that weren't addressed by my reply to Ehsan, I'm intrigued
to hear.

I have a hard time coming up with any useful scenario where the inter-node
communication would be beneficial, other than sharing temporary buffers
which, given JS' single-threaded nature, is not dangerous, but actually yet
another performance benefit compared to the currently specced proposal.

> - The above data sharing would in turn force synchronous invocation of the
> nodes’ onaudioprocess callbacks to avoid race conditions. We do not want
> such a synchronous model since it will prevent parallelization of subgraphs
> in the future.  [I just saw that Ehsan posted with roughly the same
> observation.]

And thus, see my reply to Ehsan. :) Nothing in my proposal would prevent
such optimization.

> I still think your point about minimizing node startup costs is valid and
> needs to be addressed. These are just my concerns about this one particular
> way to go.
> …Joe
> On Sep 11, 2014, at 2:01 PM, Jussi Kalliokoski <
> jussi.kalliokoski@gmail.com> wrote:
> A small amendment to the previous proposal: no
> postMessage/onmessage/terminate on the AudioWorkerGlobalScope nor
> AudioWorker. Having those prevents numerous optimizations, such as
> parallelization of the graph (because if the UA can instantiate new workers
> of the same scriptURL at will, where would the posted message go?), not
> starting an AudioWorker that's not used anywhere, killing off an
> AudioWorker at will when it hasn't been needed for a while (similar to how
> ServiceWorkers are spun on and off when the UA deems them necessary),
> reusing the same instance between AudioContext instances that are running
> in the same thread and probably other things as well. Not to mention
> there's no use case for having the global versions of those methods.
> On Thu, Sep 11, 2014 at 8:50 PM, Jussi Kalliokoski <
> jussi.kalliokoski@gmail.com> wrote:
>> On Thu, Sep 11, 2014 at 7:11 PM, Chris Wilson <cwilso@google.com> wrote:
>>> Actually, I believe I completely misspoke.  I believe postMessages are
>>> only dispatched from a thread when the originating thread "has time" - e.g.
>>> "The window.postMessage method, when called, causes a MessageEvent to be
>>> dispatched at the target window when any pending script that must be
>>> executed completes (e.g., remaining event handlers if window.postMessage is
>>> called from an event handler, previously-set pending timeouts, etc.)" (from
>>> https://developer.mozilla.org/en-US/docs/Web/API/Window.postMessage).
>>>  So these would process in order, and be dispatched at the same time.
>> The important question is whether they fire before or after the
>> onaudioprocess. Currently that's undefined behavior and because of that
>> will likely be undeterministic.
>> Actually, thinking about the mention of importScripts on this thread made
>> me wonder about the usability of the currently specced model. Let's say
>> there's a JS audio library that contains a comprehensive set of DSP tools:
>> oscillators, FFT, window functions, filters, time stretching, resampling. A
>> library like this could easily weigh around the same as jQuery. Now, you
>> make different kinds of custom nodes using this library, and use them in
>> the similar fire-and-forget way as you generally do with the native nodes.
>> Every time you create a new instance of a node like this, you fetch this
>> library (cache or not), parse it and execute it. This will amount to a huge
>> amount of wasted resources as well as creation delays (I'm not sure how
>> importScripts could even work in the WorkerNode). The effect is amplified
>> further when these nodes are compiled from another language to asm.js,
>> which at the moment tends to have rather heavy a footprint. And on top of
>> that, you have to create a new VM context, which can be both memory and CPU
>> intensive.
>> This brings me back to my earlier suggestion of allowing one worker to
>> manage multiple nodes - this doesn't actually require very radical changes,
>> while it does steer us further away from being compliant with normal
>> Workers. Here's one proposal, that is a bit more radical but I think
>> provides the necessary features as well as some little nitpick
>> comprehensibility fixes on the API design.
>> interface AudioNodeHandle {
>>     attribute EventHandler onaudioprocess;
>>     attribute EventHandler onmessage;
>>     void postMessage (any message, optional sequence<Transferable>
>> transfer);
>>     void terminate();
>> }
>> interface AudioWorkerGlobalScope {
>>     attribute EventHandler onaudionodecreated;
>>     attribute EventHandler onmessage;
>> }
>> interface AudioProcessEvent : Event {
>>     readonly attribute double playbackTime;
>>     readonly attribute Float32Array[] inputBuffers;
>>     readonly attribute Float32Array[] outputBuffers;
>>     readonly attribute object parameters;
>>     readonly attribute float sampleRate;
>> }
>> interface AudioNodeCreatedEvent : Event {
>>     readonly AudioNodeHandle node;
>>     readonly object data;
>> }
>> partial interface AudioContext {
>>     AudioWorker createAudioWorker(DOMString scriptURL);
>>     AudioWorkerNode createAudioWorkerNode(AudioWorker audioWorker,
>> optional object options);
>> }
>> interface AudioWorker {
>>     attribute EventHandler onmessage;
>>     void postMessage (any message, optional sequence<Transferable>
>> transfer);
>> }
>> interface AudioWorkerNode {
>>     attribute EventHandler onmessage;
>>     readonly attribute object parameters; // a mapping of names to
>> AudioParam instances. Ideally frozen. Could be a Map-like as well with
>> readonly semantics.
>>     void postMessage (any message, optional sequence<Transferable>
>> transfer);
>> }
>> (I also moved the sampleRate to the AudioProcessEvent as I think this
>> will be more future-proof if we in the future figure out a way to allow
>> different parts of the graph be running at different sample rates).
>> Now with this model, you could do the setup once and then be able to just
>> spawn instances of nodes with a massively smaller startup cost.
>> In case UAs decide to implement parallelization, they can store the
>> scriptURL of the AudioWorker and fork a new worker when necessary. This
>> makes the parallelization observable but I don't see any new issues with
>> that.
>> The nit-picky API "improvement" I made with the createAudioWorkerNode was
>> that it takes an options object, which contains optional values for
>> numberOfInputChannels, numberOfOutputChannels (named parameters are easier
>> to understand at a glance than just numbers), as well as a `parameters`
>> object that has a name -> initialValue mapping, and an arbitrary data
>> object to send additional initialization information to the worker, such as
>> what kind of a Node it is (one worker could host multiple types of nodes).
>> This would also prevent manipulating the list of audioparameters after
>> creation, just like native nodes don't add or remove parameters on
>> themselves after creation. A code example to clarify the usage:
>> var customNode = context.createAudioWorkerNode(audioWorker, {
>>     numberOfInputChannels: 1,
>>     numberOfOutputChannels: 1,
>>     parameters: {
>>         angle: 1,
>>         density: 5.2,
>>     },
>>     data: {
>>         type: "BlackHoleGenerator",
>>     },
>> });
>> I think since the whole point of this worker thing is performance, we
>> shouldn't ignore startup performance, otherwise in a lot of cases it will
>> probably be more efficient to have just one audioworker do all the
>> processing and not take advantage of the graph at all, due to the high cost
>> of making new nodes. We probably all agree that leading developers to that
>> conclusion would be counterproductive.
>>> On Thu, Sep 11, 2014 at 5:55 PM, Jussi Kalliokoski <
>>> jussi.kalliokoski@gmail.com> wrote:
>>>> On Thu, Sep 11, 2014 at 5:51 PM, Chris Wilson <cwilso@google.com>
>>>> wrote:
>>>>> I don't know how it is possible to do this, unless all WA changes are
>>>>> batched up into a single postMessage.
>>>> I think that would be beneficial, yes. The same applies to native nodes
>>>> - in most web platform features (in fact I can't think of one exception)
>>>> the things you do in a single "job" get *observably* applied at the same
>>>> time, e.g. with WebGL you don't get half the scene rendered in one frame
>>>> and the rest in the next one. This is the point argued in earlier
>>>> discussions some time ago as well: the state of things shouldn't change on
>>>> its own during a job.
>>>> As for the creation of the audio context, I think the easiest solution
>>>> is that we specify that the context starts playback only after the job that
>>>> created it has yielded, batching up all the creation-time instructions
>>>> before starting playback.
>>>>> On Thu, Sep 11, 2014 at 4:41 PM, Norbert Schnell <
>>>>> Norbert.Schnell@ircam.fr> wrote:
>>>>>> On 11 sept. 2014, at 15:41, Chris Wilson <cwilso@google.com> wrote:
>>>>>> > I think this is actually indefinite in the spec today - and needs
>>>>>> to be.  "start(0)" (in fact, any "start(n)" where n is <
>>>>>> audiocontext.currentTime) is catch as catch can; thread context switch may
>>>>>> happen, and that needs to be okay.  Do we guarantee that:
>>>>>> >
>>>>>> > node1.start(0);
>>>>>> > ...some really time-expensive processing steps...
>>>>>> > node2.start(0);
>>>>>> > will have synchronized start times?
>>>>>> IMHO, it would be rather important that these two really go off at
>>>>>> the same time :
>>>>>> var now = audioContext.currentTime;
>>>>>> node1.start(now);
>>>>>> ...some really time-expensive
>>>>>> node2.start(now);
>>>>>> ... unless we can well define what "really time-expensive" means and
>>>>>> the ability to avoid it.
>>>>>> Is that actually case? I was never sure about this...
>>>>>> Evidently it could be sympathetic if everything <
>>>>>> audioContext.currentTime could just be clipped and behave accordingly. That
>>>>>> would make things pretty clear and 0 synonymous to "now", which feels right.
>>>>>> Norbert
> .            .       .    .  . ...Joe
> *Joe Berkovitz*
> President
> *Noteflight LLC*
> Boston, Mass.
> phone: +1 978 314 6271
> www.noteflight.com
> "Your music, everywhere"
Received on Thursday, 11 September 2014 21:44:40 UTC

This archive was generated by hypermail 2.3.1 : Tuesday, 6 January 2015 21:50:14 UTC