Re: Audio Workers - please review

From: Jussi Kalliokoski <jussi.kalliokoski@gmail.com>
Date: Thu, 11 Sep 2014 21:06:58 +0300
Message-ID: <CAJhzemVJDZ_3LkOxRWCgt=cA==9L7mE0E740tvbaU9Ri0cDyAA@mail.gmail.com>
To: Joseph Berkovitz <joe@noteflight.com>
Cc: Chris Wilson <cwilso@google.com>, Norbert Schnell <Norbert.Schnell@ircam.fr>, Ehsan Akhgari <ehsan@mozilla.com>, "public-audio@w3.org" <public-audio@w3.org>
On Thu, Sep 11, 2014 at 9:01 PM, Joseph Berkovitz <joe@noteflight.com> wrote:

> Jussi,
> I agree the issue of importScripts overhead could be pretty major, if this
> overhead is in fact likely to be substantial. I am not knowledgeable about
> the extent to which browsers optimize imports of the same scripts in
> multiple workers. However it is almost a certainty that nodes will want to
> exploit substantial libraries.
> I did not see a convenient way in your proposed API to make it easy for
> different nodes based on the same worker to partition their ongoing
> computational state (maintained in between onaudioprocess callbacks) from
> each other, though. Did I miss something? Doesn’t there need to be a
> persistent per-node object that can hold this state?

The state can, for example, be held in a closure created in the
onaudionodecreated event handler, or in a global WeakMap that uses the
AudioNodeHandles as keys. This ensures that the state associated with a
node is garbage collected when the node is.
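
A minimal sketch of both options, assuming the AudioNodeHandle and
onaudionodecreated shapes from the proposal quoted below. None of this is a
shipped API; simulateNode and renderBlock are hypothetical helpers that stand
in for the audio engine so the snippet runs in plain JavaScript.

```javascript
// Option 1: a global WeakMap keyed by the (proposed) AudioNodeHandle.
// An entry becomes collectable once its handle is no longer reachable.
const stateByNode = new WeakMap();

function onNodeCreatedWeakMap(event) {
    const node = event.node;
    stateByNode.set(node, { samplesSeen: 0 });
    node.onaudioprocess = function (e) {
        const state = stateByNode.get(node);
        const out = e.outputBuffers[0];
        for (let i = 0; i < out.length; i++) {
            // Write a running sample counter so the persisted state is visible.
            out[i] = state.samplesSeen++;
        }
    };
}

// Option 2: a closure created in the node-created handler.
// The variable lives exactly as long as the handler that captured it.
function onNodeCreatedClosure(event) {
    let samplesSeen = 0;
    event.node.onaudioprocess = function (e) {
        const out = e.outputBuffers[0];
        for (let i = 0; i < out.length; i++) {
            out[i] = samplesSeen++;
        }
    };
}

// Hypothetical stand-in for the engine creating a node in the worker.
function simulateNode(onNodeCreated) {
    const node = { onaudioprocess: null };
    onNodeCreated({ node: node, data: {} });
    return node;
}

// Hypothetical stand-in for the engine requesting one block of output.
function renderBlock(node, length) {
    const e = { outputBuffers: [new Float32Array(length)] };
    node.onaudioprocess(e);
    return Array.from(e.outputBuffers[0]);
}
```

Two nodes created from the same handler keep separate counters across
successive renderBlock calls, which is the per-node partitioning asked about
above.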

> …Joe
> On Sep 11, 2014, at 1:50 PM, Jussi Kalliokoski <
> jussi.kalliokoski@gmail.com> wrote:
> On Thu, Sep 11, 2014 at 7:11 PM, Chris Wilson <cwilso@google.com> wrote:
>> Actually, I believe I completely misspoke.  I believe postMessages are
>> only dispatched from a thread when the originating thread "has time" - e.g.
>> "The window.postMessage method, when called, causes a MessageEvent to be
>> dispatched at the target window when any pending script that must be
>> executed completes (e.g., remaining event handlers if window.postMessage is
>> called from an event handler, previously-set pending timeouts, etc.)" (from
>> https://developer.mozilla.org/en-US/docs/Web/API/Window.postMessage).
>>  So these would process in order, and be dispatched at the same time.
> The important question is whether they fire before or after the
> onaudioprocess. Currently that's undefined behavior and because of that
> will likely be nondeterministic.
> Actually, thinking about the mention of importScripts on this thread made
> me wonder about the usability of the currently specced model. Let's say
> there's a JS audio library that contains a comprehensive set of DSP tools:
> oscillators, FFT, window functions, filters, time stretching, resampling. A
> library like this could easily weigh around the same as jQuery. Now, you
> make different kinds of custom nodes using this library, and use them in
> the same fire-and-forget way as you generally do with the native nodes.
> Every time you create a new instance of a node like this, you fetch this
> library (cache or not), parse it and execute it. This will amount to a huge
> amount of wasted resources as well as creation delays (I'm not sure how
> importScripts could even work in the WorkerNode). The effect is amplified
> further when these nodes are compiled from another language to asm.js,
> which at the moment tends to have rather heavy a footprint. And on top of
> that, you have to create a new VM context, which can be both memory and CPU
> intensive.
> This brings me back to my earlier suggestion of allowing one worker to
> manage multiple nodes - this doesn't actually require very radical changes,
> while it does steer us further away from being compliant with normal
> Workers. Here's one proposal, that is a bit more radical but I think
> provides the necessary features as well as some little nitpick
> comprehensibility fixes on the API design.
> interface AudioNodeHandle {
>     attribute EventHandler onaudioprocess;
>     attribute EventHandler onmessage;
>     void postMessage(any message, optional sequence<Transferable> transfer);
>     void terminate();
> };
>
> interface AudioWorkerGlobalScope {
>     attribute EventHandler onaudionodecreated;
>     attribute EventHandler onmessage;
> };
>
> interface AudioProcessEvent : Event {
>     readonly attribute double playbackTime;
>     readonly attribute Float32Array[] inputBuffers;
>     readonly attribute Float32Array[] outputBuffers;
>     readonly attribute object parameters;
>     readonly attribute float sampleRate;
> };
>
> interface AudioNodeCreatedEvent : Event {
>     readonly attribute AudioNodeHandle node;
>     readonly attribute object data;
> };
>
> partial interface AudioContext {
>     AudioWorker createAudioWorker(DOMString scriptURL);
>     AudioWorkerNode createAudioWorkerNode(AudioWorker audioWorker,
>                                           optional object options);
> };
>
> interface AudioWorker {
>     attribute EventHandler onmessage;
>     void postMessage(any message, optional sequence<Transferable> transfer);
> };
>
> interface AudioWorkerNode {
>     attribute EventHandler onmessage;
>     // A mapping of names to AudioParam instances. Ideally frozen.
>     // Could be a Map-like as well with readonly semantics.
>     readonly attribute object parameters;
>     void postMessage(any message, optional sequence<Transferable> transfer);
> };
> (I also moved the sampleRate to the AudioProcessEvent as I think this will
> be more future-proof if we in the future figure out a way to allow
> different parts of the graph to run at different sample rates).
> Now with this model, you could do the setup once and then be able to just
> spawn instances of nodes with a massively smaller startup cost.
> In case UAs decide to implement parallelization, they can store the
> scriptURL of the AudioWorker and fork a new worker when necessary. This
> makes the parallelization observable but I don't see any new issues with
> that.
> The nit-picky API "improvement" I made with the createAudioWorkerNode was
> that it takes an options object, which contains optional values for
> numberOfInputChannels, numberOfOutputChannels (named parameters are easier
> to understand at a glance than just numbers), as well as a `parameters`
> object that has a name -> initialValue mapping, and an arbitrary data
> object to send additional initialization information to the worker, such as
> what kind of a Node it is (one worker could host multiple types of nodes).
> This would also prevent manipulating the list of audioparameters after
> creation, just like native nodes don't add or remove parameters on
> themselves after creation. A code example to clarify the usage:
> var customNode = context.createAudioWorkerNode(audioWorker, {
>     numberOfInputChannels: 1,
>     numberOfOutputChannels: 1,
>     parameters: {
>         angle: 1,
>         density: 5.2,
>     },
>     data: {
>         type: "BlackHoleGenerator",
>     },
> });
> I think since the whole point of this worker thing is performance, we
> shouldn't ignore startup performance, otherwise in a lot of cases it will
> probably be more efficient to have just one audioworker do all the
> processing and not take advantage of the graph at all, due to the high cost
> of making new nodes. We probably all agree that leading developers to that
> conclusion would be counterproductive.
>> On Thu, Sep 11, 2014 at 5:55 PM, Jussi Kalliokoski <
>> jussi.kalliokoski@gmail.com> wrote:
>>> On Thu, Sep 11, 2014 at 5:51 PM, Chris Wilson <cwilso@google.com> wrote:
>>>> I don't know how it is possible to do this, unless all WA changes are
>>>> batched up into a single postMessage.
>>> I think that would be beneficial, yes. The same applies to native nodes
>>> - in most web platform features (in fact I can't think of one exception)
>>> the things you do in a single "job" get *observably* applied at the same
>>> time, e.g. with WebGL you don't get half the scene rendered in one frame
>>> and the rest in the next one. This is the point argued in earlier
>>> discussions some time ago as well: the state of things shouldn't change on
>>> its own during a job.
>>> As for the creation of the audio context, I think the easiest solution
>>> is that we specify that the context starts playback only after the job that
>>> created it has yielded, batching up all the creation-time instructions
>>> before starting playback.
>>>> On Thu, Sep 11, 2014 at 4:41 PM, Norbert Schnell <
>>>> Norbert.Schnell@ircam.fr> wrote:
>>>>> On 11 sept. 2014, at 15:41, Chris Wilson <cwilso@google.com> wrote:
>>>>> > I think this is actually indefinite in the spec today - and needs to
>>>>> be.  "start(0)" (in fact, any "start(n)" where n is <
>>>>> audiocontext.currentTime) is catch as catch can; thread context switch may
>>>>> happen, and that needs to be okay.  Do we guarantee that:
>>>>> >
>>>>> > node1.start(0);
>>>>> > ...some really time-expensive processing steps...
>>>>> > node2.start(0);
>>>>> > will have synchronized start times?
>>>>> IMHO, it would be rather important that these two really go off at the
>>>>> same time :
>>>>> var now = audioContext.currentTime;
>>>>> node1.start(now);
>>>>> ...some really time-expensive
>>>>> node2.start(now);
>>>>> ... unless we can well define what "really time-expensive" means and
>>>>> the ability to avoid it.
>>>>> Is that actually the case? I was never sure about this...
>>>>> Evidently it could be sympathetic if everything <
>>>>> audioContext.currentTime could just be clipped and behave accordingly. That
>>>>> would make things pretty clear and 0 synonymous to "now", which feels right.
>>>>> Norbert
Received on Thursday, 11 September 2014 18:10:16 UTC