Re: Audio Workers - please review

On Thu, Sep 11, 2014 at 7:11 PM, Chris Wilson <cwilso@google.com> wrote:

> Actually, I believe I completely misspoke.  I believe postMessages are
> only dispatched from a thread when the originating thread "has time" - e.g.
> "The window.postMessage method, when called, causes a MessageEvent to be
> dispatched at the target window when any pending script that must be
> executed completes (e.g., remaining event handlers if window.postMessage is
> called from an event handler, previously-set pending timeouts, etc.)" (from
> https://developer.mozilla.org/en-US/docs/Web/API/Window.postMessage).  So
> these would process in order, and be dispatched at the same time.
>

The important question is whether they fire before or after the
onaudioprocess. Currently that's undefined, and because of that the
behavior will likely be nondeterministic.
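
To make the ambiguity concrete, here's a sketch of what a worker might do
(hedged: the per-channel Float32Array buffers and the gain parameter are
made up for illustration):

// Inside the audio worker's global scope:
var gain = 1;

onmessage = function (e) {
    // A parameter update posted from the main thread.
    gain = e.data.gain;
};

onaudioprocess = function (e) {
    // Nothing currently defines whether a message posted before this
    // quantum arrives before or after this handler runs, so a given
    // quantum may see either the old or the new gain.
    var input = e.inputBuffers[0];
    var output = e.outputBuffers[0];
    for (var i = 0; i < input.length; i++) {
        output[i] = input[i] * gain;
    }
};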

Actually, thinking about the mention of importScripts on this thread made
me wonder about the usability of the currently specced model. Let's say
there's a JS audio library that contains a comprehensive set of DSP tools:
oscillators, FFT, window functions, filters, time stretching, resampling. A
library like this could easily weigh about as much as jQuery. Now, you
build different kinds of custom nodes on top of this library and use them
in the same fire-and-forget way as you generally do with the native nodes.
Every time you create a new instance of such a node, you fetch the library
(cached or not), parse it and execute it. That adds up to a lot of wasted
resources as well as creation delays (I'm not sure how importScripts could
even work in the worker node). The effect is amplified further when these
nodes are compiled from another language to asm.js, which at the moment
tends to have a rather heavy footprint. And on top of that, you have to
create a new VM context for each node, which can be both memory and CPU
intensive.
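
If I read the currently specced model right, the cost lands on every
instantiation, roughly like this (a sketch; the createAudioWorker
signature is approximated from the current proposal and the script name
is made up):

// Each call spins up a fresh worker for a single node: the jQuery-sized
// DSP library is fetched (cached or not), parsed and executed again, and
// a new VM context is created - once per node instance.
var node1 = context.createAudioWorker("dsp-node.js", 1, 1);
var node2 = context.createAudioWorker("dsp-node.js", 1, 1);
var node3 = context.createAudioWorker("dsp-node.js", 1, 1);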

This brings me back to my earlier suggestion of allowing one worker to
manage multiple nodes - it doesn't actually require very radical changes,
although it does steer us further away from compliance with normal
Workers. Here's one proposal that is a bit more radical, but I think it
provides the necessary features as well as a few nit-picky
comprehensibility fixes to the API design.

interface AudioNodeHandle {
    attribute EventHandler onaudioprocess;
    attribute EventHandler onmessage;
    void postMessage (any message, optional sequence<Transferable> transfer);
    void terminate();
}

interface AudioWorkerGlobalScope {
    attribute EventHandler onaudionodecreated;
    attribute EventHandler onmessage;
}

interface AudioProcessEvent : Event {
    readonly attribute double playbackTime;
    readonly attribute Float32Array[] inputBuffers;
    readonly attribute Float32Array[] outputBuffers;
    readonly attribute object parameters;
    readonly attribute float sampleRate;
}

interface AudioNodeCreatedEvent : Event {
    readonly attribute AudioNodeHandle node;
    readonly attribute object data;
}

partial interface AudioContext {
    AudioWorker createAudioWorker(DOMString scriptURL);
    AudioWorkerNode createAudioWorkerNode(AudioWorker audioWorker, optional object options);
}

interface AudioWorker {
    attribute EventHandler onmessage;
    void postMessage (any message, optional sequence<Transferable> transfer);
}

interface AudioWorkerNode {
    attribute EventHandler onmessage;
    // A mapping of names to AudioParam instances. Ideally frozen. Could be
    // a Map-like as well with readonly semantics.
    readonly attribute object parameters;
    void postMessage (any message, optional sequence<Transferable> transfer);
}

(I also moved the sampleRate to the AudioProcessEvent, as I think this
will be more future-proof if we later figure out a way to allow different
parts of the graph to run at different sample rates.)

Now, with this model, you could do the setup once and then spawn
instances of nodes with a massively smaller startup cost.
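
A minimal sketch, assuming the interfaces above (the script URL is made
up):

// The worker - and any library it imports - is fetched, parsed and
// executed exactly once, here.
var audioWorker = context.createAudioWorker("dsp-worker.js");

// After that, each node is just a lightweight handle into the same
// worker; no new fetch, parse, execution or VM context is needed.
var node1 = context.createAudioWorkerNode(audioWorker);
var node2 = context.createAudioWorkerNode(audioWorker);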

If UAs decide to implement parallelization, they can store the scriptURL
of the AudioWorker and fork a new worker when necessary. This makes the
parallelization observable, but I don't see any new issues with that.

The nit-picky API "improvement" I made with createAudioWorkerNode is that
it takes an options object, containing optional values for
numberOfInputChannels and numberOfOutputChannels (named parameters are
easier to understand at a glance than bare numbers), a `parameters` object
with a name -> initialValue mapping, and an arbitrary `data` object for
sending additional initialization information to the worker, such as what
kind of node it is (one worker could host multiple types of nodes). This
would also prevent manipulating the list of AudioParams after creation,
just as native nodes don't add or remove parameters on themselves after
creation. A code example to clarify the usage:

var customNode = context.createAudioWorkerNode(audioWorker, {
    numberOfInputChannels: 1,
    numberOfOutputChannels: 1,
    parameters: {
        angle: 1,
        density: 5.2,
    },
    data: {
        type: "BlackHoleGenerator",
    },
});
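
The worker-side counterpart might then look something like this (a sketch
against the proposed interfaces; the DSP itself and the parameter layout
are made-up stand-ins):

// dsp-worker.js - executed once, no matter how many nodes it hosts. A
// heavy DSP library could be loaded here with a single importScripts().

// Made-up processing callbacks, keyed by the node type passed in `data`.
var processors = {
    BlackHoleGenerator: function (e) {
        // Assuming each parameter arrives as a Float32Array with one
        // value per sample frame; the IDL above leaves the shape open.
        var output = e.outputBuffers[0];
        var angle = e.parameters.angle;
        var density = e.parameters.density;
        for (var i = 0; i < output.length; i++) {
            output[i] = density[i] * Math.sin(angle[i] * i / e.sampleRate);
        }
    },
};

onaudionodecreated = function (e) {
    // e.data is the `data` object given to createAudioWorkerNode, so one
    // worker can host several kinds of nodes.
    e.node.onaudioprocess = processors[e.data.type];
};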

Since the whole point of this worker thing is performance, I think we
shouldn't ignore startup performance. Otherwise, in a lot of cases it will
probably be more efficient to have just one audio worker do all the
processing and not take advantage of the graph at all, due to the high
cost of making new nodes. We probably all agree that leading developers to
that conclusion would be counterproductive.


> On Thu, Sep 11, 2014 at 5:55 PM, Jussi Kalliokoski <
> jussi.kalliokoski@gmail.com> wrote:
>
>> On Thu, Sep 11, 2014 at 5:51 PM, Chris Wilson <cwilso@google.com> wrote:
>>
>>> I don't know how it is possible to do this, unless all WA changes are
>>> batched up into a single postMessage.
>>>
>>
>> I think that would be beneficial, yes. The same applies to native nodes -
>> in most web platform features (in fact I can't think of one exception) the
>> things you do in a single "job" get *observably* applied at the same time,
>> e.g. with WebGL you don't get half the scene rendered in one frame and the
>> rest in the next one. This is the point argued in earlier discussions some
>> time ago as well: the state of things shouldn't change on its own during a
>> job.
>>
>> As for the creation of the audio context, I think the easiest solution is
>> that we specify that the context starts playback only after the job that
>> created it has yielded, batching up all the creation-time instructions
>> before starting playback.
>>
>>
>>>
>>> On Thu, Sep 11, 2014 at 4:41 PM, Norbert Schnell <
>>> Norbert.Schnell@ircam.fr> wrote:
>>>
>>>> On 11 sept. 2014, at 15:41, Chris Wilson <cwilso@google.com> wrote:
>>>> > I think this is actually indefinite in the spec today - and needs to
>>>> be.  "start(0)" (in fact, any "start(n)" where n is <
>>>> audiocontext.currentTime) is catch as catch can; thread context switch may
>>>> happen, and that needs to be okay.  Do we guarantee that:
>>>> >
>>>> > node1.start(0);
>>>> > ...some really time-expensive processing steps...
>>>> > node2.start(0);
>>>> > will have synchronized start times?
>>>>
>>>> IMHO, it would be rather important that these two really go off at the
>>>> same time :
>>>>
>>>> var now = audioContext.currentTime;
>>>> node1.start(now);
>>>> ...some really time-expensive processing steps...
>>>> node2.start(now);
>>>>
>>>> ... unless we can well define what "really time-expensive" means and
>>>> the ability to avoid it.
>>>> Is that actually the case? I was never sure about this...
>>>>
>>>> Evidently it would be nice if everything < audioContext.currentTime
>>>> could just be clipped and behave accordingly. That would make things
>>>> pretty clear and make 0 synonymous with "now", which feels right.
>>>>
>>>> Norbert
>>>
>>>
>>>
>>
>

Received on Thursday, 11 September 2014 17:50:50 UTC