Re: Audio Workers - please review from Jussi Kalliokoski on 2014-09-11 (public-audio@w3.org from July to September 2014)

From: Jussi Kalliokoski <jussi.kalliokoski@gmail.com>
Date: Thu, 11 Sep 2014 15:55:53 +0300
To: Chris Wilson <cwilso@google.com>
Cc: Ehsan Akhgari <ehsan@mozilla.com>, "public-audio@w3.org" <public-audio@w3.org>
Message-ID: <CAJhzemULsZ69ZqbEp3m4Qw42hi6JvO938T_Vvb8yqepniZ_1zQ@mail.gmail.com>

Good work on the spec, Chris!

One thing I have been wondering about is whether we should specify
additional ordering rules for postMessage. Currently, the spec just refers
to the postMessage algorithm described in the Worker spec, but I'm not sure
that's completely workable.

To approach this from the perspective of self-hosting the built-in nodes
(mostly because you'd expect custom nodes to work in a similar way), let's
say:

* We have a self-hosted Oscillator node.
* The user code calls the start(0) method on it.
* This triggers a postMessage() to the worker, telling it to start
playback, at audio context's clock time 0.

What happens next is unclear. I think most people would expect playback to
start at time zero, as it does with the native implementation of the
oscillator. With the order of things currently unspecified, though, what
would most likely happen is that getting the first buffer out takes
priority over the message (because postMessage is async and onaudioprocess
is sync), so the oscillator starts one buffer late. The first buffer is the
most vulnerable case, because after that, scheduling for "now" can fail to
catch the next playing buffer for the native nodes as well. I haven't come
up with any clear solutions for this, but whatever solution we take, we
should make sure that the behavior is consistent with the native nodes.
While we're at it, we should probably make sure that if you call start(now)
on two or more different nodes in the same "job" (i.e. before yielding back
to the event loop) of the main thread, they all start playback at the same
time, both for the native nodes *and* the audio workers.
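
To make the ordering problem concrete, here is a toy model in plain
JavaScript. It is not the AudioWorker API and does not reflect how any
implementation schedules things: BLOCK_SIZE, simulate() and the message
shape are all made up for illustration. The only thing it shows is how the
first audible frame depends on whether pending messages are delivered
before or after the first block is rendered.

```javascript
// Toy model only. Every name here is hypothetical, not part of any spec.
const BLOCK_SIZE = 128; // frames per processing block (assumed value)

// Returns the first frame at which the self-hosted oscillator is audible.
function simulate(drainBeforeBlock, numBlocks = 3) {
  // start(0) on the main thread has already posted this message.
  const queue = [{ type: "start", frame: 0 }];
  let requestedStart = null;    // what the worker has been told so far
  let firstAudibleFrame = null;

  const deliver = () => {
    while (queue.length > 0) {
      const msg = queue.shift();
      if (msg.type === "start") requestedStart = msg.frame;
    }
  };

  for (let block = 0; block < numBlocks; block++) {
    const blockStart = block * BLOCK_SIZE;
    if (drainBeforeBlock) deliver();

    // Stand-in for onaudioprocess: the node can only produce sound once
    // it has seen the start message.
    if (requestedStart !== null && firstAudibleFrame === null) {
      firstAudibleFrame = Math.max(requestedStart, blockStart);
    }

    if (!drainBeforeBlock) deliver();
  }
  return firstAudibleFrame;
}

console.log(simulate(false)); // 128: message seen after block 0, one block late
console.log(simulate(true));  // 0: message seen before block 0
```

In this model, consistency with the native nodes corresponds to the
drainBeforeBlock = true case; whether the spec should require something
like that is exactly the open question.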

Speaking of consistency with native nodes, I think that while terminate()
is specified on the worker instance quite as I'd expect, the worker needs
to be able to terminate itself as well. This is to make sure that the
custom nodes can implement fire-and-forget behavior similar to the native
nodes, e.g. "I've nothing else to play, throw me away".
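
As a rough sketch of what I mean by fire-and-forget, again as a toy model
rather than the proposed API: ToyGraph, makeOneShot() and the scope.close()
call are hypothetical names. The node's processing callback decides for
itself that it is finished, and the graph stops calling it afterwards.

```javascript
// Toy model only. ToyGraph, makeOneShot and scope.close() are hypothetical.
class ToyGraph {
  constructor() {
    this.nodes = new Set();
  }

  add(process) {
    const node = { process, closed: false };
    // The "worker scope" handed to the callback can terminate its own node.
    node.scope = { close: () => { node.closed = true; } };
    this.nodes.add(node);
    return node;
  }

  renderBlock() {
    // Copy the set so nodes can be removed while iterating.
    for (const node of [...this.nodes]) {
      node.process(node.scope);
      if (node.closed) this.nodes.delete(node); // no more callbacks
    }
  }
}

// A one-shot source that plays for a fixed number of blocks, then closes.
function makeOneShot(totalBlocks) {
  let remaining = totalBlocks;
  let calls = 0;
  const process = (scope) => {
    calls++;
    remaining--;
    if (remaining === 0) scope.close(); // "I've nothing else to play"
  };
  return { process, callCount: () => calls };
}

const graph = new ToyGraph();
const shot = makeOneShot(2);
graph.add(shot.process);
for (let i = 0; i < 5; i++) graph.renderBlock();

console.log(shot.callCount()); // 2
console.log(graph.nodes.size); // 0
```

The main thread never has to call terminate() here; the node drops out of
the graph on its own once it has nothing left to play.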

- Jussi

On Thu, Sep 11, 2014 at 1:03 PM, Chris Wilson <cwilso@google.com> wrote:

> On Wed, Sep 10, 2014 at 6:29 PM, Ehsan Akhgari <ehsan@mozilla.com> wrote:
>
>> I took a look at the merged version of this proposal and have some
>> feedback; I would appreciate it if you could let me know what you think.
>> As you know, I haven't been closely involved with the spec for a while, so
>> if the below reflects my ignorance on some of the previously discussed
>> matters, please accept my apologies in advance.
>>
>
> Not at all.  This is a fairly radical proposal, so I welcome discussion.
>
>> a) It is not really clear what should happen after you call terminate()
>> on AudioWorkerNode, since that seems to be orthogonal to whether the node
>> participates in the graph.
>>
>
> Noted.  I added a bit of text to further describe this.  In short: no, it
> no longer participates in the graph.
>
>
>> b) The interaction between worker closing/termination and the audio
>> processing is not clear at all.  Specifically, dedicated workers typically
>> have a lifetime that is as long as the last pending task (assuming that
>> the main thread doesn't hold a reference to them), however, for these
>> workers, we need to keep firing audioprocess events.
>>
>
> I'm not sure why that is true.  The spec explicitly says when terminate()
> is called, that will cease firing of onaudioprocess.
>
>
>> c) How would different AudioWorkerNodes allocated through different
>> AudioContext objects interact with each other?  (Here is an example of
>> something that is not clear, should it be allowed to send an AudioParam
>> belonging to a node from context 1 to a worker node from context 2?  The
>> spec is currently silent about this.)
>>
>
> You are correct, it is.  In general, the spec is silent on how AudioNodes
> created from different AudioContext objects interact with each other.  I've
> known this was an issue, but had not ensured that we had an issue filed.
>  Now we do: https://github.com/WebAudio/web-audio-api/issues/351.  I
> don't think this is specific to Audio Workers.
>
>
>> d) What are the exact semantics of the audioprocess events dispatched to
>> these workers?  Do they block running all audio processing on the audio
>> processing thread?
>>
>
> Yes.
>
>
>> Note that because of the halting problem, we cannot even verify that the
>> worker script will ever terminate, let alone in a reasonable amount of
>> time, so in practice the UAs need to preempt the execution of the script.
>>
>
> Probably, as Alex mentioned - but I would expect the audio system to
> glitch in the meantime.  The point is to enable synchronous processing, and
> get rid of the inherent latency in script processing.
>
>
>> It would be nice if this was somehow mentioned in the spec, at least in
>> terms of what the UA needs to do if it decides to abort the execution of
>> one of these scripts.
>>
>
> I'm not sure what to translate this to, in practical terms.
>
>
>> e) What is the order in which these worker nodes receive the audioprocess
>> events (assuming obviously that one such node is not an indirect
>> input/output of the other)?  Note that with the currently specified API, I
>> think it is possible for one of these workers to send a MessagePort through
>> the main thread to another one, and therefore be able to communicate with
>> the other AudioWorkerNode workers through MessagePort.postMessage, so the
>> order of execution is observable to script.  (I admit that I don't have a
>> good solution for this -- but I also don't know what use cases for
>> transferring information back and forth between these nodes and the main
>> thread we're trying to address here.)
>>
>
> I don't think this is true; MessagePort is asynchronous, and the firing of
> onaudioprocess events would be synchronous (more to the point, the system
> would, I expect, resolve the entire audio graph for a block before
> processing messages).  I don't expect Audio Workers to communicate with
> each other, and I don't believe they can observe each other's behavior.
>
>
>> 2. I feel very strongly against exposing this node type as a web worker.
>> I think that has a number of undesired side effects.  Here are some
>> examples:
>>
>> a) Even though the HTML web worker termination algorithm says nothing
>> about killing an underlying worker thread, I think that is what authors
>> would typically expect to happen, but obviously doing that would not be an
>> option here.  It is also not specified what needs to be output from the
>> node after terminate() has been called on it (the input unmodified?
>> silence? something else?)  Also, given the fact that you can already
>> disconnect the node from the graph, why do we need the terminate() method
>> in the first place?
>>
>
> I'm not sure why that's not an option, and the terminate() description
> DOES say the worker thread should be terminated.  (not the underlying
> system thread, but that's not intended.)  As Alex said, terminate() is
> there for uniformity with Workers, and for clarity in enforcing "no more
> onaudioprocess events, please."
>
>> b) The API gives you the illusion that multiple AudioWorkerNode's will run
>> on different DedicatedWorkerScopes.  That, in Web Workers world, would mean
>> different threads, but that will not be the case here.  I think that
>> discrepancy is problematic.
>>
>
> Why?
>
>
>> c) Using DedicatedWorkerGlobalScope is a bit weird in terms of APIs that
>> are exposed to workers.  For example, should these workers support
>> onlanguagechange?  What about IndexedDB on workers?  What about nested Web
>> Workers?
>>
>
> Yes, I would think they would support these.  Particularly nested Web
> Workers, e.g.
>
>
>> d) At least on Gecko, Web Workers have specific implementation concerns
>> in terms of their message queue, event processing model and so on.  It
>> might be a lot of effort for us to properly implement the observable
>> behavior of these workers running inside our audio processing thread code
>> (which has a completely different event processing model, etc.)
>>
>
> That's the point of having this discussion, yes.
>
>
>> e) The topic of whether or not synchronous APIs must be allowed on
>> workers is being debated on public-script-coord, and it seems like there is
>> no consensus on that yet.  But I find the possibility of running
>> synchronous XHR on the audio processing thread unacceptable for example,
>> given its realtime requirements.
>>
>
> Of course.  That (using synchronous XHR) would be a dumb thing to do. So
> would "while (true) ;".
>
>> I think tying this node to Web Workers opens a huge can of worms.  It will
>> also keep biting us in the future as more and more APIs are exposed to
>> workers.  I am wondering if we can get away with a completely different API
>> that only tries to facilitate the things that authors would typically need
>> to do during audio processing without attempting to make that a Web Worker?
>>
>
> Only by using some other language to represent an "audio shader", and
> having that language runtime run in the audio thread.  Anything else would
> add inherent latency, which would destroy the ability of this to
> self-describe Web Audio.
>
>
>> I think at the very least, if we really want to keep tying these concepts
>> to Web Workers, it might be worthwhile to bring that up on
>> public-script-coord, since it is at least bending the original use cases
>> that Web Workers were designed for.  :-)
>>
>
> Sure, I'm happy to ask public-script-coord for a review.
>
>
>> 3. What is the purpose of addParameter and removeParameter?  It seems to
>> me that if we defined a structured clone algorithm for AudioParam (which is
>> another detail missing from the spec that needs to be clarified anyway) and
>> make it transferable, then the author would be able to postMessage() an
>> AudioParam just as easily.  Is there a good reason to have a specialized
>> method for what is effectively posting an AudioParam to the worker?  (Note
>> that we'd probably need to add methods to AudioParam for extracting a-rate
>> and k-rate values at any given time in that case, so that the worker script
>> can get the right values when it needs them.)
>>
>
> No.  To have AudioParams be connectable (i.e. connect an Oscillator into
> an AudioParam) with zero latency, they must be evaluated in the audio
> thread, only in the current block, only when they are needed.  You can't
> extract ahead of time, or ask for them later; their value buffers need to
> be handed to the onaudioprocess when it fires.
>
>
>> 4. More on the idea of transferring an AudioParam to a worker, that will
>> probably involve some kind of neutering the object.  It might make sense to
>> introduce a clone() method on AudioParam as a way for authors to be able to
>> keep a copy of the object around on the main thread.  That could of course
>> be a future enhancement idea.
>>
>
> In essence, the design of AudioParams here is much like AudioNodes
> themselves - one part sits in the main thread, one part sits in the audio
> thread.  The latter does the heavy lifting.
>
>
>> 5. Still more on the idea of transferring an AudioParam, another thing
>> that we need to worry about is what happens if you try to transfer an
>> AudioParam that is currently being used somehow (either through being a
>> property on another AudioNode, or being used as an input to one.)
>>
>
> How would you make such a transfer?
>
> -Chris
>
Received on Thursday, 11 September 2014 12:56:24 UTC