W3C home > Mailing lists > Public > public-audio@w3.org > July to September 2014

Re: Audio Workers - please review

From: Ehsan Akhgari <ehsan@mozilla.com>
Date: Wed, 10 Sep 2014 12:29:47 -0400
Message-ID: <CANTur_4PhxUbHecZzo=S7O6zM-TMLZbU0GvHXevn2o=VzQrFfw@mail.gmail.com>
To: Chris Wilson <cwilso@google.com>
Cc: "public-audio@w3.org" <public-audio@w3.org>

Hi Chris, and everyone else on the WG!

I took a look at the merged version of this proposal and have some
feedback; I would appreciate it if you could let me know what you think.
As you know, I haven't been closely involved with the spec for a while, so
if the below reflects my ignorance of some of the previously discussed
matters, please accept my apologies in advance.  And of course, sorry that
this is a *long* list.  :-)

1.  I think the processing model could use some further clarification.  The
current text of the spec is vague on a number of different issues, so
I would appreciate it if you could clarify what you intended to happen
in the following cases:

a) It is not really clear what should happen after you call terminate() on
an AudioWorkerNode, since that seems to be orthogonal to whether the node
participates in the graph.

b) The interaction between worker closing/termination and the audio
processing is not clear at all.  Specifically, dedicated workers typically
have a lifetime that is as long as their last pending task (assuming that
the main thread doesn't hold a reference to them); for these workers,
however, we need to keep firing audioprocess events.

c) How would different AudioWorkerNodes allocated through different
AudioContext objects interact with each other?  (Here is an example of
something that is not clear: should it be allowed to send an AudioParam
belonging to a node from context 1 to a worker node from context 2?  The
spec is currently silent about this.)

d) What are the exact semantics of the audioprocess events dispatched to
these workers?  Do they block all other audio processing on the audio
processing thread?  Note that because of the halting problem, we cannot
even verify that the worker script will ever terminate, let alone in a
reasonable amount of time, so in practice the UAs need to preempt the
execution of the script.  It would be nice if this were somehow mentioned
in the spec, at least in terms of what the UA needs to do if it decides to
abort the execution of one of these scripts.
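To make the question concrete, here is a toy sketch (my invention, not
anything from the spec) of one behavior a UA could adopt when it aborts a
handler: substitute silence for the render quantum whose script was cut
short. The names renderQuantum and abortedHandler are hypothetical.

```javascript
// Hypothetical sketch of a UA render loop that falls back to silence
// when it aborts (here: modeled as a thrown exception) a worker script.
const QUANTUM = 128; // frames per render quantum

function renderQuantum(handler, input) {
  const output = new Float32Array(QUANTUM);
  try {
    handler(input, output);  // the worker's audioprocess script
  } catch (e) {
    output.fill(0);          // aborted script -> silence for this block
  }
  return output;
}

// A handler the UA decided to abort, modeled as a thrown exception.
const abortedHandler = () => { throw new Error("exceeded time budget"); };
const silence = renderQuantum(abortedHandler, new Float32Array(QUANTUM));
console.log(silence.every((s) => s === 0)); // true

// A well-behaved handler copies its input through unchanged.
const passThrough = (inp, out) => out.set(inp);
const input = Float32Array.from({ length: QUANTUM }, (_, i) => i / QUANTUM);
const output = renderQuantum(passThrough, input);
console.log(output[64] === input[64]); // true
```

Whether the right fallback is silence, the unmodified input, or something
else is exactly the kind of thing the spec would need to pin down.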

e) What is the order in which these worker nodes receive the audioprocess
events (assuming obviously that one such node is not an indirect
input/output of the other)?  Note that with the currently specified API, I
think it is possible for one of these workers to send a MessagePort through
the main thread to another one, and therefore be able to communicate with
the other AudioWorkerNode workers through MessagePort.postMessage, so the
order of execution is observable to script.  (I admit that I don't have a
good solution for this -- but I also don't know what use cases for
transferring information back and forth between these nodes and the main
thread we're trying to address here.)
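To show how the ordering leaks out, here is a toy model (plain objects
standing in for the two worker scripts and for MessageChannel ports
relayed through the main thread -- none of this is spec machinery):

```javascript
// Two simulated audio-worker scripts that have each been handed one end
// of a channel relayed through the main thread. Whichever handler the
// engine dispatches first is observable in the shared log.
const log = [];
const makeWorker = (name) => ({
  name,
  port: null,
  onaudioprocess() {
    this.port.deliver(`${this.name} ran`); // tell the peer we ran
  },
});

const a = makeWorker("A");
const b = makeWorker("B");
// Stand-ins for the two ports of a MessageChannel postMessage'd across:
a.port = { deliver: (msg) => log.push(`B got: ${msg}`) };
b.port = { deliver: (msg) => log.push(`A got: ${msg}`) };

// The engine dispatches one render quantum; the dispatch order leaks out.
for (const node of [a, b]) node.onaudioprocess();
console.log(log); // ["B got: A ran", "A got: B ran"]
```

If the engine had dispatched b before a, the log would come out reversed,
so scripts can distinguish the two orders.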

2. I feel very strongly against exposing this node type as a Web Worker.  I
think that has a number of undesired side effects.  Here are some examples:

a) Even though the HTML Web Worker termination algorithm says nothing about
killing the underlying worker thread, I think that is what authors would
typically expect to happen, but obviously doing that would not be an option
here.  It is also not specified what the node needs to output after
terminate() has been called on it (the input unmodified? silence? something
else?)  Also, given the fact that you can already disconnect the node from
the graph, why do we need the terminate() method in the first place?

b) The API gives you the illusion that multiple AudioWorkerNodes will run
in different DedicatedWorkerGlobalScopes.  In the Web Workers world, that
would mean different threads, but that will not be the case here.  I think
that discrepancy is problematic.

c) Using DedicatedWorkerGlobalScope is a bit weird in terms of APIs that
are exposed to workers.  For example, should these workers support
onlanguagechange?  What about IndexedDB on workers?  What about nested Web
Workers?

d) At least in Gecko, Web Workers have specific implementation concerns in
terms of their message queue, event processing model, and so on.  It might
be a lot of effort for us to properly implement the observable behavior of
these workers running inside our audio processing thread code (which has a
completely different event processing model, etc.).

e) The topic of whether synchronous APIs should be allowed on workers
is being debated on public-script-coord, and it seems like there is no
consensus on that yet.  But I find the possibility of running synchronous
XHR on the audio processing thread unacceptable, for example, given its
realtime requirements.

I think tying this node to Web Workers opens a huge can of worms.  It will
also keep biting us in the future as more and more APIs are exposed to
workers.  I wonder if we can get away with a completely different API
that only tries to facilitate the things that authors would typically need
to do during audio processing, without attempting to make that a Web
Worker.

I think at the very least, if we really want to keep tying these concepts
to Web Workers, it might be worthwhile to bring that up on
public-script-coord, since it is at least bending the original use cases
that Web Workers were designed for.  :-)

3. What is the purpose of addParameter and removeParameter?  It seems to me
that if we defined a structured clone algorithm for AudioParam (which is
another detail missing from the spec that needs to be clarified anyway) and
made it transferable, then the author would be able to postMessage() an
AudioParam just as easily.  Is there a good reason to have a specialized
method for what is effectively posting an AudioParam to the worker?  (Note
that we'd probably need to add methods to AudioParam for extracting a-rate
and k-rate values at any given time in that case, so that the worker script
can get the right values when it needs them.)
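For illustration, here is a sketch of the kind of extraction methods I
have in mind.  The names (valueAtTime, extractARate) and the ramp model
are invented for this example; nothing like them exists in the spec:

```javascript
// Hypothetical AudioParam stand-in with a linear ramp from v0 to v1
// over [t0, t1], plus invented k-rate and a-rate extraction methods.
function makeRampParam(v0, v1, t0, t1) {
  const valueAtTime = (t) =>
    t <= t0 ? v0 : t >= t1 ? v1 : v0 + (v1 - v0) * ((t - t0) / (t1 - t0));
  return {
    valueAtTime,                               // k-rate: one value per block
    extractARate(start, frames, sampleRate) {  // a-rate: one value per frame
      const out = new Float32Array(frames);
      for (let i = 0; i < frames; i++) {
        out[i] = valueAtTime(start + i / sampleRate);
      }
      return out;
    },
  };
}

// A gain ramping 0 -> 1 over one second, sampled inside the worker
// at the start of a 128-frame block beginning at t = 0.5s.
const gain = makeRampParam(0, 1, 0, 1);
const kRate = gain.valueAtTime(0.5);             // 0.5
const aRate = gain.extractARate(0.5, 128, 44100);
console.log(kRate, aRate[0], aRate[127] > aRate[0]);
```

With something like this, the worker script could pull the right
automation values itself whenever its audioprocess handler runs.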

4. More on the idea of transferring an AudioParam to a worker: that will
probably involve some kind of neutering of the object.  It might make sense
to introduce a clone() method on AudioParam as a way for authors to be able
to keep a copy of the object around on the main thread.  That could of
course be a future enhancement idea.

5. Still more on the idea of transferring an AudioParam: another thing that
we need to worry about is what happens if you try to transfer an AudioParam
that is currently being used somehow (either through being a property on
another AudioNode, or being used as an input to one).
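Existing transferables already show what neutering looks like in practice:
an ArrayBuffer becomes detached (zero-length) the moment it is transferred.
A transferred in-use AudioParam would presumably need some analogous, but
currently unspecified, behavior:

```javascript
// Transferring an ArrayBuffer via structuredClone detaches the original:
// the receiving side gets the bytes, the sender is left with a neutered,
// zero-length object. What the "neutered" state of an AudioParam that is
// still wired into a graph would mean is the open question.
const buf = new ArrayBuffer(128);
const copy = structuredClone(buf, { transfer: [buf] });
console.log(copy.byteLength, buf.byteLength); // 128 0
```

For an AudioParam still attached to a node, the engine would have to
decide whether the node keeps seeing automation, falls back to the
default value, or something else entirely.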

6. I think addressing #3 above will allow us to completely eliminate
AudioProcessEvent, and just use AudioProcessingEvent.

I hope the above is useful.  I think there is much more to think about,
especially depending on what we end up deciding on the first two items
above, so I hope to provide further feedback as we make progress here.

Please let me know what you think!

Cheers,
Ehsan


On Mon, Aug 25, 2014 at 11:29 AM, Chris Wilson <cwilso@google.com> wrote:

> I've done some tweaking to the Audio Worker (issue #113
> <https://github.com/WebAudio/web-audio-api/issues/113>) proposal, and
> most significantly added the ability to create AudioParams on Audio Workers
> (issue #134 <https://github.com/WebAudio/web-audio-api/issues/134>).
>
> The fork is hosted on my fork (http://cwilso.github.io/web-audio-api/).
>  Start here
> <http://cwilso.github.io/web-audio-api/#widl-AudioContext-createAudioWorker-AudioWorkerNode-DOMString-scriptURL-unsigned-long-numberOfInputChannels-unsigned-long-numberOfOutputChannels>
> to review the creation method, and the bulk of the text begins at
> http://cwilso.github.io/web-audio-api/#the-audio-worker.
>
Received on Wednesday, 10 September 2014 16:31:00 UTC
