
Re: Audio Workers - please review

From: Joseph Berkovitz <joe@noteflight.com>
Date: Thu, 11 Sep 2014 11:54:18 -0400
Cc: Chris Wilson <cwilso@google.com>, "public-audio@w3.org" <public-audio@w3.org>
Message-Id: <4ECC5CD2-ED08-4D27-8934-7589519CC906@noteflight.com>
To: Ehsan Akhgari <ehsan@mozilla.com>
Hi Ehsan,

It’s great that you provided this critique of what will be one of the most important evolutions of the spec. Chris has provided what I think is a good fine-grained initial response to your thoughts, and I hope the group devotes some teleconference time to any outstanding issues that remain.

For my part I want to provide a coarse-grained, developer-oriented opinion on some of the issues you raised. Some of these may require more work in the proposed spec because I think your questions cast light on some problems in the language we’re using in the current AudioWorkerNode proposal.

1. Processing model and timing issues. I don’t think we need to nail down much about the exact semantics of onaudioprocess callbacks, nor do I think developers can really discover (or usefully exploit) any sort of global timing coordination between onaudioprocess and onmessage/postMessage(). I think it may be useful to shift our perspective and start talking about onaudioprocess as an asynchronous message from a developer’s perspective, not a synchronous one. The spec currently uses the word “synchronous” to describe onaudioprocess, which leads developer thinking down a dangerous path.

The fact that onaudioprocess is synchronous within something we currently call the “audio thread” is really an optimization, not an aspect of the specification. I think we should perhaps have the spec say something more like, “The UA calls onaudioprocess in a manner that attempts to best optimize latency and freedom from glitching.”

When developers implement a node, onaudioprocess can be considered as a special type of async message to the AudioWorker originating from the UA, with no need for guarantees about its timing or notions of synchronicity with an invisible concept like the audio thread. Developers should be encouraged to write nodes in a style that considers each onaudioprocess to be an asynchronous, granular little processing job, not something that is happening in some known, definite time relationship to anything else. I think it’s dangerous to say anything that attempts to nail down the behavior and timing of the audio thread (and in the future, probably that will be audio threadS plural). Onaudioprocess “quasi-messages” simply arrive in a guaranteed order and cover a contiguous sequence of time ranges, starting at some context time when the node starts and ending when the node stops.

I also think we need to say something about the sequencing and monotonically-extending time ranges covered by successive onaudioprocess calls.
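To make that style concrete, here is a rough sketch of a worker-side processing function written as a granular, self-contained job. This is purely illustrative: the event field names (playbackTime, inputBuffer, outputBuffer) are my assumptions loosely following the proposal, not settled API, and the contiguity check just demonstrates the "monotonically-extending time ranges" guarantee I'm suggesting we spell out.

```javascript
// Hypothetical AudioWorker-style processor, written so that each call
// is an independent granular job: it uses only the event's own data
// plus the node's own state, and assumes nothing about timing relative
// to the main thread. All field names here are illustrative.
function makeGainProcessor(sampleRate) {
  let lastEndTime = 0; // end of the previous block, in context seconds

  return function onaudioprocess(event) {
    const { playbackTime, inputBuffer, outputBuffer, gain } = event;

    // The one guarantee worth specifying: successive events cover a
    // contiguous, monotonically extending sequence of time ranges.
    if (Math.abs(playbackTime - lastEndTime) > 1e-9) {
      throw new Error("non-contiguous block");
    }
    for (let i = 0; i < inputBuffer.length; i++) {
      outputBuffer[i] = inputBuffer[i] * gain;
    }
    lastEndTime = playbackTime + inputBuffer.length / sampleRate;
  };
}
```

Note that nothing in this style requires the developer to know whether the UA delivers these calls synchronously on a dedicated audio thread, batched, or otherwise.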

2. Exposure of the WebWorker interface. It seems to me that we can follow two paths here: a) eliminate postMessage/onmessage support, do not expose WebWorker, and rely on AudioParams alone for all main-thread communication with scripted nodes, or b) retain postMessage/onmessage support, in which case it seems to me we are actually better off inheriting the full Web Worker interface to avoid building a new animal which looks like a WebWorker but doesn’t walk or talk like one.

Since the recent inclusion of add/removeParameter in the proposal, did anyone (especially Chris :-) consider whether we still truly need postMessage/onmessage support? If we removed it, it would render moot a lot of arguments about what happens when nodes try to talk to each other and might simplify everything a lot. Internode communication seems to me a way to cause a lot of mistaken assumptions re: synchronicity (see above). AudioParams seem cleaner and more in line with what native nodes do.
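As a sketch of what the AudioParam-only style looks like from the worker's side: the UA hands each block its per-sample parameter values inside the event, so the worker never reasons about when, relative to its blocks, the main thread scheduled a change. The `parameters` field name is my assumption, not settled API.

```javascript
// Hypothetical worker-side block processor driven entirely by a
// parameter declared via addParameter, with no postMessage at all.
// The UA is assumed to deliver one a-rate value per sample in
// event.parameters; all names here are illustrative.
function processBlock(event) {
  const { inputBuffer, outputBuffer, parameters } = event;
  const gainValues = parameters.gain; // one value per sample (a-rate)
  for (let i = 0; i < inputBuffer.length; i++) {
    outputBuffer[i] = inputBuffer[i] * gainValues[i];
  }
  return outputBuffer;
}
```

All the main thread does is schedule values on the param, exactly as with native nodes, and every question about message ordering and synchronicity disappears.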

3-5. AudioParam transferrability: I can’t really see the use case for AudioParam transferability. AudioParams seem to be the preferred channel for communication between the main thread and scripted nodes, and they stand alone in supporting that communication. Why would we transfer them in a separate message?


.            .       .    .  . ...Joe

Joe Berkovitz
President

Noteflight LLC
Boston, Mass.
phone: +1 978 314 6271
www.noteflight.com
"Your music, everywhere"


On Sep 10, 2014, at 12:29 PM, Ehsan Akhgari <ehsan@mozilla.com> wrote:

> Hi Chris, and everyone else on the WG!
> 
> I took a look at the merged version of this proposal and have some feedback, I would appreciate if you can please let me know what you think.  As you know, I haven't been closely involved with the spec for a while, so if the below reflects my ignorance on some of the previously discussed matters, please accept my apologies in advance.  And of course, sorry that this is a *long* list.  :-)
> 
> 1.  I think the processing model can use some further clarification.  The current text of the spec is very vague on a number of different issues, so I would appreciate if you could please clarify what you intended to happen in the following cases:
> 
> a) It is not really clear what should happen after you call terminate() on AudioWorkerNode, since that seems to be orthogonal to whether the node participates in the graph.
> 
> b) The interaction between worker closing/termination and the audio processing is not clear at all.  Specifically, dedicated workers typically have a lifetime that is as long as the last pending task (assuming that the main thread doesn't hold a reference to them); however, for these workers, we need to keep firing audioprocess events.
> 
> c) How would different AudioWorkerNodes allocated through different AudioContext objects interact with each other?  (Here is an example of something that is not clear: should it be allowed to send an AudioParam belonging to a node from context 1 to a worker node from context 2?  The spec is currently silent about this.)
> 
> d) What is the exact semantics of the audioprocess events dispatched to these workers?  Do they block running all audio processing on the audio processing thread?  Note that because of the halting problem, we cannot even verify that the worker script will ever terminate, let alone in a reasonable amount of time, so in practice the UAs need to preempt the execution of the script.  It would be nice if this was somehow mentioned in the spec, at least in terms of what the UA needs to do if it decides to abort the execution of one of these scripts.
> 
> e) What is the order in which these worker nodes receive the audioprocess events (assuming obviously that one such node is not an indirect input/output of the other)?  Note that with the currently specified API, I think it is possible for one of these workers to send a MessagePort through the main thread to another one, and therefore be able to communicate with the other AudioWorkerNode workers through MessagePort.postMessage, so the order of execution is observable to script.  (I admit that I don't have a good solution for this -- but I also don't know what use cases for transferring information back and forth between these nodes and the main thread we're trying to address here.)
> 
> 2. I feel very strongly against exposing this node type as a web worker.  I think that has a number of undesired side effects.  Here are some examples:
> 
> a) Even though the HTML web worker termination algorithm says nothing about killing an underlying worker thread, I think that is what authors would typically expect to happen, but obviously doing that would not be an option here.  It is also not specified what needs to be output from the node after terminate() has been called on it (the input unmodified? silence? something else?)  Also, given the fact that you can already disconnect the node from the graph, why do we need the terminate() method in the first place?
> 
> b) The API gives you the illusion that multiple AudioWorkerNodes will run on different DedicatedWorkerGlobalScopes.  That, in the Web Workers world, would mean different threads, but that will not be the case here.  I think that discrepancy is problematic.
> 
> c) Using DedicatedWorkerGlobalScope is a bit weird in terms of APIs that are exposed to workers.  For example, should these workers support onlanguagechange?  What about IndexedDB on workers?  What about nested Web Workers?
> 
> d) At least on Gecko, Web Workers have specific implementation concerns in terms of their message queue, event processing model and so on.  It might be a lot of effort for us to properly implement the observable behavior of these workers running inside our audio processing thread code (which has a completely different event processing model, etc.)
> 
> e) The topic of whether or not synchronous APIs must be allowed on workers is being debated on public-script-coord, and it seems like there is no consensus on that yet.  But I find the possibility of running synchronous XHR on the audio processing thread unacceptable for example, given its realtime requirements.
> 
> I think tying this node to Web Workers opens a huge can of worms.  It will also keep biting us in the future as more and more APIs are exposed to workers.  I am wondering if we can get away with a completely different API that only tries to facilitate the things that authors would typically need to do during audio processing without attempting to make that a Web Worker?
> 
> I think at the very least, if we really want to keep tying these concepts to Web Workers, it might be worthwhile to bring that up on public-script-coord, since it is at least bending the original use cases that Web Workers were designed for.  :-)
> 
> 3. What is the purpose of addParameter and removeParameter?  It seems to me that if we defined a structured clone algorithm for AudioParam (which is another detail missing from the spec that needs to be clarified anyway) and make it transferable, then the author would be able to postMessage() an AudioParam just as easily.  Is there a good reason to have a specialized method for what is effectively posting an AudioParam to the worker?  (Note that we'd probably need to add methods to AudioParam for extracting a-rate and k-rate values at any given time in that case, so that the worker script can get the right values when it needs them.)
> 
> 4. More on the idea of transferring an AudioParam to a worker: that will probably involve some kind of neutering of the object.  It might make sense to introduce a clone() method on AudioParam as a way for authors to be able to keep a copy of the object around on the main thread.  That could of course be a future enhancement idea.
> 
> 5. Still more on the idea of transferring an AudioParam, another thing that we need to worry about is what happens if you try to transfer an AudioParam that is currently being used somehow (either through being a property on another AudioNode, or being used as an input to one.)
> 
> 6. I think addressing #3 above will allow us to completely eliminate AudioProcessEvent, and just use AudioProcessingEvent.
> 
> I hope the above is useful.  I think there is much more to think about especially given what we end up deciding on the first two items above, so I hope to provide further feedback as we make progress here.
> 
> Please let me know what you think!
> 
> Cheers,
> Ehsan
> 
> 
> On Mon, Aug 25, 2014 at 11:29 AM, Chris Wilson <cwilso@google.com> wrote:
> I've done some tweaking to the Audio Worker (issue #113) proposal, and most significantly added the ability to create AudioParams on Audio Workers (issue #134).
> 
> The proposal is hosted on my fork (http://cwilso.github.io/web-audio-api/).  Start there to review the creation method; the bulk of the text begins at http://cwilso.github.io/web-audio-api/#the-audio-worker.
Received on Thursday, 11 September 2014 15:54:53 UTC
