Re: Audio processing with web workers

Hi web-audio,

Using workers for direct processing of audio streams in JavaScript is a very
attractive idea - it makes performant audio effects in JavaScript a
reality.

The basic approach - the audio subsystem calling a callback directly on the
worker thread - looks good, although handing the worker to
JavaScriptAudioNode/ProcessedMediaStream so that they call a named
callback on it is a novel pattern.

Here are some comments on API refinement, from the point of view of better
supporting workers. These comments apply to both Chris's and Robert's
proposals.

1. Event argument to worker audio processing callback

Looking at Robert's example (
http://people.mozilla.org/~roc/stream-demos/tone.js), the 'onprocessmedia'
callback allocates a Float32Array on every invocation. This is wasteful and
degrades performance.
It would be better if the event object exposed an output Float32Array
directly, so that 'onprocessmedia' could write into it:
   self.onprocessmedia = function(event) {
      ...
      event.output[i] = ....
      ...
   }
The implementation would then preallocate one output array per
processing callback (i.e. per JSAN/ProcessedMediaStream).
The output array would be accessed directly by the audio processing system
(presumably on a different thread) as well as by the worker, so no copying
would be involved.

This would be very efficient; however, it poses a problem - what if the
worker stores a reference to the event object in a global, such as:
   var eventHolder;

   self.onprocessmedia = function(event) {
      eventHolder = event; // Danger, Will Robinson!
      ...
   }

   eventHolder.output[42] = 0xBADF00D; // Kaboom!

This is a problem that has already been solved for ArrayBuffer transfer and
postMessage: we should ensure that array buffers that are arguments to
'onprocessmedia' are, in Transferable parlance, "neutered", i.e. their
effective length becomes zero. It would behave as if the array buffers were
transferred in on invocation of 'onprocessmedia' and transferred out when
'onprocessmedia' exits.

The same applies to the input buffer.

2. Allocation of processing nodes/streams to workers

Both current proposals imply that a worker can handle only one
JavaScriptAudioNode/ProcessedMediaStream (since the processing nodes run a
callback with a predefined name on the worker).
In some UAs (at least in all WebKit-based ones), a worker is a relatively
heavy thing (it is a whole OS thread, plus the associated JS execution
context resources), so requiring too many of them to be spawned is wasteful.
It would be nice to be able to share workers between processing nodes,
especially if the processing nodes are not parallel in the audio graph.

I suggest that, instead of calling a callback with a predefined name
('onprocessmedia'), we allow a callback name to be passed to the
JSAN/ProcessedMediaStream constructor:

  worker.js:
    self.sinGenerator = function(event) { ... }

  index.html:
   worker = new Worker('worker.js');
   var stream = new ProcessedMediaStream(worker, "sinGenerator");
or:
   var audioNode = new JavaScriptAudioNode(..., worker, "sinGenerator");

This gives the API user fine-grained control over the distribution of
load between workers.
It also nicely decouples the audio API from the workers API (no need to
specify an extra method on DedicatedWorkerGlobalScope).

3. Handling of 'params'
(this one is a purely cosmetic suggestion)

Instead of making params a field of the event object, maybe it would be
cleaner to make params an extra argument to the callback, and also to pass
the initial value at node/stream construction?
As in (I am also using my passing-a-callback-name suggestion here):
 worker.js:
  self.sinGenerator = function(event, frequency) { .... }
 index.html:
  worker = new Worker('worker.js');
  var stream = new ProcessedMediaStream(worker, "sinGenerator", 440);
  var stream2 = new ProcessedMediaStream(worker, "sinGenerator", 550);
or:
  var audioNode = new JavaScriptAudioNode(..., worker, "sinGenerator", 440);
  var audioNode2 = new JavaScriptAudioNode(..., worker, "sinGenerator", 550);

The argument to the callback could still be manipulated through
stream.params or audioNode.params.

Note in passing that having a 'params' object as described opens an extra
communication channel for passing arbitrary objects between the main thread
and workers. The structured cloning algorithm applies here. Also, the main
thread can always communicate with the worker via postMessage as well (so
e.g. some large initializations can be done that way, especially using
array buffer transfer).

Let me know what you think!

Kind regards,
Dmitry



On Fri, Feb 10, 2012 at 12:42 PM, Chris Rogers <crogers@google.com> wrote:

> From recent discussions it seems that there's general agreement for both
> the need to deliver high-performance native audio processing, plus the
> ability to have low-level access to the PCM audio stream for direct
> processing in JavaScript.  The Web Audio API currently has both, except
> that the JavaScriptAudioNode currently runs the JS code in the main thread.
>  Robert's proposal includes the ability to run the JS code in a worker
> thread, which I agree can be useful.
>
> From the perspective of how this would work in the Web Audio API, Jussi
> has asked about the possibility of extending the JavaScriptAudioNode to
> accept a web worker in the constructor.  I've looked in more detail at
> Robert's proposal and his simple example which generates a tone, with pitch
> controllable by the mouse:
> http://people.mozilla.org/~roc/stream-demos/worker-generation.html
> http://people.mozilla.org/~roc/stream-demos/tone.js
>
> Aside from our disagreement about larger aspects of the API, my impression
> was that Robert's specific approach to workers seems quite reasonable and
> could be adopted in an almost identical way in JavaScriptAudioNode.  I've
> since discussed some of the technical aspects with Google's web workers
> expert Dmitry Lomov.  And he brought up some interesting ideas and
> improvements as well, which I hope we can discuss here.
>
> But first, here's my naive translation of Robert's worker's approach to
> JavaScriptAudioNode (which should be refined):
>
> partial interface AudioContext {
>     JavaScriptAudioNode createJavaScriptNode(in short bufferSize, in short
> numberOfInputChannels, in short numberOfOutputChannels,
>         in Worker worker);
> };
>
> The code would be slightly different with "self.onprocessmedia" ->
> "self.onaudioprocess" to match the current JavaScriptAudioNode naming.
>  Also, the "event" would be the same as in the current JavaScriptAudioNode,
> but could have Roc's "params".
>
> Cheers,
> Chris
>
>

Received on Wednesday, 15 February 2012 22:22:27 UTC