Re: Status v2

André,

On Thu, Apr 16, 2020 at 9:07 PM André Michelle <andre.michelle@audiotool.com>
wrote:

>
> is there some kind of summary for the most important key features? There
> are a lot of issues on github that deal with details of existing
> audio-nodes.
>

There are some improvements to existing nodes, because people want to use
audio nodes and would like them extended. But I agree that a good triage of
the backlog is in order; it's a mess at the minute.


> I think for many developers that are building complete DAWs with their own
> infrastructure, improving audio-nodes is not that interesting. We are
> looking for the most stable solution to run our own dsp-code in a different
> prioritised thread without any performance-crucial dependencies like the
> main gc, event loop or graphic updates (while I understand that the
> communication between those threads will introduce message-latency when the
> browser is busy, but that is negligible). More important is getting rid of
> glitches (pops, clicks) that the main thread introduces.
>

No amount of specification will fix that; this is clearly an implementation
detail (apart from things like
https://github.com/WebAudio/web-audio-api/issues/1933, and even that is
stretching it). Communication between threads doesn't have to add latency:
we have SharedArrayBuffer now, see below.


> We are already pooling objects whenever they have a short lifetime
>

(unrelated, but this is an anti-pattern in our day and age of generational
GC in modern JS engines, if you're using a GC-ed language;
https://hacks.mozilla.org/2014/09/generational-garbage-collection-in-firefox/
explains why, but this is in no way limited to Firefox)
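
To make this concrete, here's a contrived sketch (the names are purely
illustrative): with a generational GC, the short-lived version is cheaper
than it looks, because nursery allocation is just a pointer bump and objects
that die young cost nothing to collect, while pooled objects stay reachable,
get promoted to the old generation, and keep getting traced during major
collections.

    // Short-lived: allocated in the nursery, dies young, essentially free.
    function mixShortLived(a, b) {
      return { left: a.left + b.left, right: a.right + b.right };
    }

    // Pooled: objects stay reachable between calls, so the GC promotes them
    // to the old generation and still traces them on major collections.
    const pool = [];
    function mixPooled(a, b) {
      const tmp = pool.pop() || { left: 0, right: 0 };
      tmp.left = a.left + b.left;
      tmp.right = a.right + b.right;
      return tmp; // caller is expected to pool.push(tmp) when done
    }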


> , but the overall performance seems to be worse than in Flash (where
> everything ran in the main thread). But that is of course a subjective
> observation, since we cannot compare both properly. However, many users ask
> for the Flash version of audiotool, because it ran better for
> them. Especially on Chromebooks. Many schools in the US are using audiotool
> on Chromebooks.
>
> How can I contribute?
> My guess is that fixing those issues is rather about the internal
> infrastructure of the browser than specifications. In theory the
> audio-worklet is the way to go.
>

It is. We've investigated a simpler API (called Audio Device Client), but
after some time implementing AudioWorklet, it became apparent that we could
not get any more performance out of a different API design. If this
conclusion is proven to have been wrong, there is room in the Web Audio API
charter to investigate a solution other than AudioWorklet.

The internals of browsers don't really have any influence on specification
work (sometimes they do, but in general it's explicitly against the design
rules of the web platform).


>
> The only case I can make is to have an adjustable audio-ring-buffer
> (relaxing the 128 frame block-size) to better handle short performance
> peaks and make it completely independent from the main-thread. I
> understand, it is easier said than done. Does the v2 specifications include
> that?
>

It does, but the 128-frame block size is not an IO vector size (as it's
often called); it's a block processing size. Browsers can (and do) choose a
different IO vector size, and you can tune it yourself to what is best
for your application using the `latencyHint` parameter when creating an
AudioContext:
https://webaudio.github.io/web-audio-api/#dom-audiocontextoptions-latencyhint
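
For example (the buffering you actually get is up to the implementation, so
treat the values as illustrative):

    // A category lets the UA pick a sensible IO vector size for the use case:
    const ctx = new AudioContext({ latencyHint: "playback" });
    // or ask for a specific output latency, in seconds, e.g. ~40ms:
    // const ctx = new AudioContext({ latencyHint: 0.04 });
    console.log(ctx.baseLatency, ctx.outputLatency);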

Changing the block size is, however, tracked in
https://github.com/WebAudio/web-audio-api-v2/issues/13, because it has
other performance implications, as explained in the issue.

The AudioWorklet is already completely independent of the main thread (or
any other thread or process, for that matter) and can run on a real-time
thread; we've been very careful to specify it this way (part of the reason
why it took so long).
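
For reference, a minimal setup looks like this (file and processor names are
made up):

    // processor.js -- runs on the rendering thread
    class NoiseProcessor extends AudioWorkletProcessor {
      process(inputs, outputs, parameters) {
        const output = outputs[0];
        for (let channel = 0; channel < output.length; channel++) {
          for (let i = 0; i < output[channel].length; i++) {
            // 128 frames per call today; replace with your own DSP
            output[channel][i] = Math.random() * 2 - 1;
          }
        }
        return true; // keep the processor alive
      }
    }
    registerProcessor("noise", NoiseProcessor);

    // main thread
    const ctx = new AudioContext();
    await ctx.audioWorklet.addModule("processor.js");
    new AudioWorkletNode(ctx, "noise").connect(ctx.destination);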


> Are you trying to merge v2 into v1?
>

That's the idea, I think (it has not been decided); however, it's unclear
whether we'll succeed in doing it. It's quite challenging in terms of
backwards compatibility.

> That would explain the issues for existing audio-nodes. I guess for most
> advanced audio programmers a parallel v2 with a very simple api (config and
> push callback like it was in pepper) would solve many problems we are
> facing right now.
>

This would provide no performance improvement, and you can already do it
today. Careful and extensive measurements have shown that the overhead of
the AudioWorklet can be made minimal (and is, in Firefox) compared to C
code running outside of a browser. WASM speed is enough for a lot of uses
as well (but it would be lying to say it's as fast as native code: today,
we see a 1.5x to 2x slowdown on the type of workloads we care about here).

Having a config and push callback would be a regression in terms of latency
and reliability, but if you want to do it (there are good reasons), you can
do it with AudioWorklet. Because I know this is a common request, I've
written a library that does the heavy lifting for this, in the most
performant way possible: https://github.com/padenot/ringbuf.js. It is
intentionally very liberally licensed, and I'm happy to hear any feedback
(positive or negative). This is by definition a priority inversion,
however, so care must be taken to buffer enough to avoid underruns.
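
If you'd rather see the shape of the thing than read the library, here is a
deliberately simplified sketch of the same idea (hand-rolled, single
producer / single consumer, and not the library's actual API; prefer the
library for real use): one thread pushes samples into a SharedArrayBuffer,
the AudioWorkletProcessor pops them in process() without ever blocking, and
underruns come out as silence.

    // Shared state: [0] = read index, [1] = write index, then the samples.
    const CAPACITY = 4096; // frames of headroom; size for your worst-case jitter
    const sab = new SharedArrayBuffer(8 + CAPACITY * 4);
    const indices = new Uint32Array(sab, 0, 2);
    const samples = new Float32Array(sab, 8, CAPACITY);

    // Producer (e.g. a Worker running the DSP): write as many frames as fit.
    function push(block) {
      const r = Atomics.load(indices, 0);
      let w = Atomics.load(indices, 1);
      for (let i = 0; i < block.length; i++) {
        const next = (w + 1) % CAPACITY;
        if (next === r) break; // full: drop, or retry on the next tick
        samples[w] = block[i];
        w = next;
      }
      Atomics.store(indices, 1, w);
    }

    // Consumer (inside AudioWorkletProcessor.process): never waits or blocks.
    function pop(out) {
      let r = Atomics.load(indices, 0);
      const w = Atomics.load(indices, 1);
      for (let i = 0; i < out.length; i++) {
        if (r === w) { out[i] = 0; continue; } // underrun: output silence
        out[i] = samples[r];
        r = (r + 1) % CAPACITY;
      }
      Atomics.store(indices, 0, r);
    }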


>
> This may sound blunt, but let the web-developers build their own
> audio-frameworks. In no time we will have all the currently existing v1
> audio-nodes covered and way more in the future. The actual issue is not
> making better nodes. Developers will always ask for more. But giving them a
> universal access point to a prioritised, configurable thread will certainly
> create more impact, very similar to WebGL (where many frameworks were built
> with amazing features for different tasks).
>

This is what developers have been doing for some time with AudioWorklet,
and you can do it today (other people/companies/projects have done it with
great success, in particular compiling commercial programs to WASM and
running them on the web).

What you're describing is exactly what AudioWorklet is: a prioritized
thread with a configurable audio output callback (in terms of latency,
device, channel count, and, at some stage in v2, block processing size),
but on top of that it is composable with other nodes. If you want to stream
any audio from the web, it's a couple of lines of code; if you want a
multi-threaded convolver, it's another two lines of code. Opposing
AudioWorklet/custom DSP to regular audio nodes is not useful. Some
use-cases are better implemented with one, others with the other technique,
and quite a lot use both at the same time.
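
For example, streaming a file from the web into a custom processor and then
through a convolver is just graph plumbing (the URL, file name and impulse
response buffer are placeholders here):

    const ctx = new AudioContext();
    await ctx.audioWorklet.addModule("my-dsp.js"); // registers "my-dsp"
    const dsp = new AudioWorkletNode(ctx, "my-dsp");

    // Stream any audio from the web into the custom DSP...
    const media = new Audio("https://example.com/stream.ogg");
    media.crossOrigin = "anonymous";
    ctx.createMediaElementSource(media).connect(dsp);
    media.play();

    // ...and run its output through a convolver (impulseResponse is an
    // AudioBuffer you decoded earlier).
    const convolver = new ConvolverNode(ctx, { buffer: impulseResponse });
    dsp.connect(convolver).connect(ctx.destination);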

Hope this helps,
Paul.

Received on Friday, 17 April 2020 17:38:39 UTC