Re: Proposal for Audio Track Worklet API: github.com/alvestrand/audio-worklet

Youenn, thanks for the comments!

On 10/11/2018 01:12 AM, youenn fablet wrote:
> Hi Harald,
>
> This is an interesting proposal, some feedback/questions below.
>
> The proposal has some overlap with existing technologies and it is worth understanding how they relate.
>
> For instance, ScriptProcessorNode provides ways to access to the samples, though on the main thread.

ScriptProcessorNode is marked as deprecated in the current WebAudio
spec; it seems to have caused performance issues. I took that as
evidence enough not to pursue that shape of API further.

> As stated in the draft, WebAudio Worklet could be emulated through AudioWorkletNode.
> The question is then why to not use AudioContext/AudioWorkletNode directly.
> If not possible, would a WebRTCAudioContext be able to solve these issues?

The people who have tried to use AudioWorklet for signal processing
have reported performance issues.

While everything can be optimized, I think some of the reasons for the
bad performance are architectural decisions: in particular, the choice
of a synchronized clock (which requires resampling before processing if
you have out-of-sync audio tracks) and the choice of a single (float32
linear) sample representation (which requires a conversion step).
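To illustrate the second point, here is a minimal sketch (my own
illustration, not part of the proposal or any spec) of the conversion
step that a single float32-linear representation forces on every track
whose native samples are 16-bit PCM:

```javascript
// Illustrative only: the int16 -> float32 conversion that a
// float32-only API implies for tracks captured as 16-bit PCM.
function int16ToFloat32(input) {
  const out = new Float32Array(input.length);
  for (let i = 0; i < input.length; i++) {
    // Scale the signed 16-bit range [-32768, 32767] into [-1, 1).
    out[i] = input[i] / 32768;
  }
  return out;
}

const pcm = new Int16Array([0, 16384, -32768, 32767]);
const f32 = int16ToFloat32(pcm);
```

An API that hands the worklet samples in the track's native format
would let the application skip this per-sample pass when it doesn't
need it.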

I'd like to explore further what we can achieve with an API that is
closer in spirit to the C++ APIs of Google's WebRTC implementation,
while still acting like a sensible part of the Web platform.

>
> In terms of API, the model seems to mandate a one input/one optional-output model.
> I guess some cases (sub/super-sampling, mixing, fanning out) cannot probably be handled.

Subsampling and supersampling should be doable, since these are
one-track operations.
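As a sketch of why rate conversion stays a one-track affair, here is a
naive linear-interpolation resampler (illustrative only; the proposal
does not define this function, and a real implementation would use a
proper filter):

```javascript
// Illustrative sketch: rate conversion touches only one track's
// samples, so it needs no cross-track synchronization.
function resampleLinear(samples, srcRate, dstRate) {
  const outLen = Math.round(samples.length * dstRate / srcRate);
  const out = new Float32Array(outLen);
  for (let i = 0; i < outLen; i++) {
    const pos = i * srcRate / dstRate;          // position in source
    const i0 = Math.floor(pos);
    const i1 = Math.min(i0 + 1, samples.length - 1);
    const frac = pos - i0;
    out[i] = samples[i0] * (1 - frac) + samples[i1] * frac;
  }
  return out;
}

const down = resampleLinear(new Float32Array([0, 1, 2, 3]), 48000, 24000);
const up = resampleLinear(new Float32Array([0, 2]), 24000, 48000);
```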

Mixing and fanout can't be handled this way, because mixing requires
synchronization, which requires resampling - see the previous point.
This is an important tradeoff, and we should make the choice carefully.
If the code in the worklet is willing to do resampling and
synchronization itself, shared array buffers offer a relatively
performant way of shuffling samples between tracks, so mixing may be
possible, albeit with a somewhat more convoluted programming model than
the WebAudio one.
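A minimal sketch of that shared-memory mixing idea, with all names
hypothetical (and noting that SharedArrayBuffer availability in
browsers depends on cross-origin isolation):

```javascript
// Hypothetical sketch: each track's worklet writes its already
// resampled and synchronized samples into a SharedArrayBuffer, and
// a mixer sums the buffers without copying them between workers.
const FRAMES = 128;
const trackA = new Float32Array(new SharedArrayBuffer(FRAMES * 4));
const trackB = new Float32Array(new SharedArrayBuffer(FRAMES * 4));

trackA.fill(0.25);  // stand-in for samples written by track A's worklet
trackB.fill(0.5);   // stand-in for samples written by track B's worklet

function mixInto(out, a, b) {
  for (let i = 0; i < out.length; i++) out[i] = a[i] + b[i];
  return out;
}

const mixed = mixInto(new Float32Array(FRAMES), trackA, trackB);
```

The application, not the platform, would own the synchronization here,
which is exactly the convolution mentioned above.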

> I guess the idea is to use WebAudio for those cases instead.
> The question is then what are the cases AudioMediaTrackProcessor should be used for and what are the cases WebAudio should be used instead.

I think it depends on what overhead is tolerable for the application.

>
> Thanks,
>  Y
>
>> On Oct 10, 2018, at 1:50 AM, Harald Alvestrand <harald@alvestrand.no> wrote:
>>
>> As part of my homework from the June WG meeting, and in preparation for
>> TPAC, I have started drawing up a proposal for a worklet that allows us
>> to process audio.
>>
>> Link to the presentation form: https://alvestrand.github.io/audio-worklet/
>>
>> I haven't made this a generic processor for audio and video, because I
>> think efficient processing of video (especially large-frame video) will
>> require significantly more attention to buffering and utilization of
>> platform-embedded processors (GPUs!) than is required for usable audio
>> processing.
>>
>> Note: This proposal (or even this general idea) is a PROPOSAL TO the WG
>> - it does not represent any form of decision.
>>
>> Comments welcome!
>>
>> Harald
>>
>> -- 
>> Surveillance is pervasive. Go Dark.
>

-- 
Surveillance is pervasive. Go Dark.

Received on Thursday, 11 October 2018 04:01:21 UTC