Re: TPAC F2F and Spec Proposals from Anthony Bowyer-Lowe on 2011-10-18 (public-audio@w3.org from October to December 2011)

From: Anthony Bowyer-Lowe <anthony@lowbroweye.com>
Date: Tue, 18 Oct 2011 17:45:53 +0100
To: Jussi Kalliokoski <jussi.kalliokoski@gmail.com>
Cc: public-audio@w3.org
Message-ID: <CAMCSOPWw=dxkvGxyd1UFYhXk-d6TL+HJE1bua1ymHAPLxQyJRA@mail.gmail.com>

Hi Jussi,


>> There are few differences for the processing audio use cases such as
>> providing spectral visualisations of playing music, or echo cancellation of
>> webcam calls. The MediaStream API is very satisfactory for this.
>>
>> However, for the purposes of realtime synthesis & sample playback and
>> videogame audio feedback, where low latency, low jitter sound
>> generation/triggering and direct user interaction are required then the Web
>> Audio API's focus upon canonical audio formats and strong timing make it far
>> more useful than the MediaStream API which offers none of these
>> capabilities.
>
> I beg to differ here, as the media stream API offers the sample level
> access, so there are no boundaries per se, other than those of hardware
> limitations and JS speed, take a look at http://jams.no.de , it's pure JS
> generated music with no native effects whatsoever, made to work with the
> current Audio Data API and Web Audio API. It works quite seamlessly
> especially considering it's running the graphics and audio in the same
> thread, which is something you really shouldn't do. This is in my opinion
> actually a flaw in the current implementations of the aforementioned APIs,
> you can't take the audio processing into another thread, but that is
> possible, and in fact the only way, in Media Streams API. The example demo
> could quite easily be changed to work with Media Streams API, thus making it
> much more efficient.
>

Absolutely! Anything that provides direct sample buffer access can of course
be used to generate audio given enough development effort, as you've so
effectively proven. I'm no stranger myself to building entire audio systems
up from the lowest hardware levels, and the audio projects I produce
in-browser will likely forgo most of the native effects in favour of my own
custom implementations. I've certainly worked against worse APIs than
MediaStream.

However, simply having access to some basic native helpers for stream
summing, panning and convolution would greatly reduce the CPU cost compared
with doing that work in JavaScript. Sure, these could be considered minor
optimisations, but every cycle counts in realtime audio, particularly on
mobile devices, and achieving 40ms latency instead of 50ms, or 6 voices
instead of 5, is hugely significant when it comes to instrument synthesis. As
you say, the boundaries are hardware limitations and JS speed, so easing the
load on either is desirable, as the gains can be reinvested in the primary
goal: generating great sound.
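
To make the cost concrete, here is a minimal sketch of the per-sample work
that summing and panning involve when done in script. It assumes mono voices
held in Float32Array blocks and an equal-power pan law; the Voice type and
the mixAndPan function are hypothetical names for illustration only and are
not part of the Web Audio, Audio Data or MediaStream APIs.

```typescript
// Hypothetical helper, not a browser API: sums mono voices into a stereo
// pair using equal-power panning, one sample at a time.
interface Voice {
  samples: Float32Array; // mono source block
  pan: number;           // -1 (hard left) .. +1 (hard right)
}

function mixAndPan(
  voices: Voice[],
  frames: number
): { left: Float32Array; right: Float32Array } {
  const left = new Float32Array(frames);
  const right = new Float32Array(frames);
  for (const v of voices) {
    // Map pan from [-1, 1] onto an angle in [0, pi/2].
    const angle = ((v.pan + 1) / 2) * (Math.PI / 2);
    const gainLeft = Math.cos(angle);
    const gainRight = Math.sin(angle);
    const n = Math.min(frames, v.samples.length);
    // This inner loop runs once per voice per sample, which is the work
    // a native summing/panning helper would take off the script.
    for (let i = 0; i < n; i++) {
      left[i] += v.samples[i] * gainLeft;
      right[i] += v.samples[i] * gainRight;
    }
  }
  return { left, right };
}
```

Every extra voice adds another pass over the whole block, so the cost grows
with voices times frames before any synthesis or convolution is done.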

And whilst they are little more than underspecified side notes at present,
the fact that the Web Audio spec mentions preflighting and CPU usage
monitoring gives me solace that it will be an appropriate platform for
extensible future development. Do you think the MediaStream API will expand
to support such needs given its stated focus? Letting users suffer glitches
and telling them to upgrade is not an acceptable user experience in my book.

Sure, devices and JavaScript VMs are continually getting faster but I do
find it frustratingly funny that our modern browsers running on hardware
that can support DAW processing power unimaginable in even the past decade
still can't approach the realtime sound synthesis capabilities, low
latencies, and sonic qualities of 30-year-old microcomputers. To be honest,
I'm API agnostic: I just want the opportunity to deliver cool realtime
interactive audio systems in-browser that meet my personal performance
quality criteria.

Anyway, I wholeheartedly agree that having the proposed APIs support audio
processing in non-UI threads is a highly desirable feature.


Ugh, look at all those words about noise. Sorry and regards,
Anthony.
Received on Tuesday, 18 October 2011 16:46:38 UTC