
Re: TPAC F2F and Spec Proposals

From: Jussi Kalliokoski <jussi.kalliokoski@gmail.com>
Date: Tue, 18 Oct 2011 20:08:25 +0300
Message-ID: <CAJhzemWn52t3-yoo1pKa=Nygh88oTtGFUOU-65aNQG3Ca2rw7g@mail.gmail.com>
To: Anthony Bowyer-Lowe <anthony@lowbroweye.com>
Cc: public-audio@w3.org
On Tue, Oct 18, 2011 at 7:45 PM, Anthony Bowyer-Lowe <anthony@lowbroweye.com> wrote:

> Hi Jussi,
>
>>> There are few differences for audio-processing use cases such as
>>> providing spectral visualisations of playing music, or echo cancellation
>>> of webcam calls. The MediaStream API is very satisfactory for this.
>>> However, for the purposes of realtime synthesis & sample playback and
>>> videogame audio feedback, where low-latency, low-jitter sound
>>> generation/triggering and direct user interaction are required, the Web
>>> Audio API's focus on canonical audio formats and strong timing makes it
>>> far more useful than the MediaStream API, which offers none of these
>>> capabilities.
>>
>> I beg to differ here: the MediaStream API offers sample-level access, so
>> there are no boundaries per se, other than hardware limitations and JS
>> speed. Take a look at http://jams.no.de; it's pure JS-generated music
>> with no native effects whatsoever, made to work with the current Audio
>> Data API and Web Audio API. It works quite seamlessly, especially
>> considering it's running the graphics and audio in the same thread, which
>> is something you really shouldn't do. This is, in my opinion, actually a
>> flaw in the current implementations of the aforementioned APIs: you can't
>> move the audio processing to another thread. That is possible, and in
>> fact the only way, in the MediaStream API. The example demo could quite
>> easily be changed to work with it, which would make it much more
>> efficient.
>
> Absolutely! Anything that provides direct sample-buffer access can of
> course be used to generate audio given appropriate levels of development
> effort, as you've so effectively proven. I'm no stranger myself to
> building entire audio systems up from the lowest hardware levels; it's
> likely that the audio projects I produce in-browser will forego most of
> the native effects for my own custom implementations, and I've certainly
> worked against worse APIs than MediaStream's.
>
> However, simply having access to some basic native helpers for stream
> summing, panning and convolution would greatly lighten the CPU expense of
> doing that work in JavaScript. Sure, these could be considered minor
> optimisations, but every cycle counts in realtime audio, particularly on
> mobile devices, and achieving 40 ms latency instead of 50 ms, or 6 voices
> instead of 5, is hugely significant when it comes to instrument synthesis.
> As you say, the boundaries are hardware limitations and JS speed, so
> reducing those demands is desirable: the gains can be reinvested in the
> primary goal of generating great sound.
>
> And whilst they are little more than underspecified side notes at present,
> the fact that the Web Audio spec mentions considerations for preflighting
> and CPU usage monitoring gives me solace that it will be an appropriate
> platform for extensible future development. Do you think the MediaStream
> API will expand to support such needs, given its stated focus? Letting
> users suffer glitches and telling them to upgrade is not an acceptable
> user experience in my book.
>
> Sure, devices and JavaScript VMs are continually getting faster, but I do
> find it frustratingly funny that our modern browsers, running on hardware
> with DAW processing power unimaginable even a decade ago, still can't
> approach the realtime sound-synthesis capabilities, low latencies, and
> sonic qualities of 30-year-old microcomputers. To be honest, I'm API
> agnostic: I just want the opportunity to deliver cool realtime interactive
> audio systems in-browser that meet my personal performance and quality
> criteria.
>
> Anyway, I wholeheartedly agree that having the proposed APIs support audio
> processing in non-UI threads is a highly desirable feature.
>
> Ugh, look at all those words about noise. Sorry and regards,
> Anthony.
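The pure-JS synthesis discussed above comes down to filling sample buffers from a callback. A minimal sketch of the idea, assuming a mono Float32Array destination; the function and parameter names here are illustrative, not part of either API:

```javascript
// Minimal pure-JS tone generation: write one block of a sine wave into a
// Float32Array. With the Audio Data API the block would be handed to
// mozWriteAudio(); with Web Audio it would fill a JavaScriptAudioNode's
// output buffer. All names here are illustrative.
function fillSine(buffer, sampleRate, frequency, phase) {
  var step = 2 * Math.PI * frequency / sampleRate;
  for (var i = 0; i < buffer.length; i++) {
    buffer[i] = Math.sin(phase);
    phase += step;
  }
  return phase; // carry the phase into the next block to avoid clicks
}

var block = new Float32Array(4096);
var phase = 0;
phase = fillSine(block, 44100, 440, phase); // render one 4096-sample block
```

Carrying the phase across blocks, rather than recomputing it from an absolute sample index, is what keeps block boundaries click-free.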
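The summing and panning Anthony mentions are simple to write but costly at scale. A hedged sketch (function and parameter names are my own, not any API's) of summing two mono sources into an interleaved stereo buffer with an equal-power pan law:

```javascript
// Sum two mono sources and place the result in the stereo field with an
// equal-power pan law. pan runs from -1 (hard left) to +1 (hard right);
// stereoOut is interleaved L/R and twice the source length.
function mixAndPan(srcA, srcB, pan, stereoOut) {
  var angle = (pan + 1) * Math.PI / 4; // map [-1, 1] to [0, pi/2]
  var gainL = Math.cos(angle);
  var gainR = Math.sin(angle);
  for (var i = 0; i < srcA.length; i++) {
    var s = srcA[i] + srcB[i];        // stream summing
    stereoOut[2 * i]     = s * gainL; // left channel
    stereoOut[2 * i + 1] = s * gainR; // right channel
  }
}
```

This inner loop runs tens of thousands of times per second per voice, which is exactly why a native node for it saves cycles on mobile devices.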

I'm along the same lines as you! I'm also API agnostic; I just want to do
cool stuff. But to be on the safe side, I advocate the solution that lets
you do as much as possible. I've been arguing here that if stuff is slow,
that will get fixed sooner or later, but I wouldn't bet on the same for
APIs: they change slowly. To quote @foolip, we only get one chance to do
this right. :) For me the native effects are a nice-to-have plus that will
indeed make our lives easier. Let's just make sure as much as possible
works without them. Aaanyway, noise, heh.
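On the non-UI-thread point we agree on: neither current implementation allows it today, but keeping the DSP kernel a pure function makes it trivial to move into a Web Worker later. A sketch under that assumption; the worker file name and message shape are hypothetical:

```javascript
// A pure DSP kernel: no DOM access, no shared globals, just samples in and
// samples out. Because of that it can run anywhere, including in a Worker.
function applyGain(samples, gain) {
  for (var i = 0; i < samples.length; i++) {
    samples[i] *= gain;
  }
  return samples;
}

// Hypothetical worker script ("dsp-worker.js") wiring the kernel up:
//
//   onmessage = function (e) {
//     postMessage(applyGain(e.data.samples, e.data.gain));
//   };
//
// The main thread would then do:
//
//   var worker = new Worker('dsp-worker.js');
//   worker.postMessage({ samples: new Float32Array(4096), gain: 0.5 });
```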

Received on Tuesday, 18 October 2011 17:08:52 UTC
