Re: Discussion on Audio API in Audio WG from Francois Daoust on 2011-06-23 (public-webrtc@w3.org from June 2011)

From: Francois Daoust <fd@w3.org>
Date: Thu, 23 Jun 2011 11:01:30 +0200
To: Jan Linden <jtlinden@google.com>
CC: Harald Alvestrand <harald@alvestrand.no>, "public-webrtc@w3.org" <public-webrtc@w3.org>
Message-ID: <4E0300EA.2040300@w3.org>

Hi Jan,

Both specs may be independent, but they don't really look that way at first glance (from a non-expert viewpoint at least). Confusion for JavaScript developers could also arise from having to choose between two similar solutions for processing sound. Anyway, that's not really the point here.

What I'm getting from your comment is that an API based on the Stream Processing API would be more appropriate for real-time communications, as it's more straightforward to deal with streams when... well... streaming to some other peer. Sound processing, e.g. to cancel echo or mix sounds, would still be possible with that approach. That sounds reasonable.

I don't know if others in this group have reviewed the specs mentioned. Any other views?

Francois.


On 06/22/2011 04:28 PM, Jan Linden wrote:
> I have discussed this with some colleagues and our view is that the Stream Processing API and the Web Audio API are mostly independent, both have a reason to exist, and probably are better kept separate for now.
>
> Our initial implementation of a possible WEBRTC API using the PeerConnection API is based on the Stream Processing API. In the short term, the Web Audio API is useful for generating ringtones etc. in WEBRTC apps, and in the long run it could be good to tie the two together, e.g. to add sound effects to real-time calls, but that is not a basic use case for either WEBRTC or the Web Audio API.
>
> We think trying to combine them into some kind of mega-API would create quite a mess: it would probably serve neither of these problem domains very well, and it may also create unnecessary confusion for JavaScript developers.
>
> Jan.
>
> On Fri, Jun 17, 2011 at 3:19 AM, Francois Daoust <fd@w3.org> wrote:
>
>     On 06/15/2011 05:14 PM, Harald Alvestrand wrote:
>
>         Team,
>
>         as stated in our charter, our deliverables say:
>
>         The working group will deliver specifications that cover at least the
>         following functions, unless they are found to be fully specified
>         within other working groups' finished results:
>
>         Media Stream Functions - API functions to manipulate media streams for
>         interactive real-time communications, connecting various processing
>         functions to each other, and to media devices and network connections,
>         including media manipulation functions for e.g. allowing to
>         synchronize streams.
>         Audio Stream Functions - An extension of the Media Stream Functions to
>         process audio streams, to enable features such as automatic gain
>         control, mute functions and echo cancellation.
>
>
>     There is an active discussion in the Audio WG right now [1] around what document to use as initial input for the Audio WG. Two proposals are discussed for the time being:
>
>     1/ The Stream Processing API
>       proposed by Robert O'Callahan, Mozilla
>     http://hg.mozilla.org/users/rocallahan_mozilla.com/specs/raw-file/tip/StreamProcessing/StreamProcessing.html
>     It is based on the Stream API as defined in WHATWG, extended to cover more requirements:
>     http://www.whatwg.org/specs/web-apps/current-work/webrtc.html#stream-api
>
>     2/ the Web Audio API
>       proposed by Chris Rogers, Google
>     http://chromium.googlecode.com/svn/trunk/samples/audio/specification/specification.html#AudioDestinationNode-section
>     This proposal designs a modular audio processing graph that allows "connecting various processing functions to each other" to quote the charter extract Harald mentioned.
>
>     Both proposals cover at least some of our needs.
>
>     Francois.
>
>     [1] see thread starting at: http://lists.w3.org/Archives/Public/public-audio/2011AprJun/0102.html
>
>
>
>
> --
> Jan Linden, PM WebRTC
> Google Voice: (415) 690-7610
Received on Thursday, 23 June 2011 09:02:03 UTC