
Re: Discussion on Audio API in Audio WG

From: Jan Linden <jtlinden@google.com>
Date: Thu, 23 Jun 2011 06:01:20 -0700
Message-ID: <BANLkTinjwTEAGkTxjnJONjLW3wAC5Fw68Q@mail.gmail.com>
To: Stefan Håkansson LK <stefan.lk.hakansson@ericsson.com>
Cc: public-webrtc@w3.org
Stefan and Francois,

What I'm trying to say is that, in my opinion, both proposals have their
merits and it doesn't necessarily make much sense to discuss them in the
WEBRTC WG. I am confident that whatever the Audio WG comes up with will
address the needs in this area, and we should focus on the API for WEBRTC.

Or, to put it another way: I am afraid that if the current direction of
WEBRTC (which is at an early stage and may take major turns along the way)
is used as a major argument for either proposal, there is a risk that the
result is not the best solution for the most important use cases.



On Thu, Jun 23, 2011 at 2:38 AM, Stefan Håkansson LK <
stefan.lk.hakansson@ericsson.com> wrote:

> Francois, Jan, list (in role of contributor),
> From my viewpoint a problem with the second (Web Audio) proposal below is
> that it is not well integrated with existing/discussed tools. There is a
> section on how it should work with audio and video elements, but that
> section sort of concedes that it is difficult to do a clean integration. How to
> work with Streams is totally unclear.
> The first proposal (Stream Processing) is integrated with Stream and works
> cleanly with audio and video elements. It can also be noted that some scenarios
> (stream to peer, mixing, spatializing) match use cases in
> <http://datatracker.ietf.org/doc/draft-holmberg-rtcweb-ucreqs/?include_text=1>
> quite closely.
> So from my perspective the Stream Processing API is of more interest.
> Stefan
> On 2011-06-23 11:01, Francois Daoust wrote:
>> Hi Jan,
>> Both specs may be independent but they don't really look like they are at
>> first glance (from a non expert viewpoint at least). Confusion for
>> JavaScript developers could also emerge from having to choose between two
>> similar solutions to process sounds. Anyway, that's not really the point
>> here.
>> What I'm getting from your comment is that an API based on the Stream
>> Processing API would be more appropriate for real-time communications, as
>> it's more straightforward to deal with streams when... well... streaming to
>> some other peer. Sound processing, e.g. to cancel echo or mix sounds, would
>> still be possible with that approach. That sounds reasonable.
>> I don't know if others in this group have reviewed the specs mentioned.
>> Any other view?
>> Francois.
>> On 06/22/2011 04:28 PM, Jan Linden wrote:
>>> I have discussed this with some colleagues and our view is that the
>>> Stream Processing API and the Web Audio API are mostly independent, both
>>> have a reason to exist, and probably are better kept separate for now.
>>> Our initial implementation of a possible WEBRTC API using the
>>> PeerConnection API is based on the Stream Processing API. In the short term
>>> the Web Audio API is useful to generate ringtones etc in WEBRTC apps, and in
>>> the long run it could be good to tie them together to add sound effects to
>>> real time calls etc - but that is not a basic use case for either WEBRTC or
>>> the Web Audio API.
>>> We think it would create quite a mess trying to combine them into some
>>> kind of mega-API, which would probably solve neither of these problem domains
>>> very well and might also create unnecessary confusion for JavaScript
>>> developers.
>>> Jan.
>>> On Fri, Jun 17, 2011 at 3:19 AM, Francois Daoust <fd@w3.org> wrote:
>>>     On 06/15/2011 05:14 PM, Harald Alvestrand wrote:
>>>         Team,
>>>         as stated in our charter, our deliverables say:
>>>         The working group will deliver specifications that cover at least the
>>>         following functions, unless they are found to be fully specified
>>>         within other working groups' finished results:
>>>         Media Stream Functions - API functions to manipulate media streams for
>>>         interactive real-time communications, connecting various processing
>>>         functions to each other, and to media devices and network connections,
>>>         including media manipulation functions for e.g. allowing to
>>>         synchronize streams.
>>>         Audio Stream Functions - An extension of the Media Stream Functions to
>>>         process audio streams, to enable features such as automatic gain
>>>         control, mute functions and echo cancellation.
>>>     There is an active discussion in the Audio WG right now [1] around
>>> what document to use as initial input for the Audio WG. Two proposals are
>>> discussed for the time being:
>>>     1/ The Stream Processing API
>>>       proposed by Robert O'Callahan, Mozilla
>>>     http://hg.mozilla.org/users/rocallahan_mozilla.com/specs/raw-file/tip/StreamProcessing/StreamProcessing.html
>>>     It is based on the Stream API as defined in WHATWG, extended to cover
>>> more requirements:
>>>     http://www.whatwg.org/specs/web-apps/current-work/webrtc.html#stream-api
>>>     2/ the Web Audio API
>>>       proposed by Chris Rogers, Google
>>>     http://chromium.googlecode.com/svn/trunk/samples/audio/specification/specification.html#AudioDestinationNode-section
>>>     This proposal designs a modular audio processing graph that allows
>>> "connecting various processing functions to each other" to quote the charter
>>> extract Harald mentioned.
>>>     Both proposals cover at least some of our needs.
>>>     Francois.
>>>     [1] see thread starting at: http://lists.w3.org/Archives/Public/public-audio/2011AprJun/0102.html

Jan Linden, PM WebRTC
Google Voice: (415) 690-7610
Received on Thursday, 23 June 2011 13:01:45 UTC
