Re: Reviewing the Web Audio API (from webrtc)

Hi Stefan, thanks for your comments.  I'll try to give some answers inline
below.

For reference, please see this provisional document describing
scenarios for WebRTC / Web Audio interaction.
It is the same one that I posted to the list last year and
presented to the WebRTC working group at TPAC 2011:
https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/webrtc-integration.html

As we discussed at TPAC, this document would need some small changes to
handle MediaStreamTrack objects explicitly instead of MediaStreams.
Please note that these extensions to the Web Audio API (from the
webrtc-integration document above) are not *yet* in the main Web Audio
API specification document.  I would like some help from the WebRTC group
to finish this part of the API.

Please see the rest of my comments inline below.

Thanks,
Chris

On Wed, Mar 28, 2012 at 7:05 PM, Wei, James <james.wei@intel.com> wrote:

> From: Stefan Hakansson LK <stefan.lk.hakansson@ericsson.com>
> Date: Wed, 28 Mar 2012 20:54:38 +0200
> Message-ID: <4F735E6E.7070706@ericsson.com>
> To: "public-webrtc@w3.org" <public-webrtc@w3.org>
>
> I've spent some time looking at the Web Audio API [1] since the Audio WG
> have put out a call for review [2].
>
> As a starting point I used the related reqs from our use-case and req
> document [3]:
>
> F13             The browser MUST be able to apply spatialization
>                     effects to audio streams.
>
> F14             The browser MUST be able to measure the level
>                     in audio streams.
> F15             The browser MUST be able to change the level
>                     in audio streams.
>
> with the accompanying API reqs:
>
> A13             The Web API MUST provide means for the web
>                     application to apply spatialization effects to
>                     audio streams.
> A14             The Web API MUST provide means for the web
>                     application to detect the level in audio
>                     streams.
> A15             The Web API MUST provide means for the web
>                     application to adjust the level in audio
>                     streams.
> A16             The Web API MUST provide means for the web
>                     application to mix audio streams.
>
> Looking at the Web Audio API, and combining it with the MediaStream
> concept we use, I come to the following understanding:
>
> 1) To make the audio track(s) of MediaStream(s) available to the Web
> Audio processing blocks, the MediaElementAudioSourceNode Interface
> would be used.
>

Generally, it would be a "MediaStreamAudioSourceNode" (or perhaps a
"MediaStreamTrackAudioSourceNode"); please see example 3 in
https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/webrtc-integration.html
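
For illustration, a minimal sketch along those lines (createMediaStreamSource()
is the extension from the webrtc-integration draft, not yet in the main
Web Audio spec; "stream" would come from getUserMedia() or a remote peer):

// Route the audio of a MediaStream into the Web Audio graph and
// spatialize it (F13 / A13) before playing it out locally.
var context = new AudioContext();

function connectStreamToGraph(stream) {
    var source = context.createMediaStreamSource(stream);

    var panner = context.createPanner();
    panner.setPosition(1, 0, 0);

    source.connect(panner);
    panner.connect(context.destination);  // the local speakers
}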


> 2) Once that is done, the audio is available to the Web Audio API
> toolbox, and anything we have requirements on can be done
>
> 3) When the processing has been done (panning, measure level, change
> level, mix) the audio would be played using an AudioDestinationNode
> Interface
>

The AudioDestinationNode represents the local client's "speakers".  In
example 3, a MediaStream destination node is created with the line:

var peer = context.createMediaStreamDestination();

which is used to send the processed audio to the remote peer.
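
As a rough sketch of that pattern (the createMediaStreamDestination()
extension and its .stream attribute are taken from the webrtc-integration
draft; peerConnection.addStream() is the PeerConnection API as it stood at
the time):

// Mix two incoming streams (A16) and send the mix to the remote peer
// instead of to the local speakers (context.destination).
var context = new AudioContext();

function mixAndForward(streamA, streamB, peerConnection) {
    var sourceA = context.createMediaStreamSource(streamA);
    var sourceB = context.createMediaStreamSource(streamB);

    // Connecting several nodes to the same destination mixes them.
    var peer = context.createMediaStreamDestination();
    sourceA.connect(peer);
    sourceB.connect(peer);

    // peer.stream is a MediaStream carrying the mixed audio.
    peerConnection.addStream(peer.stream);
}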


> What is unclear to me at present is how synchronization would work. So
> far we have been discussing in terms of all tracks in a MediaStream
> being kept in sync; but what happens when the audio tracks are routed to
> another set of tools, and not played in the same (video) element as the
> video?
>

HTMLMediaElements already have a mechanism for synchronization using the
HTML5 MediaController API.  Live MediaStreams (from a local camera/mic or
from remote peers) would maintain synchronization (I assume you mean
audio/video sync in this case).  The Web Audio API would just be used to
apply effects, without changing the synchronization.
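
For what it's worth, a sketch of the MediaController mechanism mentioned
above, as specified in HTML5 at the time (implementation support may vary):

// Two media elements that share a controller are kept in lockstep by
// the user agent; the Web Audio graph sits outside this and only
// processes the audio it is fed.
var video = document.querySelector('video');
var audio = document.querySelector('audio');

var controller = new MediaController();
video.controller = controller;
audio.controller = controller;

controller.play();  // starts both elements under the common controller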


> Another takeaway is that the processing can only happen in the browser
> that is going to play the audio, since there is no way to go from an
> AudioNode to a MediaStream or MediaStreamTrack.
>

There are some examples showing how to go from an AudioNode to a MediaStream
(this still needs to be developed for MediaStreamTrack).
Please look especially at the use of the createMediaStreamSource()
and createMediaStreamDestination() methods in
https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/webrtc-integration.html
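
So, as a sketch of that point, the processing could also be done in the
*sending* browser, with the graph's output leaving as a MediaStream
(the getUserMedia() call shape and the sendViaPeerConnection() helper are
only illustrative, and the createMediaStream* methods are the draft
extensions discussed above):

// Capture the microphone, change the level (A15), and hand the processed
// audio on as an ordinary MediaStream.
var context = new AudioContext();

navigator.getUserMedia({ audio: true }, function (micStream) {
    var source = context.createMediaStreamSource(micStream);

    var gain = context.createGainNode();  // level adjustment (A15)
    gain.gain.value = 0.5;

    var dest = context.createMediaStreamDestination();
    source.connect(gain);
    gain.connect(dest);

    // dest.stream can be attached to a PeerConnection like any other
    // MediaStream; sendViaPeerConnection() is a hypothetical helper.
    sendViaPeerConnection(dest.stream);
}, function (error) {
    // getUserMedia failure (permission denied, no device, ...)
});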


> Anyone else that has looked into the Web Audio API? And any other
> conclusions?
>
> I think we should give feedback from this WG (as we have some reqs that
> are relevant).
>
> Br,
> Stefan
>
> [1] http://www.w3.org/TR/2012/WD-webaudio-20120315/
> [2] http://lists.w3.org/Archives/Public/public-webrtc/2012Mar/0072.html
> [3] http://datatracker.ietf.org/doc/draft-ietf-rtcweb-use-cases-and-requirements/?include_text=1
>
> Best Regards
>
> James

Received on Thursday, 29 March 2012 02:33:32 UTC