Re: Newbie questions about web audio working group specs from Chris Rogers on 2012-01-31 (public-audio@w3.org from January to March 2012)

From: Chris Rogers <crogers@google.com>
Date: Tue, 31 Jan 2012 14:28:53 -0800
To: Samuel Goldszmidt <samuel.goldszmidt@ircam.fr>
Cc: public-audio@w3.org
Message-ID: <CA+EzO0=8F06V=Bc6TuXqUjT_J4SvM-UgfxiCxQRAT3hJC9sW1Q@mail.gmail.com>
On Mon, Jan 30, 2012 at 10:04 AM, Samuel Goldszmidt <samuel.goldszmidt@ircam.fr> wrote:

>  Hi all,
>
> Here are some comments and questions about the web audio working group
> specs that I would like to share and discuss with you.
> I hope I have not misinterpreted the specifications too much, so
> feel free to correct me where I misunderstood.
>
> (This is my first post here. I work at Ircam, which is, in part, a
> scientific institute where we do research on audio [
> http://www.ircam.fr/recherche.html?L=1].
> I'm a multimedia/web engineer, and, for some experimental projects, I use
> the audio tag and HTML5.
> For research projects and integration purposes, I have to go a step
> further, and I read both API proposals attentively.)
>
> Concerning Web Audio API by Chris Rogers:
>
> I see some kind of connection with graphical audio programming tools like
> PureData or Max/MSP, 'without the interface' (which in my opinion is
> great).
> Do you have experience with these kinds of tools? (They are specially
> designed for real-time audio processing.)
>

Hi Samuel, thanks for having a look at the specification!  I used to work
at IRCAM, where I designed AudioSculpt, and also worked on SVP, Chant, etc.
I'm very familiar with tools like PureData and Max/MSP, and was even at
IRCAM at the same time that Miller Puckette was doing real-time work
with the ISPW platform and Max.  I've spent most of my career working on
graph-based audio architecture and DSP.


>
> Concerning the MediaStream Processing API by Robert O'Callahan:
>
> First, you talk about *continuous real-time* media. At Ircam, we work on
> these questions, and maybe the term *real time* is too restrictive, or
> maybe we are not talking about the same thing. Sometimes, audio
> processing/treatments can't be done in real time:
> * some analysis/treatments can be performed faster than real time, for
> instance, spectral/waveform representations (which are in the Use Cases)
> * in the opposite direction, some treatments can't be done in real time:
> for instance, you can't write an algorithm which 'mutes the sound when the
> user reaches its midpoint' if you don't know the length of the sound
> because it's being played live (do you follow me?). Sometimes, we need to
> perform actions in 'delayed time'. That's why I don't understand the
> importance of the term 'real time' here.
>
> I agree that named effects should be in a 'level 2' specification. I
> think that there is no effect ontology that everybody agrees on, so one
> important thing is to have a 'generic template' for sound
> effects/treatments/processing. For example, we could have more than
> one algorithm to program a reverb, and it would be great to have
> these algorithms available as JavaScript 'AudioNode's (we could also
> have audio engines with different implementations in JavaScript).
>
> For spatialization effects, I don't know how the number of output channels
> could be taken into consideration. Two points I'd like to discuss with you:
> * the possibility of having, on a device, "more than just stereo"
> reproduction, depending on the connected hardware,
> * maybe a use case that, in the manner of Media Queries, could adapt the
> audio reproduction to the device (how many channels, headphones or
> speakers, ...)
>
> "To avoid interruptions due to script execution, script execution can
> overlap with media stream processing;": does this mean that we could
> have not only a sort of asynchronous processing (worker), but also a
> 'rendering process' that walks through the entire file, and another process
> that uses 'delayed time'?
>
> (One last question: for mediastream extensions, as for effects that would
> be in a level 2 specification, wouldn't it be better to have a single
> createProcessor method for both worker and non-worker processors?)
>
> Finally, correct me if I'm wrong: the main difference I have seen between
> the Web Audio API by Chris Rogers and the MediaStream Processing API by
> Robert O'Callahan is that in the second, all media processing is more
> closely linked to DOM objects (media elements in this case) than in the
> first (although the graph of the first API seems to me much easier to
> understand at first), which makes sense from my point of view.
>

The Web Audio API also has a relationship with HTMLMediaElement,
implemented as MediaElementAudioSourceNode.  You can see an example of its
use, via the createMediaElementSource() method, in this section:
https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/specification.html#DynamicLifetime-section
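
As a rough illustration of that pattern (a sketch, not the example from the
specification): a media element is wrapped in a source node, which is then
connected to the context's destination. The NodeLike and ContextLike
interfaces and the routeMediaElement helper are hypothetical names introduced
here so the sketch does not depend on browser typings; only
createMediaElementSource(), connect() and destination come from the API.

```typescript
// Minimal structural types (assumptions for this sketch). They mirror only
// the pieces of the Web Audio API that are used below.
interface NodeLike {
  connect(destination: NodeLike): unknown;
}

interface ContextLike {
  destination: NodeLike;
  createMediaElementSource(element: unknown): NodeLike;
}

// Wrap a media element in a source node and connect it to the context's
// destination. `routeMediaElement` is a hypothetical helper name, not part
// of the API.
function routeMediaElement(context: ContextLike, element: unknown): NodeLike {
  const source = context.createMediaElementSource(element);
  source.connect(context.destination);
  return source;
}
```

In a browser, this would presumably be called with an audio context and an
<audio> or <video> element; processing nodes could then be placed between the
returned source node and the destination.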

There's also an initial proposal for integration with the WebRTC API:
https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/webrtc-integration.html

I presented this proposal to the WebRTC working group at the 2011 TPAC
meeting.  At the meeting we discussed some details, such as how the proposal
could be further refined using MediaStreamTracks.

Cheers,
Chris
Received on Tuesday, 31 January 2012 22:29:34 UTC