- From: Samuel Goldszmidt <samuel.goldszmidt@ircam.fr>
- Date: Mon, 30 Jan 2012 19:04:31 +0100
- To: public-audio@w3.org
- Message-ID: <4F26DBAF.6080902@ircam.fr>
Hi all,

Here are some comments and questions about the Audio Working Group specs that I would like to share and discuss with you. I hope I have not made too many misinterpretations of the specifications, so please feel free to correct me where I have misunderstood.

(This is my first post here. I work at Ircam, which is, in part, a scientific institute where we do research on audio [http://www.ircam.fr/recherche.html?L=1]. I am a multimedia/web engineer, and for some experimental projects I use the audio tag and HTML5. For research projects and integration purposes I have to go a step further, so I read both API proposals with attention.)

Concerning the Web Audio API by Chris Rogers: I see some kind of connection with graphical audio programming tools like PureData or Max/MSP, 'without the interface' (which in my opinion is great). Have you experimented with these kinds of tools? (They are specially designed for real-time audio processing.)

Concerning the MediaStream Processing API by Robert O'Callahan: first, you talk about *continuous real-time* media. At Ircam we work on these questions, and maybe the term *real time* is too restrictive, or maybe we are not talking about the same thing. Sometimes audio processing/treatments cannot be done in real time:

* some analyses/treatments can be performed faster than real time, for instance spectral/waveform representations (which are in the Use Cases);
* in the opposite direction, some treatments cannot be done in real time: for instance, you cannot write an algorithm that 'mutes the sound when playback reaches its middle' if you do not know the length of the sound, because it is played live (do you follow me?).

Sometimes we need to perform actions in 'delayed time'. That is why I do not understand the importance of the term 'real time' here.

I agree that named effects should be in a 'level 2' specification. I think there is no effect ontology that everybody agrees on, so one important thing is to have a 'generic template' for effect/treatment/processing of sound. For example, we could have more than just one algorithm for a reverb, and it would be great to have these algorithms available as JavaScript 'AudioNode's (we could also have audio engines with different implementations in JavaScript).

For spatialization effects, I do not know how the number of output channels could be taken into consideration. Two points I would like to discuss with you:

* the possibility to have, on a device, 'more than just stereo restitution', depending on the connected hardware;
* maybe a use case that, in the manner of MediaQueries, could adapt the audio restitution to the device (how many channels, headphones or speakers, ...).

"To avoid interruptions due to script execution, script execution can overlap with media stream processing": does this mean that we could deal here not only with a sort of asynchronous processing (a worker), but also have a 'rendering process' that walks through the entire file, while another process works in 'delayed time'?

(One last question: for MediaStream extensions, as for the effects that would be in the level 2 specification, wouldn't it be better to have one overall createProcessor method for both worker and non-worker processors?)

To make some of these points more concrete, here are a few rough sketches; please correct anything I have got wrong about the current drafts.
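First, the 'patch without the interface' idea. This is a minimal sketch of how I read the Web Audio API draft: a graph where connect() calls play the role of the patch cords in Max/MSP or PureData. I am assuming the node and method names from my reading of the current draft (createGainNode, noteOn, etc.), and browsers may require a vendor prefix on the context constructor.

// A tiny 'patch': source -> filter -> gain -> output.
var context = new AudioContext();            // may be webkitAudioContext in practice

var source = context.createBufferSource();   // plays an AudioBuffer
var filter = context.createBiquadFilter();   // e.g. a lowpass 'object'
var gain   = context.createGainNode();       // an output 'fader'

// The connect() calls are the equivalent of patch cords.
source.connect(filter);
filter.connect(gain);
gain.connect(context.destination);

gain.gain.value = 0.5;
// source.buffer = someDecodedAudioBuffer;   // loaded elsewhere
source.noteOn(0);                            // start playback (draft name)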
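About the 'generic template' for effects/treatments: as far as I understand the Web Audio API draft, a custom treatment written in JavaScript would look roughly like the sketch below, with a JavaScript node whose onaudioprocess callback fills the output buffer. I am assuming the createJavaScriptNode() signature and the inputBuffer/outputBuffer members of the event from my reading of the draft, so please correct me if I am wrong. The body of the loop is where one reverb algorithm or another could live.

// A custom 'effect' written in JavaScript: here only a hand-made gain,
// but it could be any DSP algorithm (one of several reverbs, an analysis, ...).
var context = new AudioContext();
var customEffect = context.createJavaScriptNode(4096, 1, 1); // bufferSize, inputs, outputs

customEffect.onaudioprocess = function (event) {
  var input  = event.inputBuffer.getChannelData(0);
  var output = event.outputBuffer.getChannelData(0);
  for (var i = 0; i < input.length; i++) {
    output[i] = input[i] * 0.5;   // replace with the actual treatment
  }
};

// someSource.connect(customEffect);
// customEffect.connect(context.destination);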
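About restitution and the number of output channels: I do not know what either draft currently exposes for this, so the following is only a hypothetical sketch of the kind of adaptation I have in mind. The numberOfChannels attribute on the destination is an assumption from my reading of the Web Audio API draft, and the two helper functions and the media-query syntax are invented for the example.

// Hypothetical: adapt the rendering to what the device can actually restitute.
var context = new AudioContext();
var channels = context.destination.numberOfChannels; // assumption: exposed by the destination

if (channels >= 6) {
  buildMultichannelSpatialization(context);   // invented helper: real 5.1 panning
} else {
  buildStereoDownmix(context);                // invented helper: stereo fallback
}

// And, in the manner of MediaQueries (invented syntax):
// @media (audio-channels: 6) and (audio-output: speakers) { ... }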
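Finally, to be clearer about my last question: what I imagine is one overall createProcessor entry point covering both the worker case and the non-worker (named effect) case. This is only a hypothetical sketch built from the names used in this message, not the actual MediaStream Processing API draft, whose exact methods may differ.

// Hypothetical: a single createProcessor() for both cases (names are illustrative).

// 1) Non-worker case: a 'level 2' named effect, processed by the engine.
var reverbStream = mediaStream.createProcessor({ effect: "reverb" });

// 2) Worker case: a custom treatment running off the main thread,
//    which could also walk through the whole media in 'delayed time'.
var worker = new Worker("my-processing.js");
var customStream = mediaStream.createProcessor({ worker: worker });

// Inside my-processing.js (also hypothetical):
// onprocessmedia = function (event) {
//   // read the input samples from the event, write the output samples
// };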
Finally, and correct me if I am wrong, the main difference I have seen between the Web Audio API by Chris Rogers and the MediaStream Processing API by Robert O'Callahan is that in the second one, all media processing is more closely linked to DOM objects (media elements in this case) than in the first one (although the graph of the first API seems to me much easier to understand at first sight), which makes sense from my point of view. For the moment I have not read the whole mailing list, but I am going to do it.

Regards,
Samuel Goldszmidt
Received on Tuesday, 31 January 2012 12:15:35 UTC