- From: Chris Rogers <crogers@google.com>
- Date: Sat, 11 Feb 2012 11:24:28 -0800
- To: Michael Schöffler <michael.schoeffler@audiolabs-erlangen.de>
- Cc: public-audio@w3.org
- Message-ID: <CA+EzO0nQRZ4z1xTs1nfm9-7J_p0MNMuYQdTuyFi-nMgzKb727g@mail.gmail.com>
Hi Michael, thanks for your comments. I really appreciate your feedback!

On Fri, Feb 10, 2012 at 4:53 AM, Michael Schöffler
<michael.schoeffler@audiolabs-erlangen.de> wrote:

> Hello everyone,
>
> my name is Michael Schoeffler and I'm a Ph.D. student at the AudioLabs
> Erlangen. I'm currently working on a framework for signal processing
> plugins. The idea is to have something similar to VST or Audio Units, but
> fully web-based. The chances are not bad that some of my co-workers will
> also use this framework for developing their plugins. So maybe I'll be
> able to provide a lot of feedback to this group in the near future :)
>
> On topic:
> When I started developing with the Web Audio API, my first thought was
> "Great API, but it seems to be too high-level", and my opinion hasn't
> changed yet. For example, the multichannel system handling is too
> high-level for my use cases.

I'd be interested in hearing more details about what limitations you're
seeing in the current proposal. Aside from the panning system and speaker
layouts you mention below, the API offers you direct access to every single
channel with the AudioChannelSplitter and AudioChannelMerger. Please note my
comment in the spec: "note: this upper limit of 6 is arbitrary and could be
increased to support 7.2, and higher"

> As I understand it, the API focuses on the "mainstream" systems like mono,
> stereo and 5.1 and does automatic up/down-mixing on the connection between
> two AudioNodes.

The up-mixing is very important. It's not uncommon for games and interactive
applications to load a mixture of mono and stereo audio assets which need to
be mixed and processed together seamlessly. The developer doesn't need to be
bothered with the channel details, so the "right thing just happens". If we
didn't do this, the developer wouldn't have nearly as convenient a system
and wouldn't be able to connect multiple heterogeneous sources to a filter
for processing. Instead, the developer would have to check individual
sources, twiddle channels, and create multiple low-level processing modules
for each channel. In short, the developer would have to manage many more
low-level details than is currently necessary with the API.

But that doesn't mean that we sacrifice low-level control. If the developer
wants, multi-channel sources can be broken down into component channels with
individual processing on each channel, etc. But we don't *force* developers
to work at that level.

> In the source code of Google Chrome I found many terms related to these
> three "mainstream" systems. But I think other systems are getting more
> important. A lot of research is done on rendering 22.2 to 5.1, 7.1 to
> stereo, 5.1 to stereo and so on. So even huge multichannel systems could
> be relevant for mobile devices at some point.

That's great! And I don't see why we can't support them. Please see my
comments above about being able to break down multi-channel sources into
component channels. Please note that the default down-mix code would not be
forced on anybody wishing to exert a finer level of control. For example, in
a custom and specialized 5.1 -> stereo down-mix, AudioChannelSplitter can be
used to access each individual channel and perform arbitrary processing to
render stereo. The AudioGainNode, AudioPannerNode, BiquadFilterNode, and
ConvolverNode can be used to implement quite a rich set of down-mixing
algorithms. And if that's not enough, then a JavaScriptAudioNode could be
used.
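A minimal sketch of such a custom 5.1 -> stereo down-mix, assuming the
draft-era names used in this message (webkitAudioContext, createGainNode(),
createChannelSplitter(), createChannelMerger(), noteOn()); the channel
ordering, the channel-count arguments, the mixing gains, and the
fiveDotOneBuffer variable are illustrative assumptions rather than anything
mandated by the specification text quoted here:

var context = new webkitAudioContext();

// A 6-channel (5.1) source; the buffer is assumed to be decoded elsewhere,
// with channel order L, R, C, LFE, SL, SR.
var source = context.createBufferSource();
source.buffer = fiveDotOneBuffer;

var splitter = context.createChannelSplitter(6); // one output per channel
var merger = context.createChannelMerger(2);     // inputs: 0 = left, 1 = right
source.connect(splitter);

// Route one source channel into one side of the stereo mix with a given gain.
// Multiple connections to the same merger input are summed.
function mix(channel, side, value) {
  var g = context.createGainNode();
  g.gain.value = value;
  splitter.connect(g, channel);
  g.connect(merger, 0, side);
}

mix(0, 0, 1.0);    // L  -> left
mix(1, 1, 1.0);    // R  -> right
mix(2, 0, 0.707);  // C  -> left
mix(2, 1, 0.707);  // C  -> right
mix(4, 0, 0.707);  // SL -> left
mix(5, 1, 0.707);  // SR -> right
// (LFE, channel 3, is simply dropped in this sketch.)

merger.connect(context.destination);
source.noteOn(0);  // draft-era equivalent of start(0)

The same splitter/gain/merger pattern scales to larger layouts such as 7.1
or 22.2, with a JavaScriptAudioNode available for anything that cannot be
expressed with the built-in nodes.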
> Another example would be spatialization. It is directly integrated in the
> API. There are tons of approaches to how convolution could be implemented:
> in the time domain, in the frequency domain, uniformly, non-uniformly,
> combined with psychoacoustic hints, and so on. Each approach has its
> advantages and disadvantages.

I think you're mixing up concepts a little bit here. Implementing
convolution with time-domain or frequency-domain algorithms is entirely an
implementation detail, and does not affect the API. For example, in the
ConvolverNode, an impulse response is given and the node is expected to
perform the convolution, which is a mathematically precise operation. Yes,
internally it could be using time-domain or frequency-domain algorithms, but
that doesn't change how the API appears to the developer.

Convolution is a *very* widely used technique, proving its usefulness in all
kinds of real-world audio processing applications:

* major motion picture production
* game audio
* music production

But convolution is different from the general term "spatialization". The
AudioPannerNode implements spatialization, and supports more than one
algorithm:

partial interface AudioPannerNode {
    // Panning model
    const unsigned short EQUALPOWER = 0;
    const unsigned short HRTF = 1;
    const unsigned short SOUNDFIELD = 2;
};

These algorithms are very useful, especially for games and interactive
applications. But the API is certainly not locked into these models and
could be extended with additional ones. It would be great to hear from you
if there's a commonly used model that we're missing here. But even if we
miss some in the beginning, the API is extensible with additional constants.
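A minimal sketch of how these pieces fit together, again assuming the
draft-era names used in this message (webkitAudioContext, createPanner(),
createConvolver(), noteOn()); the dryBuffer and impulseResponseBuffer
variables and the position values are placeholder assumptions:

var context = new webkitAudioContext();

var source = context.createBufferSource();
source.buffer = dryBuffer;                 // assumed: a decoded AudioBuffer

// Spatialization: pick one of the panning models defined above.
var panner = context.createPanner();
panner.panningModel = panner.HRTF;         // EQUALPOWER, HRTF or SOUNDFIELD
panner.setPosition(3, 0, -1);              // place the source in 3-D space

// Convolution: the impulse response defines the processing; whether the
// implementation convolves in the time or frequency domain is invisible here.
var convolver = context.createConvolver();
convolver.buffer = impulseResponseBuffer;  // assumed: an impulse-response AudioBuffer

source.connect(panner);
panner.connect(convolver);
convolver.connect(context.destination);

source.noteOn(0);                          // draft-era equivalent of start(0)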
> For myself, I would not use the API function. I would build a library
> that offers all the approaches. Performance is maybe a problem, but I
> would rely e.g. on the WebCL WG, so that the performance argument doesn't
> count anymore.

Good luck with WebCL! It *may* one day become a standard, but that doesn't
appear likely anytime soon. I haven't seen even prototypes of useful
high-performance audio systems built with WebCL, and I don't believe it will
be a good fit for developing general-purpose, high-quality and performant
audio processing.

> Nonetheless the Web Audio API is already very usable for me. So thanks to
> the guys that worked hard on it so far.

Thanks Michael, it was my intention to make it very usable and practical for
real-world applications now! And I'm hearing good things from music and game
developers who are using it today.

Regards,
Chris

> The idea of a "Level 1" specification sounds very interesting to me.
>
> Best Regards,
>
> Michael
>
> --
> Michael Schoeffler, M.Sc.
>
> International Audio Laboratories Erlangen (AudioLabs)
> University of Erlangen-Nuremberg & Fraunhofer IIS, Audio & Multimedia
> Am Wolfsmantel 33
> 91058 Erlangen
> Germany
>
> Tel.: +49 9131 85-20515
> Skype: michael.schoeffler
> michael.schoeffler@audiolabs-erlangen.de
> http://www.audiolabs-erlangen.de/

Received on Saturday, 11 February 2012 19:24:58 UTC