- From: Steven Yi <stevenyi@gmail.com>
- Date: Sat, 31 Jan 2015 01:57:07 +0100
- To: public-audio@w3.org
Hello All,

First, it was a great pleasure to be at the Web Audio Conference. I enjoyed the sessions and gigs, and getting to meet the members of the community that I did. Cheers to IRCAM and Mozilla for the lovely conference!

That said, I have some comments and questions about the Web Audio API and specification. (Note: these comments are in reference to the 06 January 2015 draft, found at http://webaudio.github.io/web-audio-api/.)

#1 - The specification is not clear to me about when a node becomes live. I assume it is when a node is connected to the active part of the audio graph that is "live" and processing. Since node creation and graph assembly are done in the JS main thread, it seems possible, in the following from "3.3 Example: Mixer with Send Busses", that nodes might get attached across buffers in the audio thread:

    compressor = context.createDynamicsCompressor();

    // Send1 effect
    reverb = context.createConvolver();
    // Convolver impulse response may be set here or later

    // Send2 effect
    delay = context.createDelay();

    // Connect final compressor to final destination
    compressor.connect(context.destination);

    // Connect sends 1 & 2 through effects to main mixer
    s1 = context.createGain();
    reverb.connect(s1);
    s1.connect(compressor);

    s2 = context.createGain();
    delay.connect(s2);
    s2.connect(compressor);

For example, could it be the case that "s1.connect(compressor)" above happens just before buffer n starts to generate, and "s2.connect(compressor)" happens such that it only takes effect when buffer n + 1 is generating? If this is the case, would connecting the compressor to context.destination at the end of the example, rather than at the beginning, guarantee that the graph of nodes connected to the compressor starts at the same time? If so, then maybe this aspect of node graph creation could be clarified, and the example in 3.3 updated so that the sub-graph of nodes is clearly formed before it is attached to the active audio graph.

#2 - Following from #1, what would happen if one is dynamically altering a graph to remove an intermediary node? For example, let's say one has a graph like:

    gain = context.createGain();
    compressor = context.createDynamicsCompressor();
    reverb = context.createConvolver();

    gain.connect(reverb);
    reverb.connect(compressor);
    compressor.connect(context.destination);

and later the user decides to remove the reverb with something like:

    reverb.disconnect();
    // gain.disconnect();
    gain.connect(compressor);

(Assuming the above uses a gain node as a stable node for other nodes to attach to.)

My question is: when do connect and disconnect happen? Do they happen at block boundaries? I assume they must, or a graph can get into a bad state if it changes while a block is being processed. Also, without the gain.disconnect(), will there be a hidden reference to the reverb from gain? (I guess a "connection" reference according to 2.3.3.) If so, this seems like it could be a source of a memory leak (assuming that the above object references to reverb are all cleared from the JS main thread side).

#3 - In "2.3.2 Methods", for an AudioNode connecting to another AudioNode, it is not clear whether fan-out/fan-in is supported. The documentation for connecting to AudioParams explicitly states that this is supported. Should the documentation for the first connect() method be clarified on this point for connections to nodes?
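For concreteness, the kind of patching I have in mind is something like the following (a hypothetical sketch; whether these node-to-node connections are merged the way the AudioParam documentation describes is exactly what I would like the text to spell out):

    var source = context.createBufferSource();
    var reverb = context.createConvolver();
    var delay = context.createDelay();
    var mix = context.createGain();

    // Fan-out: one output feeding two different nodes
    source.connect(reverb);
    source.connect(delay);

    // Fan-in: two nodes feeding a single input
    reverb.connect(mix);
    delay.connect(mix);

    mix.connect(context.destination);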
#4 - Also in regard to 2.3.2, the API of disconnect() seems odd, as it does not mirror connect(). connect() is given an argument saying which node or AudioParam to connect to; disconnect(), however, has no target argument, so it is not clear what it disconnects from. For example, if I connect a node to two different nodes and also to another node's parameter, then call disconnect(), what happens? As it is now, it does not seem possible to create a GUI editor where one could connect the output of a node to multiple other nodes/params, then click and disconnect a single connection.

#5 - In the music systems I've seen, event processing is done within the audio thread. This generally happens for each buffer, something like:

1. Process incoming messages
2. Process a priority queue of pending events
3. Handle audio input
4. Run the processing graph for one block
5. Handle audio output

I'm familiar with this from Csound's and SuperCollider's engines, as well as from the design of my own software synthesizer, Pink. (ChucK's design follows the same basic pattern, but on a sample-by-sample basis.)

As it stands today, the Web Audio API does not have any kind of reified event object. One can schedule some things, like automations via a param's setXXXatTime() methods, and have them run within the time of the audio engine, but there is nothing built in for events in the Web Audio API. Now, I have no issue with the Web Audio API not having a concrete event system, and I think it should not have one, as people have different notions of and needs for what is encoded in an event. However, I think there should be a way to create one's own event system, one that is clocked to the same audio system clock (i.e. run within the audio thread).

I was a bit concerned when, at the conference, there was mention of "A Tale of Two Clocks". A design that references two clocks cannot, by definition, allow a queue of events to be processed synchronously with the audio. If one formalizes event-processing functions and audio-processing functions as functions of time, then with two clocks you get two different time variables, ta and tb, which are not equivalent unless the clocks are proven to advance at exactly the same rate (i.e. ta0 == tb0, ta1 == tb1, ..., tan == tbn). However, the JS main thread and the audio thread do not run at the same rate, so we can at best implement some kind of approximation; it cannot be a formally correct solution.

Event processing in a thread other than the audio thread has problems. One mentioned at the conference was what to do with offline rendering, where the clock of an audio engine runs faster than realtime and may advance faster or slower in terms of wall-clock time while rendering, depending on how heavy the processing needs of the graph are. Second, I seem to remember hearing a problem during one of the concerts: when I turned off my phone's screen I continued to hear audio, but all events stopped, and then a number of events fired all at once when I turned the screen back on. The piece used an event scheduling system that ran in the JS main thread. I assume this situation is similar to what could happen with backgrounded tabs, but I'm not quite sure about all this. Either way, I think there are real problems here that need to be addressed.

This also leads to a bigger question: with Web Audio, if I run the same project twice using an event system that reifies graph modifications in time (which is what events in audio engines are mostly used for, i.e. allocate this graph of nodes and add it to the live audio graph), will I get the same result? Assuming one uses only referentially transparent nodes (i.e. no random calculations), I believe the only way to guarantee this is if the event system is processed as part of the audio thread.
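To make that concrete, here is a rough sketch of the kind of per-block event handling I mean. It is not tied to the Web Audio API at all; the renderBlock callback and the event objects are just stand-ins:

    // Events, kept ordered by start time in sample frames:
    // each entry looks like { time: <frame>, fire: function () { ... } }
    var pendingEvents = [];
    var blockSize = 128;
    var currentFrame = 0;

    function processBlock(renderBlock) {
      var blockEnd = currentFrame + blockSize;
      // Steps 1/2: fire every pending event that falls within this block,
      // in order, before the block's audio is computed.
      while (pendingEvents.length > 0 && pendingEvents[0].time < blockEnd) {
        pendingEvents.shift().fire();
      }
      // Steps 3/4/5: compute audio for this block using whatever state
      // the events just set up.
      renderBlock(currentFrame, blockSize);
      currentFrame = blockEnd;
    }

Because event time and audio time here are the same counter (currentFrame), running the same event list twice gives the same output; there is no second clock to drift against.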
Now, what can a user do with Web Audio to create their own event system that is in sync with the audio thread? Currently, there is the ScriptProcessorNode. Of course, the design of ScriptProcessorNode is deeply flawed for all the reasons discussed at the conference (security, inefficiency due to context switching, potential for breakups, etc.). However, what it does do is allow one to process events in sync with the audio thread, making it possible to build formally correct audio systems where event time is processed according to the same time as is used by the audio nodes. Additionally, according to those events, one can dynamically modify the graph (i.e. add new instances of a sub-graph of nodes to the live graph, representing a "note") via references to other nodes and the audio context. So while flawed in terms of performance and security, it does allow one to build correct systems that generate consistent output.

My concern is that there was discussion of not only deprecating ScriptProcessorNode, but removing it altogether. I would have no problem with this, except that from reading the current specification for AudioWorker, I do not see how it would be possible to create an event system with it. While one can pass messages to and from an AudioWorker, one has no access to the AudioContext. In that regard, one cannot, say, create new nodes within an AudioWorker and attach them to context.destination. I am not very familiar with transferables and what can be passed between the AudioWorker and the JS main thread via postMessage, but I assume AudioNodes cannot be made transferable.

At this point, I'm questioning what can be done. It seems AudioWorker's design is not meant for event processing (fair enough), and ScriptProcessorNode can only do this by accident rather than by design. Is there any solution to this problem with the Web Audio API moving forward? For example, would this group be willing to consider extending the API for non-audio nodes (processing nodes?)? If processing nodes could be added that have a larger context than what is proposed for AudioWorkerGlobalScope (say, access to the AudioContext and the ability to modify the audio node graph dynamically), I could see that as a solution that would allow building higher-level constructs like an event system.

#6 - For the AudioWorker specification, I think it would be useful to have clarification on when postMessage is processed. In 2.11.1.2, there is a link to "the algorithm defined by the Worker Specification". That in turn says: "The postMessage() method on DedicatedWorkerGlobalScope objects must act as if, when invoked, it immediately invoked the method of the same name on the port, with the same arguments, and returned the same return value." If messages are meant to be processed immediately, this can cause problems if the AudioWorker is already part of a live graph and values mutate while the worker is processing a block. I think it would be good to have clarification on this, perhaps with a recommendation that in onaudioprocess functions one should make a local copy of any mutable value and use that for the duration of onaudioprocess, to get a consistent result for the block.
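Concretely, I imagine the guidance looking something like the following. This is only a sketch of worker-side code: bitDepth, the message format, and the e.inputs[0][0] / e.outputs[0][0] buffer access are placeholders from memory, not anything the draft guarantees:

    // Worker-side sketch only; names and event shape are placeholders.
    var bitDepth = 8;   // mutable state the main thread changes via postMessage

    onmessage = function (e) {
      // This may arrive at any time relative to audio processing.
      bitDepth = e.data.bitDepth;
    };

    onaudioprocess = function (e) {
      // Snapshot the mutable value once, so the whole block sees one value
      // even if a message lands while this block is being computed.
      var localBitDepth = bitDepth;
      var step = Math.pow(0.5, localBitDepth);
      var input = e.inputs[0][0];
      var output = e.outputs[0][0];
      for (var i = 0; i < input.length; i++)
        output[i] = step * Math.floor(input[i] / step + 0.5);
    };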
#7 - Related to #6, I noticed that in "2.11.3.1 A Bitcrusher Node" the example uses a phaser variable that is scoped to the AudioWorker; I assume this would then live on the heap. This is perhaps more of a general JavaScript question, but in block-based audio programming I normally see that, for a process() function, one copies any state variables of a node/ugen/etc. into local variables, runs the audio for-loop with those locals, then saves the state back for the next run. This is done for performance (better locality, stack vs. heap access, better compiler optimizations, etc.). I don't know much about JavaScript implementations; can anyone comment on whether these kinds of optimizations are effective in JS? If so, the example might benefit from being rewritten to give some guidance (i.e. phaser and lastDataValue are copied to local vars before the for-loop, and saved again after the for-loop, in onaudioprocess).
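In other words, something along these lines (the same placeholder caveats as in the previous sketch apply; the point is only where the state lives during the inner loop):

    var phaser = 0;
    var lastDataValue = 0;
    var frequencyReduction = 0.5;   // stand-in for the example's parameter

    onaudioprocess = function (e) {
      var input = e.inputs[0][0];    // placeholder channel access, as before
      var output = e.outputs[0][0];
      // Copy worker-global state into locals before the inner loop...
      var localPhaser = phaser;
      var localLastValue = lastDataValue;
      for (var i = 0; i < input.length; i++) {
        localPhaser += frequencyReduction;
        if (localPhaser >= 1.0) {
          localPhaser -= 1.0;
          localLastValue = input[i];   // sample-and-hold
        }
        output[i] = localLastValue;
      }
      // ...and write it back once per block, rather than touching the
      // worker-global variables on every sample.
      phaser = localPhaser;
      lastDataValue = localLastValue;
    };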
Thanks!
steven

Received on Monday, 2 February 2015 15:30:04 UTC