- From: Steven Yi <stevenyi@gmail.com>
- Date: Mon, 2 Feb 2015 12:30:21 -0500
- To: Joseph Berkovitz <joe@noteflight.com>
- Cc: public-audio@w3.org
Hi Joe,

I've added some points to that issue to include comments and references.

Thanks!
steven

On Mon, Feb 2, 2015 at 11:57 AM, Joseph Berkovitz <joe@noteflight.com> wrote:
> Thanks Steven.
>
> Could you please post your comment about jitter and the Brandt/Dannenberg reference to the issue thread? Those are valuable points.
>
> Best,
> …Joe
>
> On Feb 2, 2015, at 11:49 AM, Steven Yi <stevenyi@gmail.com> wrote:
>
> Hi Joe,
>
> Thank you for filing that issue. I have subscribed to it and will try to add some notes to it. Regarding #5, I should have noted that I think the "Two Clocks" design, with ahead-of-time scheduling of a partial set of events, is completely valid for a number of realtime use cases. (I believe it is the same as Brandt and Dannenberg's "Forward Synchronous Model", as discussed in [1].) However, I think the design works best when some expectations can be made about how bounded the jitter is, which with the JS Main thread seems very difficult.
>
> Thanks!
> steven
>
> [1] - "Time in Distributed Real-Time Systems", Eli Brandt and Roger B. Dannenberg. 1999. Available at: http://www.cs.cmu.edu/~rbd/papers/synchronous99/synchronous99.pdf
>
> On Mon, Feb 2, 2015 at 11:09 AM, Joseph Berkovitz <joe@noteflight.com> wrote:
>
> Hi Steven,
>
> Many points here worth responding to, but I will just reference your #5, since it was also raised by some other people at the WAC and I think it is an important issue for both realtime and offline audio rendering.
>
> Please see the newly filed https://github.com/WebAudio/web-audio-api/issues/473
>
> …Joe
>
> On Jan 30, 2015, at 7:57 PM, Steven Yi <stevenyi@gmail.com> wrote:
>
> Hello All,
>
> First, it was a great pleasure to be at the Web Audio conference. I enjoyed the sessions and gigs and getting to meet the other members of the community that I did. Cheers to IRCAM and Mozilla for the lovely conference!
>
> That said, I have some comments and questions about the Web Audio API and specification. (Note: these comments are in reference to the 06 January 2015 draft, found at http://webaudio.github.io/web-audio-api/.)
>
> #1 - The specification is not clear to me on when a node becomes live. I assume it is when a node is connected to the active part of the audio graph that is "live" and processing. Since node creation and graph assembly are done in the JS Main thread, it seems possible, given the following from "3.3 Example: Mixer with Send Busses", that nodes might get attached across buffers in the audio thread:
>
> compressor = context.createDynamicsCompressor();
>
> // Send1 effect
> reverb = context.createConvolver();
> // Convolver impulse response may be set here or later
>
> // Send2 effect
> delay = context.createDelay();
>
> // Connect final compressor to final destination
> compressor.connect(context.destination);
>
> // Connect sends 1 & 2 through effects to main mixer
> s1 = context.createGain();
> reverb.connect(s1);
> s1.connect(compressor);
>
> s2 = context.createGain();
> delay.connect(s2);
> s2.connect(compressor);
>
> For example, could it be the case that "s1.connect(compressor)" above happens just before buffer n starts to generate, and "s2.connect(compressor)" happens such that it starts when buffer n + 1 is generating?
>
> If this is the case, would connecting the compressor to the context.destination at the end of the example, rather than the beginning, guarantee that the graph of nodes connected to the compressor is started at the same time?
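>
> (To make the question concrete, here is the same example reordered the way I have in mind -- purely a sketch of the ordering, assuming the sub-graph stays inert until the final connect to context.destination:)
>
> compressor = context.createDynamicsCompressor();
>
> // Send1 effect
> reverb = context.createConvolver();
> // Convolver impulse response may be set here or later
>
> // Send2 effect
> delay = context.createDelay();
>
> // Connect sends 1 & 2 through effects to main mixer
> s1 = context.createGain();
> reverb.connect(s1);
> s1.connect(compressor);
>
> s2 = context.createGain();
> delay.connect(s2);
> s2.connect(compressor);
>
> // Only now attach the assembled sub-graph to the live graph
> compressor.connect(context.destination);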
>
> If so, then maybe this aspect of node graph creation could be clarified and the example in 3.3 updated (as above) so that the sub-graph of nodes is clearly formed before being attached to the active audio graph.
>
> #2 - Following from #1, what would happen if one is dynamically altering a graph to remove an intermediary node? For example, let's say one has a graph like:
>
> gain = context.createGain();
> compressor = context.createDynamicsCompressor();
> reverb = context.createConvolver();
> gain.connect(reverb);
> reverb.connect(compressor);
> compressor.connect(context.destination);
>
> and later the user decides to remove the reverb with something like:
>
> reverb.disconnect();
> // gain.disconnect();
> gain.connect(compressor);
>
> (Assuming the above uses a gain node as a stable node for other nodes to attach to.) My question is: when do connect and disconnect happen? Do they happen at block boundaries? I assume they must, or a graph could get into a bad state if it changes while a block is being processed.
>
> Also, without the gain.disconnect(), will there be a hidden reference to the reverb from gain? (I guess a "connection" reference according to 2.3.3.) If so, this seems like it could be a source of a memory leak (assuming that the above object references to reverb are all cleared from the JS Main thread side).
>
> #3 - In "2.3.2 Methods", for an AudioNode connecting to another audio node, it is not clear whether fan-out/fan-in is supported. The documentation for connecting to AudioParams explicitly states that this is supported. Should the first connect() method's documentation be clarified on this point when connecting to nodes?
>
> #4 - Also in regard to 2.3.2, the API of disconnect() seems odd, as it does not mirror connect(). connect() is given an argument of what node or AudioParam to connect to; disconnect(), however, does not have a target argument. It is not clear then what this disconnects from. For example, if I connect a node to two different nodes and also to another node's parameter, then call disconnect(), what happens? As it is now, it does not seem possible to create a GUI editor where one could connect the output of a node to multiple other nodes/params, then click and disconnect a single connection.
>
> #5 - In the music systems I've seen, event processing is done within the audio thread. This generally happens for each buffer, something like:
>
> 1. Process incoming messages
> 2. Process a priority queue of pending events
> 3. Handle audio input
> 4. Run processing graph for one block
> 5. Handle audio output
>
> I'm familiar with this from Csound's and SuperCollider's engines, as well as the design in my own software synthesizer Pink. (Chuck's design follows the same basic pattern above, but on a sample-by-sample basis.)
>
> As it is today, the Web Audio API does not have any kind of reified event object. One can schedule some things, like automations, via a param's setXXXatTime() methods and have them run within the time of the audio engine, but there is nothing built in for events in the Web Audio API.
>
> Now, I have no issue with the Web Audio API not having a concrete event system, and I think it should not have one, as people have different notions of and needs for what is encoded in an event. However, I think that there should be a way to create one's own event system, one that is clocked to the same audio system clock (i.e. run within the audio thread).
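>
> (As a rough sketch of the per-block loop I mean -- the names incomingMessages, pendingEvents and handleEvent are placeholders for illustration, not proposed API:)
>
> // Run once per audio block, on the audio thread, so that event time
> // and audio time advance together.
> function processBlock(blockStart, blockEnd) {
>   incomingMessages.drain();                        // 1. messages from other threads
>   while (pendingEvents.length > 0 &&
>          pendingEvents[0].time < blockEnd) {
>     handleEvent(pendingEvents.shift());            // 2. fire events due in this block
>   }
>   // 3-5. audio input, the node graph, and audio output are then
>   //      processed for this block.
> }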
>
> I was a bit concerned when, at the conference, there was mention of "A Tale of Two Clocks". The design of trying to reference two clocks cannot, by definition, allow for a queue of events to be processed synchronously with audio. If one formalizes event-processing functions and audio-processing functions as functions of time, by having two clocks you get two different time variables, ta and tb, which are not equivalent unless the clocks are proven to advance at exactly the same rate (i.e. ta0 == tb0, ta1 == tb1, ..., tan == tbn). However, the JS Main thread and the audio thread do not run at the same rate, so we can at best implement some kind of approximation, but it cannot be a formally correct solution.
>
> Event processing in a thread other than the audio thread has problems. One, mentioned at the conference, is what to do with offline rendering, where the clock of an audio engine runs faster than realtime and may advance faster or slower in terms of wall-clock time while rendering, depending on how heavy the processing needs of the graph are. Second, I seem to remember hearing a problem during one of the concerts: when I turned off my phone's screen, I continued to hear audio but all events stopped, then a number of events fired all at once when I turned the screen back on. The piece used an event scheduling system that ran in the JS Main thread. I assume this situation is similar to what could happen with backgrounded tabs, but I'm not quite sure about all this. Either way, I think there are real problems here that need to be addressed.
>
> This also leads to a bigger question: with Web Audio, if I run the same project twice, one that uses an event system to reify graph modifications in time (as events in audio engines are mostly used for, i.e. alloc this graph of nodes and add it to the live audio graph), will I get the same result? Assuming one uses only referentially transparent nodes (i.e. no random calculations), I believe the only way to guarantee this is if the event system is processed as part of the audio thread.
>
> Now, what can a user do with Web Audio to create their own event system that is in sync with the audio thread? Currently, there is the ScriptProcessorNode. Of course, the design of ScriptProcessorNode is deeply flawed for all the reasons discussed at the conference (security, inefficiency due to context switching, potential for breakups, etc.). However, what it does do is allow one to process events in sync with the audio thread, making it possible to build formally correct audio systems where event time is processed according to the same time as is used by the audio nodes. Additionally, according to those events, one can dynamically modify the graph (i.e. add new instances of a sub-graph of nodes to the live graph, representing a "note"), via references to other nodes and the audio context. So while flawed in terms of performance and security, it does allow one to build correct systems that generate consistent output.
>
> My concern is that there was discussion of not only deprecating ScriptProcessorNode, but removing it altogether. I would have no problem with this, except that from reading the current specification for AudioWorker, I do not see how it would be possible to create an event system with it. While one can pass messages to and from an AudioWorker, one has no access to the AudioContext. In that regard, one cannot, say, within an AudioWorker, create new nodes and attach them to context.destination.
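>
> (For contrast, here is roughly the kind of thing the ScriptProcessorNode route allows today, precisely because the context is in scope inside the callback. This is only a sketch; noteQueue and the oscillator "note" are placeholders for whatever a real event system would schedule:)
>
> var proc = context.createScriptProcessor(512, 1, 1);
> proc.onaudioprocess = function (e) {
>   // context.currentTime is the same clock the audio nodes run on.
>   var horizon = context.currentTime + 512 / context.sampleRate;
>   while (noteQueue.length > 0 && noteQueue[0].time < horizon) {
>     var note = noteQueue.shift();
>     var osc = context.createOscillator();  // alloc a sub-graph for the "note"...
>     osc.connect(context.destination);      // ...and attach it to the live graph
>     osc.start(note.time);
>     osc.stop(note.time + note.duration);
>   }
> };
> proc.connect(context.destination);         // keep the callback firing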
>
> I am not very familiar with transferables and what can be passed between an AudioWorker and the JS Main thread via postMessage, but I assume AudioNodes cannot be made transferable.
>
> At this point, I'm questioning what can be done. It seems AudioWorker's design is not meant for event processing (fair enough), and ScriptProcessorNode can only do this by accident and not by design. Is there any solution to this problem with the Web Audio API moving forward? For example, would this group be willing to consider extending the API for non-audio nodes (processing nodes?)? If processing nodes could be added that have a larger context than what is proposed for AudioWorkerGlobalScope -- say, access to the AudioContext, and the ability to modify the audio node graph dynamically -- I could see that as a solution that would allow building higher-level constructs like an event system.
>
> #6 - For the AudioWorker specification, I think it would be useful to have clarification on when postMessage is processed. In 2.11.1.2, it has a link to "the algorithm defined by the Worker Specification". That in turn mentions:
>
> "The postMessage() method on DedicatedWorkerGlobalScope objects must act as if, when invoked, it immediately invoked the method of the same name on the port, with the same arguments, and returned the same return value."
>
> If it is meant to be processed immediately, then this can cause problems if the AudioWorker is already part of a live graph and values mutate while the audio worker is processing a block. I think it would be good to have clarification on this, perhaps with a recommendation that, in onaudioprocess functions, one should make a local copy of any mutable value and use that for the duration of onaudioprocess to get a consistent result for the block.
>
> #7 - Related to #6, I noticed that in "2.11.3.1 A Bitcrusher Node", the example uses a phaser variable that is scoped to the AudioWorker. I assume this would then be on the heap. This is perhaps more of a general JS question, but I normally see in block-based audio programming that, for a process() function, one copies any state variables of a node/ugen/etc. to local variables, runs the audio for-loop with the local variables, then saves the state for the next run. This is done for performance (better locality, stack vs. heap access, better compiler optimizations, etc.). I don't know much about JavaScript implementations; can anyone comment on whether these kinds of optimizations are effective in JS? If so, the example might benefit from being rewritten to give some guidance (i.e. phaser and lastDataValue are copied to local vars before the for-loop, and saved again after the for-loop, in onaudioprocess).
>
> Thanks!
> steven
>
>
> . . . . . ...Joe
>
> Joe Berkovitz
> President
>
> Noteflight LLC
> Boston, Mass.
> phone: +1 978 314 6271
> www.noteflight.com
> "Your music, everywhere"
>
>
> . . . . . ...Joe
>
> Joe Berkovitz
> President
>
> Noteflight LLC
> Boston, Mass.
> phone: +1 978 314 6271
> www.noteflight.com
> "Your music, everywhere"
>
Received on Monday, 2 February 2015 17:30:51 UTC