- From: Chris Rogers <crogers@google.com>
- Date: Mon, 19 Jul 2010 13:39:38 -0700
- To: Yury Delendik <async.processingjs@yahoo.com>
- Cc: public-xg-audio@w3.org
- Message-ID: <AANLkTiktoVMeVnp2-5oe6oU4PEAHSOS5QJ+tifFEiydB@mail.gmail.com>
Hi Yury, Thanks for the questions - I appreciate your input.

On Fri, Jul 16, 2010 at 8:52 PM, Yury Delendik <async.processingjs@yahoo.com> wrote:

> Hello Chris,
>
> I'm trying to read and analyze the current proposal for the Web Audio API at the moment. The ideas expressed in the specification are straightforward and simple. The directed-graph presentation of the audio processing nodes makes it simple to visualize the signal flow.
>
> My feedback/questions:
>
> 1) It took some time to gather all the missing pieces of information from the examples, the SVN change log, and the public-xg-audio list. I had trouble understanding why the examples have AudioMixerNode when there is no such node in the specification – this node type was in previous versions. To make the learning experience better, can a change log section be included in the body of the proposal/specification?

Sorry about the confusion with AudioMixerNode. Jer Noble suggested that we switch to using AudioGainNode instead. The proposal/specification document was changed very quickly. Later, I implemented AudioGainNode (while still leaving the old API working, but deprecated). Finally, just a few days ago, I changed all of the javascript sample code to use AudioGainNode instead of AudioMixerNode/AudioMixerNodeInput.

My goal is to keep the demos/samples working at all times (with an up-to-date build of the WebKit audio branch). When changes happen, here is the order:

1) When the API changes due to discussions on this list, I'll update the specification as soon as possible.
2) Sometime later, I'll manage to implement the change while striving to keep the old implementation working (as a deprecated API).
3) Still later, I'll change the javascript in the samples to match the new implementation.

If possible, I'll try to execute step (3) immediately after (2), but sometimes there will be a time lag. I hope you'll appreciate the complexity I'm dealing with :) I'll try to put a change log section in the document as you suggest, to keep track of these changes a little better.

> 2) Since the primary subject of the specification is the AudioNode-based classes, it would be beneficial to see the possible values and details of their primary attributes, numberOfInputs and numberOfOutputs, e.g.
>
> AudioBufferSourceNode
> ==================
> numberOfInputs = 0
> numberOfOutputs = 1
> Output #0 - Audio with the same number of channels and sampleRate as specified in the AudioBuffer object

Good point. I'll try to add more detail in places such as this. I'll make a pass through the document today. Anytime you find some details which are missing, please let me know.

> 3) It looks like the RealtimeAnalyzerNode has special status: it does not output any audio data. What does it really output: does it pass the data through unchanged, only change the signal gain (somebody recommended adding a "gain" attribute to AudioNode), or have no outputs at all? Can the RealtimeAnalyzerNode be used without connecting it to the destination node?

This is a good question, and one which Ricard Marxer was also asking about. I was considering that analyser nodes would operate in a "pass-through" mode. In other words, one input and one output, with the input being passed unchanged to the output. I was anticipating that these nodes could be inserted anywhere in the signal chain to analyse, but would not otherwise interfere with the signal flow.

But, I can see why this might be confusing, and changing the analyser node to *only* have an input with no output might make more sense. In my original design the AudioDestinationNode was the only node which could be a "terminal" node in the graph, with everything being "pulled" by this node. But we could also allow analyser nodes to be "terminal" nodes (no outputs). I don't think there should be any technical issues preventing this in the implementation. I would be interested in hearing people's preference between the two approaches.

If an analyser has no output, then in the JavaScript processing case we would now be faced with three types of nodes:

1) JavaScriptSourceNode      0 inputs : 1 output
2) JavaScriptProcessorNode   N inputs : M outputs (could be 1 input : 1 output if we wish to keep it simple)
3) JavaScriptAnalyserNode    1 input  : 0 outputs

Although, maybe we can just have one JavaScriptAudioNode, and through different configuration (different constructor arguments?) it would end up being one of the above three. What do people think?
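To make the two routing options (and the single-JavaScriptAudioNode idea) concrete, here's a rough sketch of how they might look from javascript. None of the names below are settled; the factory methods and the constructor signature are just placeholders I'm using for illustration:

    var context = new AudioContext();
    var source = context.createBufferSource();   // placeholder factory for an AudioBufferSourceNode
    var analyser = context.createAnalyser();      // placeholder factory for a RealtimeAnalyzerNode

    // Option A: pass-through analyser (1 input : 1 output), sitting in the chain
    source.connect(analyser);
    analyser.connect(context.destination);

    // Option B: "terminal" analyser (1 input : 0 outputs); the signal reaches the
    // destination through a separate connection and the analyser just taps it:
    //   source.connect(context.destination);
    //   source.connect(analyser);

    // And if we go with a single JavaScriptAudioNode, the constructor arguments
    // could pick between the three shapes (purely hypothetical signature):
    //   var jsNode = new JavaScriptAudioNode(numberOfInputs, numberOfOutputs);

Either way the audio you hear is the same; the difference is only in the topology, i.e. whether the analyser sits in the middle of the chain or hangs off to the side as its own terminal node.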
> 4) According to section 16, it looks like the only object that can be used without a context is the AudioElementSourceNode that can be retrieved via the audioSource property. Is that correct?

Yes, it is the only object which can be retrieved without the context. But it must be connected to other nodes which belong to a specific context. In the simplest case, it would be connected to "context.destination".

> 5) If the audio element is playing streaming data, will the sound also be "played" in the connected audio context?

This is a question Ricard Marxer and I have been discussing. My inclination is to view the act of connecting the audioSource from the audio element as implicitly disconnecting it from its "default" destination. So, it would never be audible both in the normal default way and also audible from its processing in an explicitly constructed graph. Ricard has suggested that we would require a disconnect() call to make it inaudible in the normal/default playback path, but I'm not sure there would be any cases where it would be desirable to *not* disconnect(), and simply forgetting to call it would sound very confusing and not be the desired result.
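To illustrate the two possibilities for (4) and (5) in javascript (treat the exact property and method names here as illustrative, not final):

    var context = new AudioContext();
    var audioElement = document.getElementsByTagName('audio')[0];

    // The audioSource is obtained without going through the context (question 4),
    // but it still has to be connected to nodes belonging to a specific context:
    var source = audioElement.audioSource;
    source.connect(context.destination);

    // With the implicit-disconnect behaviour I describe above, the connect() call
    // by itself would silence the element's default playback path. With Ricard's
    // suggestion, an explicit call (name purely illustrative) would be needed:
    //   source.disconnect();

In both cases the intent is that you hear the sound exactly once, through the graph you constructed.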
> 6) How many AudioContext instances is it possible to run/instantiate on a single web page?

Ricard and I have also been discussing this. I think in the vast majority of cases a single AudioContext would be sufficient. Ricard brought up a case where there are two separate physical audio devices connected to the computer (one AudioContext for each one). But most sophisticated desktop audio software does not even support this scenario (especially when the devices are from different manufacturers or running at different sample-rates). Maybe we can leave it an open question for now, or suggest that a more advanced implementation might later support multiple AudioContexts, but a simpler one would only allow one.

> 7) JavaScript was chosen as the client-side scripting language to control objects that are implemented in high-performance languages (typically C/C++). One characteristic of JavaScript objects is that they contain members that help to discover metadata. I noticed that the AudioBuffer interface contains a "length" attribute that carries a different meaning from the usual JavaScript "length" property (which normally specifies the number of members in an object). It's recommended to select names that will not conflict with, or change the meaning of, the standard identifiers of the target scripting language.

I think Eric Carlson suggested "length". Other names I could suggest are:

sampleFrameLength
lengthInSampleFrames
numberOfSampleFrames
sampleFrameCount
frameLength
frameCount

Anybody have any other ideas for names? For those who are confused by the use of the term "sample-frame", one sample-frame (or one frame) represents one sample per channel. So if there are N channels, then the number of samples per sample-frame is N. In other words, the sample-frame is the grouping of all samples across the N channels.

> 8) Some of the class definitions are missing from the specification and would really help in understanding it: AudioSourceNode, AudioListenerNode, AudioSource, AudioBufferSource, AudioCallbackSource, etc.

Thanks. I'll try to fill these in.

> 9) The Modular Routing section states that "the developer doesn't have to worry about low-level stream format details when two objects are connected together; the right thing just happens. For example, if a mono audio stream is connected to a stereo input it should just mix to left and right channels appropriately." There are lots of ways/algorithms to change the number of channels, the sample rate, etc. I think the web developer should know what they will receive as a result: only the left channel, or a mix of all channels from a 5.1 source stream. Could you document how "the right thing" will happen?

I'll add a section about this. It basically boils down to different ways of up-mixing and down-mixing different channel layouts (mono -> stereo, etc.)

> 10) How is the sampleRate attribute value defined/chosen for the non-source nodes, e.g. AudioDestinationNode or AudioGainNode? And in the case when multiple outputs are mixed into one input?

As I was discussing with Ricard, I would suggest that the sample-rate is constant for all nodes in a given AudioContext, and that the sample-rate is an attribute on AudioContext. The document hasn't yet been updated to this, but I will if nobody objects. So, in this case there's no problem because we're mixing things all at the same sample-rate.
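To give a flavour of the kind of thing the up-mixing/down-mixing section for (9) might describe (purely illustrative, not something the draft specifies yet): a mono -> stereo up-mix could simply copy the single channel to both outputs, and a stereo -> mono down-mix could average them:

    // Illustrative only -- the channel arguments are plain arrays (or Float32Arrays)
    // of samples, all running at the context's single sample-rate.
    function monoToStereo(mono, left, right) {
        for (var i = 0; i < mono.length; i++) {
            left[i] = mono[i];    // copy the single channel to both outputs
            right[i] = mono[i];
        }
    }

    function stereoToMono(left, right, mono) {
        for (var i = 0; i < mono.length; i++) {
            mono[i] = 0.5 * (left[i] + right[i]);    // simple average down-mix
        }
    }

Whatever exact rules we settle on, the new section should spell them out so the developer knows what "the right thing" means for each channel-layout combination.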
> Thank you,
> Yury Delendik

Thanks Yury,
Chris

Received on Monday, 19 July 2010 20:40:08 UTC