Re: TPAC F2F and Spec Proposals (was: Attendance for the AudioWG F2F meeting on Monday, 31 October) from Robert O'Callahan on 2011-10-17 (public-audio@w3.org from October to December 2011)

From: Robert O'Callahan <robert@ocallahan.org>
Date: Tue, 18 Oct 2011 12:01:13 +1300
To: Joseph Berkovitz <joe@noteflight.com>
Cc: Alistair MacDonald <al@signedon.com>, Doug Schepers <schepers@w3.org>, tmichel@w3.org, Philippe Le Hegaret <plh@w3.org>, public-audio@w3.org, mgregan@mozilla.com
Message-ID: <CAOp6jLb30xWRX-87kOUjfwn1L5uOkRf46OP1JJPbvyGj8OXDog@mail.gmail.com>

These are great questions.

On Tue, Oct 18, 2011 at 3:52 AM, Joseph Berkovitz <joe@noteflight.com> wrote:

> 1. Are MediaStreams capable of supporting, as Chris put it, "large numbers
> of short, overlapping sounds which have extremely stringent requirements in
> terms of timing, latency, mixing, and performance."  This is not a question
> about whether the abstractions can be identified; it seems likely to me that
> they can be, in an API-completeness sense. But this is a question of
> whether *a concrete implementation* will have a tough time supporting the
> above requirement, given its need to also support the stream-specific
> responsibilities of MediaStreams.
>
> I'd love to have a completely built-out code sample but I think I can make
> the point with something smaller.  A typical music soft-synth might call the
> following sort of code block (adapted from ROC's example #11) in quick
> succession many times on the same effectsMixer object, resulting in, say,
> 100s of potentially overlapping inputs waiting for their turn to be
> scheduled:
>
>   function triggerSound(audio, offset) {
>     var stream = audio.captureStream();
>     audio.play();
>     var port = effectsMixer.addInput(stream, offset);
>     stream.onended = function() { port.remove(); }
>   }
>
> Furthermore it is a requirement that the same Audio object can be played
> simultaneously through the mixer with different playbackRates, amplitudes
> and mixdown parameters -- this is how a typical instrumental wavetable synth
> works. Will the approach of piping Audio objects through a mixer stream play
> nice with that requirement? Does captureStream() always return the same
> object for a given Audio being captured? If so, that might be a problem.
>

To play the same media element multiple times, you'll have to clone it. This
isn't much code --- var e = element.clone(); var stream = e.captureStream();
e.play(); --- and we can make it less code. Browsers would need to make sure
that's efficient. (Making element.clone().play() efficient is a good idea
anyway, since it's the simplest API for applications that simply want to
play preloaded sounds in response to events.)
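
As a rough sketch, the quoted triggerSound() could be adapted to clone per trigger like this. The helper name and the explicit effectsMixer argument are only illustrative; clone(), captureStream() and addInput() are the names used in this thread and are assumed here, not taken from any shipping API.

```javascript
// Sketch only: clone(), captureStream() and addInput() are the names used in
// this discussion; they are assumed, not taken from a shipping browser API.
function triggerClonedSound(audio, effectsMixer, offset) {
  // Each trigger gets its own element, so overlapping plays of the same
  // source do not share playback state.
  var e = audio.clone();
  var stream = e.captureStream();
  e.play();
  var port = effectsMixer.addInput(stream, offset);
  // Detach this voice from the mixer once its stream has finished.
  stream.onended = function () { port.remove(); };
  return port;
}
```

Because every call works on a fresh clone, per-voice settings such as playbackRate could be set on e before play() without affecting other voices.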

If necessary, something similar to AudioBufferNode could be added to
MediaStreams quite easily --- new DataAudioStream(channels, rate,
array/blob) or something like that.

> 2. What concrete features of the Web Audio API, if any, support significant
> use cases that the MediaStream proposal does not?  (This is a question about
> broad concepts, not about, say, whether some particular effect is available
> or not.)  I'll throw out a couple of points that I think might qualify:
> - AudioParams provide for smooth automation of arbitrary time-varying
> parameters, rather than having a new value be set in a step function when a
> stream reaches a stable state. The ability to supply linear or exponential
> ramps for such parameters is an essential facet of any soft-synth.
>

In ProcessedMediaStreams you can attach arbitrary parameter objects to
streams and MediaInputs. Those objects can contain data representing the
parameters of time-varying functions. The base time at which a parameter
object became current is available to processing Workers (or native effects)
via 'paramsStartTime' to simplify use of these objects. This approach gives
unlimited flexibility to JS processing nodes to define the structure of
their parameters, the types of the parameters, and the interpolation
functions. For example, a panner processing node could accept parameters on
its MediaInputs of the form
  { x: { interpolate:"linear", from: 100, to: 200, t:10 },
    y: { interpolate:"exponential", from: 100, to: 200, t:5 },
    z: 50 }
or
  { interpolate:"linear",
    from: { x:100, y:100, z:50 },
    to: { x:200, y:200, z:50 },
    t: 5 }
or both, or something else entirely.
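
To make the first form concrete, here is a minimal sketch of how a processing node might evaluate such an object. The function names are hypothetical, and it assumes 't' is the ramp duration in the same units as the stream time, measured from paramsStartTime; neither is fixed by anything above.

```javascript
// Sketch only: the { interpolate, from, to, t } shape comes from the examples
// above; evalParam/evalPosition are hypothetical names, and treating t as a
// ramp duration measured from paramsStartTime is an assumption.
function evalParam(param, now, paramsStartTime) {
  // Plain numbers (like z: 50) are constants.
  if (typeof param === "number") return param;
  // Fraction of the ramp completed, clamped to [0, 1].
  var elapsed = now - paramsStartTime;
  var f = param.t > 0 ? Math.min(1, Math.max(0, elapsed / param.t)) : 1;
  if (param.interpolate === "exponential") {
    // Geometric ramp; needs from and to to be nonzero with the same sign.
    return param.from * Math.pow(param.to / param.from, f);
  }
  // Otherwise fall back to a linear ramp.
  return param.from + (param.to - param.from) * f;
}

function evalPosition(params, now, paramsStartTime) {
  return {
    x: evalParam(params.x, now, paramsStartTime),
    y: evalParam(params.y, now, paramsStartTime),
    z: evalParam(params.z, now, paramsStartTime)
  };
}
```

A Worker would call something like evalPosition() once per block (or per sample) with the current stream time, so the ramp is sampled continuously rather than stepping when a new value is set.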

Something like AudioParams could be added to MediaStreams if necessary ...
but I'd want to make it more generic than "AudioParams" since they'd be just
as useful for controlling video parameters.

> - AudioBuffers allow the same sample data to be shared between more
> than one AudioBufferPlaybackNode.
>

clone()ing a media element can be implemented to share sample buffers. We
will experiment to see how much sharing is needed.

> - the noteOff() function allows the end of a stream to be pegged to a time
> offset, not just the start of it
>

MediaInput.remove() takes a time parameter to achieve that.


> - there is a notion of global time across the AudioNode graph. In the
> MediaStream case, currentTime gives the time since a specific stream was
> created, which is not as useful (I suspect there's a way to address this
> need that I'm just not seeing).
>

currentTime is the amount of data played on that stream. I think a global
time isn't so useful if some streams can block while others play. By
default, times for connected streams advance in lockstep so for most uses I
can think of, one stream's time is as good as another's. But I'm interested
in hearing more about the needs here.

Rob
-- 
"If we claim to be without sin, we deceive ourselves and the truth is not in
us. If we confess our sins, he is faithful and just and will forgive us our
sins and purify us from all unrighteousness. If we claim we have not sinned,
we make him out to be a liar and his word is not in us." [1 John 1:8-10]

Received on Monday, 17 October 2011 23:01:42 UTC