Re: Requirements for Web audio APIs

On Sun, May 22, 2011 at 6:11 PM, Robert O'Callahan wrote:

> On Mon, May 23, 2011 at 12:11 PM, Chris Rogers wrote:
>> On Sun, May 22, 2011 at 2:39 PM, Robert O'Callahan wrote:
>>> On Sat, May 21, 2011 at 7:17 AM, Chris Rogers wrote:
>>>> On Thu, May 19, 2011 at 2:58 AM, Robert O'Callahan wrote:
>>>>> My concern is that having multiple abstractions representing streams of
>>>>> media data --- AudioNodes and Streams --- would be redundant.
>>>> Agreed, there's a need to look at this carefully.  It might be workable
>>>> if there were appropriate ways to easily use them together even if they
>>>> remain separate types of objects.  In graphics, for example, there are
>>>> different objects such as Image, ImageData, and WebGL textures which have
>>>> different relationships with each other.  I don't know what the right answer
>>>> is, but there are probably various reasonable ways to approach the problem.
>>> There are reasons why we need to have different kinds of image objects.
>>> For example, a WebGL texture has to live in VRAM so couldn't have its pixel
>>> data manipulated by JS the way an ImageData object can. Are there
>>> fundamental reasons why AudioNodes and Streams have to be different ... why
>>> we couldn't express the functionality of AudioNodes using Streams?
>> I didn't say they *have* to be different.  I'm just saying that there
>> might be reasonable ways to have AudioNodes and Streams work together. I
>> could also turn the question around and ask whether we could express the
>> functionality of Streams using AudioNodes.
> Indeed! One answer to that would be that Streams contain video so
> "AudioNode" isn't a great name for them :-).
> If they don't have to be different, then they should be unified into a
> single abstraction. Otherwise APIs that work on media streams would have to
> come in an AudioNode version and a Stream version, or authors would have to
> create explicit bridges.

For connecting an audio source from an HTMLMediaElement into an audio
processing graph using the Web Audio API, I've suggested adding an
.audioSource attribute.  A code example with a diagram is here in my proposal:

I'm fairly confident that this type of approach will work well for
HTMLMediaElement.  Basically, it's a "has-a" design instead of an "is-a"
design.
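
As a rough sketch, the wiring might look like the following.  The
.audioSource attribute is the one from my proposal; the rest uses current
draft names and is illustrative rather than final:

    // Sketch only: assumes HTMLMediaElement gains an .audioSource
    // attribute exposing the element's audio as a source node in the graph.
    var context = new AudioContext();
    var mediaElement = document.getElementById('mediaElementID');

    var filterNode = context.createBiquadFilter();   // any processing node
    mediaElement.audioSource.connect(filterNode);    // proposed attribute
    filterNode.connect(context.destination);

    mediaElement.play();   // audio flows: element -> filter -> speakers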

Similarly, for Streams I think the same type of approach could be
considered.  I haven't looked very closely at the proposed media stream API
yet, but would like to explore that in more detail.  If we adopt the "has-a"
(instead of "is-a") design then the problem of AudioNode not being a good
name for Stream disappears.
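
Purely as a hypothetical sketch (the stream object and its .audioSource
attribute below are assumptions for illustration, since I haven't studied
the proposed stream API closely yet), the same pattern might read:

    // Hypothetical: a Stream that "has" an audio source, mirroring the
    // HTMLMediaElement.audioSource idea rather than "being" an AudioNode.
    var context = new AudioContext();
    var stream = /* obtained from whatever API produces Streams */;

    var gainNode = context.createGainNode();   // 2011 draft spelling
    stream.audioSource.connect(gainNode);      // hypothetical attribute
    gainNode.connect(context.destination);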

>>>>> That sounds good, but I was thinking of other sorts of problems.
>>>>> Consider for example the use-case of a <video> movie with a regular audio
>>>>> track, and an auxiliary <audio> element referencing a commentary track,
>>>>> where we apply an audio ducking effect to overlay the commentary over the
>>>>> regular audio. How would you combine audio from both streams and keep
>>>>> everything in sync (including the video), especially in the face of issues
>>>>> such as one of the streams temporarily pausing to buffer due to a network
>>>>> glitch?
>>>> In general this sounds like a very difficult problem to solve, because
>>>> if you had two <video> streams playing together, either one of them could
>>>> pause momentarily due to buffer underrun, so each one would have to
>>>> adjust to the other.  Then if you had more than two, any of them could
>>>> require adjustments in all of the others.
>>> That's right. I agree it's hard, but I think we need to solve it, or at
>>> least have a plausible plan to solve it. This is not a far-fetched use-case.
>> Sure, I don't disagree that it might be useful.  I'm just suggesting that
>> solving this problem is something that can be done at the HTMLMediaElement
>> streaming level.  Once these media elements are synced up (assuming somebody
>> specs that out and implements it for HTMLMediaElement), these elements
>> are ready to be inserted into a processing graph using the Web Audio API.
> That might work, I'm not sure yet. But it would require the author to
> figure out the synchronization requirements of the audio graph and restate
> those requirements to the media elements.

If there's an explicit API created for HTMLMediaElement allowing for
synchronization between multiple elements as you describe, then the author
would have to use this API on <audio> and <video> elements to set up the
synchronization.  But I think that could be kept separate from the Web Audio
API / audio-graph aspects.  The audio graph latency compensation (if any)
could be handled "under the hood" without the author's intervention.
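
To make the earlier ducking use-case concrete under that split: with
element-level sync handled elsewhere, the graph side stays simple.  Another
rough sketch, again assuming the proposed .audioSource attribute:

    // Sketch: duck a movie's audio under a commentary track.  Element
    // synchronization is assumed to happen at the HTMLMediaElement level;
    // the graph only handles mixing and gain.
    var context = new AudioContext();
    var movie = document.getElementById('movie');            // <video>
    var commentary = document.getElementById('commentary');  // <audio>

    var movieGain = context.createGainNode();
    movie.audioSource.connect(movieGain);        // proposed attribute
    movieGain.connect(context.destination);
    commentary.audioSource.connect(context.destination);

    // Duck the movie whenever the commentary is playing.
    commentary.addEventListener('playing', function() {
      movieGain.gain.value = 0.3;   // attenuate main mix
    }, false);
    commentary.addEventListener('pause', function() {
      movieGain.gain.value = 1.0;   // restore
    }, false);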


Received on Monday, 23 May 2011 01:55:57 UTC