Re: Requirements for Web audio APIs

On Sun, May 22, 2011 at 2:39 PM, Robert O'Callahan <> wrote:

> On Sat, May 21, 2011 at 7:17 AM, Chris Rogers <> wrote:
>> On Thu, May 19, 2011 at 2:58 AM, Robert O'Callahan <> wrote:
>>> My concern is that having multiple abstractions representing streams of
>>> media data --- AudioNodes and Streams --- would be redundant.
>> Agreed, there's a need to look at this carefully.  It might be workable if
>> there were appropriate ways to easily use them together even if they remain
>> separate types of objects.  In graphics, for example, there are different
>> objects such as Image, ImageData, and WebGL textures which have different
>> relationships with each other.  I don't know what the right answer is, but
>> there are probably various reasonable ways to approach the problem.
> There are reasons why we need to have different kinds of image objects. For
> example, a WebGL texture has to live in VRAM so couldn't have its pixel data
> manipulated by JS the way an ImageData object can. Are there fundamental
> reasons why AudioNodes and Streams have to be different ... why we couldn't
> express the functionality of AudioNodes using Streams?

I didn't say they *have* to be different.  I'm just saying that there might
be reasonable ways to have AudioNodes and Streams work together. I could
also turn the question around and ask whether we could express the
functionality of Streams using AudioNodes.

>>> That sounds good, but I was thinking of other sorts of problems. Consider
>>> for example the use-case of a <video> movie with a regular audio track, and
>>> an auxiliary <audio> element referencing a commentary track, where we apply
>>> an audio ducking effect to overlay the commentary over the regular audio.
>>> How would you combine audio from both streams and keep everything in sync
>>> (including the video), especially in the face of issues such as one of the
>>> streams temporarily pausing to buffer due to a network glitch?
>> In general this sounds like a very difficult problem to solve, because if
>> you had two <video> streams playing together, either one of them could
>> pause momentarily due to buffer underrun, so each one would have to adjust
>> to the other.  And if you had more than two, any one of them could require
>> adjustments in all of the others.
> That's right. I agree it's hard, but I think we need to solve it, or at
> least have a plausible plan to solve it. This is not a far-fetched use-case.

Sure, I don't disagree that it might be useful.  I'm just suggesting that
this problem can be solved at the HTMLMediaElement streaming level.  Once
these media elements are synced up (assuming somebody specs that out and
implements it for HTMLMediaElement), they are ready to be inserted into a
processing graph using the Web Audio API.
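To make the ducking use-case concrete, here is a minimal sketch of just the
gain math, independent of any particular API (the function names and the
threshold/reduction values are illustrative, not from any spec draft): the
regular track is attenuated whenever the commentary's level crosses a
threshold, and the two are then summed.

```javascript
// Hypothetical ducking sketch -- names and constants are illustrative.
// When the commentary's level exceeds `threshold`, the main track's
// gain drops to `reduction`; otherwise it passes through at unity.
function duckGain(commentaryLevel, threshold, reduction) {
  // commentaryLevel: linear amplitude of the commentary, in [0, 1]
  // threshold: level above which ducking engages
  // reduction: gain applied to the main track while ducking
  return commentaryLevel > threshold ? reduction : 1.0;
}

// Per-sample mix of the main track and the commentary overlay.
function mixSample(mainSample, commentarySample, commentaryLevel) {
  return mainSample * duckGain(commentaryLevel, 0.05, 0.25)
       + commentarySample;
}
```

In a Web Audio graph this per-sample logic would live inside a node (e.g. a
gain node driven by the commentary's envelope) fed by the two media
elements, which is exactly why their clocks need to be synced first: the
ducking decision at each moment depends on both streams being at the same
point in time.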


Received on Monday, 23 May 2011 00:12:00 UTC