Re: Web Audio API spec review from Philip Jägenstedt on 2012-05-16 (public-audio@w3.org from April to June 2012)

From: Philip Jägenstedt <philipj@opera.com>
Date: Wed, 16 May 2012 16:55:20 +0200
To: "Chris Rogers" <crogers@google.com>
Cc: public-audio@w3.org
Message-ID: <op.weeo2iepsr6mfa@kirk>

On Tue, 15 May 2012 19:59:07 +0200, Chris Rogers <crogers@google.com>  
wrote:

> On Tue, May 15, 2012 at 4:45 AM, Philip Jägenstedt
> <philipj@opera.com> wrote:

>> There are a few aspects that make the Web Audio API fit poorly with the
>> rest of the Web platform. For example, the integration with
>> HTMLMediaElement is one-way; the audio stream of a <video> can be passed
>> into AudioContext but the result cannot leave AudioContext or play in sync
>> with the video channel. That an AudioContext cannot be paused means that
>> certain filtering effects on any stallable input (<audio>, MediaStream)
>> cannot be implemented, echo or reverb being the most obvious examples.
>>
>
> I don't believe there are any fundamental serious limitations here.  For
> example, today it's possible to pause an <audio> element and have the
> reverb tail continue to play, to fade-out slowly/quickly, or stop right
> away.  We can discuss in more detail if you have some very specific use
> cases.

The missing option is to simply play the echo when the audio element  
continues playing, as would be the case for a pre-mixed audio track with  
echo in it.

Let's take the case of audio descriptions. A WebVTT file contains timed
text to be synthesized at a particular point in time and mixed with the
main audio track. Assume that the speech audio buffer is available, having
been pre-generated either on a server or using a JavaScript speech synth
engine. That audio buffer must be mixed in at a particular time, and
slightly before and after that the main audio must be "ducked", i.e. the
volume should be ramped down and eventually back up again.
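
To make the ducking concrete, here is a minimal sketch of the gain points
one cue would need. The names (Cue, GainPoint, duckingPoints) and the
default ramp length and ducked level are assumptions for illustration, not
anything from the spec, and the times are still in media time:

```typescript
// Hypothetical types for illustration; not part of the Web Audio API.
interface Cue {
  start: number; // media time (seconds) at which the description starts
  end: number;   // media time (seconds) at which the description ends
}

interface GainPoint {
  time: number; // media time in seconds
  gain: number; // linear gain for the main audio track
}

// Returns the four points of a "duck": full volume, ramp down to
// duckedGain by cue.start, hold until cue.end, ramp back up afterwards.
function duckingPoints(
  cue: Cue,
  rampSeconds: number = 0.5, // assumed ramp length
  duckedGain: number = 0.2   // assumed ducked level
): GainPoint[] {
  return [
    { time: Math.max(0, cue.start - rampSeconds), gain: 1 },
    { time: cue.start, gain: duckedGain },
    { time: cue.end, gain: duckedGain },
    { time: cue.end + rampSeconds, gain: 1 },
  ];
}

// With a gain node, the first point would go to gain.setValueAtTime() and
// the rest to gain.linearRampToValueAtTime(), but only after each time has
// been converted from media time to AudioContext time.
```

The points themselves are easy to compute; the conversion to AudioContext
time is where the difficulty lies.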

AFAICT the timestamps from the media resource are lost as soon as the  
audio enters the Web Audio API, so the only way to know where to apply the  
ramp is by polling video.currentTime. If the media element pauses or  
stalls you have to take great care to reschedule those ramps once it
starts playing again. (Failing to handle this edge case will result in a
poor experience when pausing and unpausing.)
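
As a rough sketch of that bookkeeping, assuming the page samples
video.currentTime and the context's currentTime together each time the
element starts playing (ClockSnapshot, TimedGain, mediaToContextTime and
reschedule are hypothetical names, not API):

```typescript
// Hypothetical snapshot, taken in a "playing" event handler by reading
// video.currentTime and context.currentTime as close together as possible.
interface ClockSnapshot {
  mediaTime: number;    // video.currentTime at the snapshot
  contextTime: number;  // context.currentTime at the snapshot
  playbackRate: number; // video.playbackRate, assumed constant and > 0
}

interface TimedGain {
  time: number; // seconds (media time on input, context time on output)
  gain: number;
}

// Maps a media timestamp onto the AudioContext timeline, assuming playback
// continues uninterrupted from the snapshot.
function mediaToContextTime(mediaTime: number, snap: ClockSnapshot): number {
  return snap.contextTime + (mediaTime - snap.mediaTime) / snap.playbackRate;
}

// Converts the points still ahead of the playhead to context time. Points
// already in the past are dropped; a ramp that is in progress would need
// its current value interpolated, which this sketch does not attempt.
function reschedule(points: TimedGain[], snap: ClockSnapshot): TimedGain[] {
  return points
    .filter((p) => p.time >= snap.mediaTime)
    .map((p) => ({ time: mediaToContextTime(p.time, snap), gain: p.gain }));
}

// On every pause or stall the page would have to call
// gain.cancelScheduledValues(0) and then repeat this on the next "playing".
```

Even this much depends on the two clocks being sampled at the same
instant, which polling cannot guarantee.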

Finally, if the audio processing pipeline adds any delay to the
signal, there's no way to get it back in sync with the video.

Related issues:

https://www.w3.org/2011/audio/track/issues/55
https://www.w3.org/2011/audio/track/issues/56

I don't believe there's an issue for the fact that it's not possible to  
take the mixed audio and send it over WebRTC, but that wasn't exactly the  
issue at hand.

-- 
Philip Jägenstedt
Core Developer
Opera Software

Received on Wednesday, 16 May 2012 14:55:47 UTC