
Re: Web Audio API spec review

From: Chris Rogers <crogers@google.com>
Date: Wed, 16 May 2012 12:41:19 -0700
Message-ID: <CA+EzO0kY1fb5eLeZ=rDeBUzuzLpn=p8XrSEZj4TPYrxmKT50gg@mail.gmail.com>
To: Philip Jägenstedt <philipj@opera.com>
Cc: public-audio@w3.org
On Wed, May 16, 2012 at 7:55 AM, Philip Jägenstedt <philipj@opera.com> wrote:

> On Tue, 15 May 2012 19:59:07 +0200, Chris Rogers <crogers@google.com>
> wrote:
>
>> On Tue, May 15, 2012 at 4:45 AM, Philip Jägenstedt <philipj@opera.com>
>> wrote:
>>
>
>>> There are a few aspects that make the Web Audio API fit poorly with the
>>> rest of the Web platform. For example, the integration with
>>> HTMLMediaElement is one-way; the audio stream of a <video> can be passed
>>> into AudioContext but the result cannot leave AudioContext or play in
>>> sync
>>> with the video channel. That an AudioContext cannot be paused means that
>>> certain filtering effects on any stallable input (<audio>, MediaStream)
>>> cannot be implemented, echo or reverb being the most obvious examples.
>>>
>>>
>> I don't believe there are any fundamental limitations here.  For
>> example, today it's possible to pause an <audio> element and have the
>> reverb tail continue to play, fade out slowly or quickly, or stop right
>> away.  We can discuss in more detail if you have some very specific use
>> cases.
>>
>
> The missing option is to simply play the echo when the audio element
> continues playing, as would be the case for a pre-mixed audio track with
> echo in it.
>

I don't understand exactly what you mean here.  It would certainly be
possible to continue processing with a reverb effect if the audio element
resumed playing from a paused state.
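To make that concrete, here is a minimal, hedged sketch using the draft's createMediaElementSource() and ConvolverNode (connectReverb is an illustrative name, not from the spec, and the context is passed in as a parameter purely so the routing can be exercised outside a browser):

```javascript
// Sketch: route an <audio> element through a reverb. Pausing the element
// stops new input, but the ConvolverNode keeps producing its tail;
// resuming playback simply feeds fresh samples into the same graph.
function connectReverb(ctx, mediaElement, impulseResponse) {
  const source = ctx.createMediaElementSource(mediaElement);
  const convolver = ctx.createConvolver();
  convolver.buffer = impulseResponse; // AudioBuffer holding the room impulse
  source.connect(convolver);
  convolver.connect(ctx.destination);
  return { source, convolver };
}
```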


>
> Let's take the case of audio descriptions. A WebVTT file contains timed
> text to be synthesized at a particular point in time and mixed with the
> main audio track. Assume the speech audio buffer is available; it has
> been pre-generated either on a server or using a JavaScript speech synth
> engine. That audio buffer must be mixed at a particular time and slightly
> before and after that the main audio must be "ducked", i.e. the volume
> should be ramped down and eventually back up again.


> AFAICT the timestamps from the media resource are lost as soon as the
> audio enters the Web Audio API, so the only way to know where to apply the
> ramp is by polling video.currentTime. If the media element pauses or stalls
> you have to take great care to reschedule those ramps once it starts
> playing again. (Failure to realize this edge case will result in a poor
> experience when pausing and unpausing.)
>
> Finally, if the audio processing pipeline adds any delay to the
> signal, there's no way to get it back in sync with the video.
>

We've already had extensive and very detailed discussions about latency and
synchronization, including issue 56 below.  Timed text events could
certainly be used to apply appropriate audio processing:
http://lists.w3.org/Archives/Public/public-audio/2012AprJun/0066.html
http://lists.w3.org/Archives/Public/public-audio/2012AprJun/0084.html
http://lists.w3.org/Archives/Public/public-audio/2012JanMar/0475.html
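As one way to make the cue-driven approach concrete, here is a hedged sketch: a small helper (duckingSchedule is an illustrative name, not from any spec) that turns a text cue's timing into the gain automation points for ducking the main track:

```javascript
// Compute the gain automation points for "ducking" the main audio track
// around a spoken description cue. Times are seconds on the media
// timeline; pad is how far before/after the cue the ramp runs.
function duckingSchedule(cueStart, cueEnd, pad = 0.5, duckedGain = 0.25) {
  return [
    { time: cueStart - pad, gain: 1.0 },        // begin ramping down
    { time: cueStart,       gain: duckedGain }, // fully ducked at cue start
    { time: cueEnd,         gain: duckedGain }, // hold through the cue
    { time: cueEnd + pad,   gain: 1.0 },        // ramp back up
  ];
}
// Each point would be applied with gainNode.gain.linearRampToValueAtTime();
// if the element stalls, cancelScheduledValues() and re-apply on "playing".
```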


>
> Related issues:
>
> https://www.w3.org/2011/audio/track/issues/55
> https://www.w3.org/2011/audio/track/issues/56


I'm happy that Opera is examining the specification and has raised many
issues/questions about it.  I've had a chance to look through many of them,
at least in passing, and agree with most; addressing them effectively
amounts to providing more detail in the spec.  This is something I'm happy
to do, but it will take time.


>
>
> I don't believe there's an issue for the fact that it's not possible to
> take the mixed audio and send it over WebRTC, but that wasn't exactly the
> issue at hand.


Philip, I'm sorry that you were not present at the last W3C audio
teleconference.  At that time we agreed to add the proposed
createMediaStreamSource() and createMediaStreamDestination() methods to
the editor's draft, as described in this side document, with more detail
to be added:
https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/webrtc-integration.html

We're actually working on a concrete implementation in Chrome right now to
test how well these methods work in practice.  I'm optimistic about this
approach, but we're still at the stage of getting prototypes running.

Best Regards,
Chris


>
>
> --
> Philip Jägenstedt
> Core Developer
> Opera Software
>
Received on Wednesday, 16 May 2012 19:41:50 GMT
