Requirements for Web audio APIs

There are a few important requirements for a Web audio API that aren't
satisfied by the current Mozilla and Chrome proposals. Some of these
requirements have arisen very recently.

1) Integrate with media capture and peer-to-peer streaming APIs
There's a lot of energy right now around APIs and protocols for real-time
communication in Web browsers, in particular proposed WHATWG APIs for media
capture and peer-to-peer streaming:
http://www.whatwg.org/specs/web-apps/current-work/complete/video-conferencing-and-peer-to-peer-communication.html
Ian Hickson's proposed API creates a "Stream" abstraction representing a
stream of audio and video data. Many use-cases require integration of media
capture and/or peer-to-peer streaming with audio effects processing.

2) Need to handle streams containing synchronized audio and video
Many use-cases require effects to be applied to an audio stream that is
then played back in sync with a video track. Effects can add latency to the
audio, which can require the video to be delayed, so we need a framework
that handles both
audio and video. Also, the WHATWG Stream abstraction contains video as well
as audio, so integrating with it will mean pulling in video.
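
As a rough illustration of why the video may need delaying: if an effect
chain adds latency to the audio, each video frame's presentation time has to
be pushed back by the same amount. The TypeScript sketch below assumes each
effect's latency is known in samples. The names EffectStage,
chainLatencySeconds and delayVideoFrames are hypothetical and do not come
from any of the proposals mentioned here.

```typescript
// Hypothetical sketch: none of these names come from an existing API.

// One stage in an audio effect chain, with the latency it adds in samples.
interface EffectStage {
  label: string;
  latencySamples: number;
}

// Total latency of the chain in seconds at the given sample rate.
function chainLatencySeconds(stages: EffectStage[], sampleRate: number): number {
  const totalSamples = stages.reduce((sum, s) => sum + s.latencySamples, 0);
  return totalSamples / sampleRate;
}

// Shift every video frame's presentation time (in seconds) by the audio
// latency so the processed audio and the video stay aligned.
function delayVideoFrames(frameTimes: number[], delaySeconds: number): number[] {
  return frameTimes.map((t) => t + delaySeconds);
}

// Example values are made up for illustration.
const effectChain: EffectStage[] = [
  { label: "reverb", latencySamples: 2048 },
  { label: "pitch-shift", latencySamples: 4096 },
];
const videoDelay = chainLatencySeconds(effectChain, 48000);
const shiftedFrames = delayVideoFrames([0, 0.04, 0.08], videoDelay);
```

A real framework would presumably track this latency itself rather than
leave the arithmetic to authors, which is part of the argument for handling
audio and video together.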

3) Need to handle synchronization of streams from multiple sources
There's ongoing work to define APIs for playing multiple media resources
in sync, including a WHATWG proposal:
http://www.whatwg.org/specs/web-apps/current-work/multipage/video.html#mediacontroller
Many use-cases require audio effects to be applied to some of those streams
while maintaining synchronization.
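
One way to picture the problem: when only some streams pass through effects,
the unprocessed streams have to wait for the slowest processed one. The
TypeScript sketch below is only an assumption about how such compensation
could be computed. SyncedStream and compensationDelays are hypothetical
names, not part of the MediaController proposal.

```typescript
// Hypothetical sketch: the types and function below are illustrative only.

// A stream taking part in synchronized playback, with the latency (in
// milliseconds) added by whatever processing is applied to it.
interface SyncedStream {
  id: string;
  processingLatencyMs: number;
}

// The extra delay a given stream needs to stay aligned with the others.
interface StreamDelay {
  id: string;
  extraDelayMs: number;
}

// For each stream, compute the extra delay needed so that all streams line
// up with the one that has the largest processing latency.
function compensationDelays(streams: SyncedStream[]): StreamDelay[] {
  const maxLatency = streams.reduce(
    (max, s) => Math.max(max, s.processingLatencyMs),
    0
  );
  return streams.map((s) => ({
    id: s.id,
    extraDelayMs: maxLatency - s.processingLatencyMs,
  }));
}

// Example values are made up for illustration.
const extraDelays = compensationDelays([
  { id: "video", processingLatencyMs: 0 },
  { id: "commentary", processingLatencyMs: 0 },
  { id: "music-with-reverb", processingLatencyMs: 40 },
]);
```

Here the two unprocessed streams would each be held back by 40 ms and the
processed stream by none.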

4) Worker-based JavaScript audio processing
Authors will always need custom audio effects or synthesis not supported
directly in an audio spec. We need a way to produce such effects
conveniently in JavaScript with the best possible performance, especially
low latency. Processing audio in Web Workers would insulate the effects code
from latency caused by tasks on the HTML event loop. Workers have logically
separate heaps so garbage-collection latency can also be minimized.
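
To make the idea concrete, here is a minimal TypeScript sketch of a custom
effect written as a pure function that a Worker could call for each block of
samples. The message shape (AudioBlockMessage) and the function names are
assumptions made for illustration. They are not the API sketched in the
proposal linked below.

```typescript
// Hypothetical sketch: the message shape and function names are assumptions,
// not part of any proposal or specification.

// A block of mono audio samples handed to the worker for processing.
interface AudioBlockMessage {
  samples: Float32Array;
  gain: number;
}

// The effect itself: a pure function, so it can run on any thread.
// Applies a gain and clamps the result to the [-1, 1] sample range.
function applyGain(samples: Float32Array, gain: number): Float32Array {
  const out = new Float32Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    out[i] = Math.max(-1, Math.min(1, samples[i] * gain));
  }
  return out;
}

// What a worker's message handler might do with each incoming block.
function handleAudioBlock(msg: AudioBlockMessage): Float32Array {
  return applyGain(msg.samples, msg.gain);
}

// Inside a Worker script, the wiring could look like:
//   self.onmessage = (e) => self.postMessage(handleAudioBlock(e.data));

// Example values are made up for illustration.
const processedBlock = handleAudioBlock({
  samples: new Float32Array([0.25, -0.5, 0.75]),
  gain: 2,
});
```

Keeping the effect free of any dependency on the page is what lets it run
away from the HTML event loop. How blocks actually reach the Worker is a
design question for the API itself.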

I have put a sketch of an API proposal here that attempts to address those
requirements:
https://wiki.mozilla.org/MediaStreamAPI
It's only a week old and I don't think it's ready to formally propose to a
Working Group. I feel it needs at least a prototype implementation, both to
flesh out the parts that are unclear and to ensure that it's actually
implementable. I plan to do that ASAP. However, given the interest in this
area I want to let people know what we are thinking about, so that there are
no surprises later.

BTW, constructive feedback welcome, but I'm more interested in getting the
concepts right than picking over details.

Thanks,
Rob
-- 
"Now the Bereans were of more noble character than the Thessalonians, for
they received the message with great eagerness and examined the Scriptures
every day to see if what Paul said was true." [Acts 17:11]

Received on Wednesday, 13 April 2011 17:45:10 UTC