WebAudio Implementation feedback

Hi all,

I would like to raise an issue with the WG which has come up during our (Apple's) implementation of the WebAudio API.

We were trying to decide what the correct behavior of the API should be under various system scenarios. A few (entirely hypothetical) examples:

Should the audio from an AudioContext continue to play when the owning tab is not frontmost? 
Should an AudioContext be allowed to play without being initiated with a user gesture?
Should audio from an AudioContext cause other audio on the system to halt?
What should happen when AudioContexts from two separate pages decide to play at the same time?

These are all UA decisions which will probably vary greatly from platform to platform, but given the wide breadth of use cases which this API will enable, we found it very difficult to decide what the correct default behaviors should be without more information about what use case is being attempted.  For some use cases, the behavior is pretty clear:

If the use case is simulating UI sounds (*beeping* when the user clicks a button), audio should mux with system audio and other in-page audio.  Limiting audio to only visible tabs would be fine here.
For background music, the UA might want to halt or duck other system-level background audio (by pausing Windows Media Player or iTunes, for example).  Perhaps the UA would want to limit background audio to only a single instance at a time.  The UA might still want to mute this audio when the user switches to another tab.
For the podcast case, the limitations might be the same as above, but the UA might additionally want to relax the frontmost tab restriction.  In return, the UA might insist that the AudioContext be created during a user gesture, or only after the user accepts a permission banner.

So I'd like to propose the addition of an optional parameter to the AudioContext constructor, named, for lack of a better word, "intent".  I'll illustrate with a hypothetical Web Audio API use case, a web app which recreates the LCARS UI from STtNG, and the restrictions and privileges a hypothetical UA might impose and grant:

Intent: "foreground"
    Privileges:   None.
    Restrictions: Muted when not in frontmost tab/window.
    Use case:     Play a subtle "chime" when the user clicks or touches various UI elements.

Intent: "background"
    Privileges:   Mutes other "background" AudioContexts.
    Restrictions: Muted when not in frontmost tab/window.
                  Must begin with a user interaction.
    Use case:     Play the gentle hum of a warp engine.

Intent: "media"
    Privileges:   Mutes other "media" and "background" AudioContexts.
                  Pauses other system media playback.
                  Plays when not frontmost.
    Restrictions: Must get explicit permission from the user.
    Use case:     Play the theme song of STtNG.
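
To make the table a bit more concrete, here is a rough sketch of how a UA might encode it internally. Every name below (AudioIntent, IntentPolicy, TabState, isAudible) is hypothetical and made up for illustration; none of it is part of the Web Audio API, and a real UA would likely weigh more factors than these.

```typescript
// Hypothetical sketch only: "intent" is a proposal, not an existing API.
type AudioIntent = "foreground" | "background" | "media";

// One row of the table above, expressed as data.
interface IntentPolicy {
  mutesIntents: AudioIntent[];     // intents of other contexts that get muted
  pausesSystemMedia: boolean;      // e.g. pause a system media player
  playsWhenNotFrontmost: boolean;  // privilege: keeps playing in a hidden tab
  requiresUserGesture: boolean;    // restriction: must begin with a user interaction
  requiresUserPermission: boolean; // restriction: explicit permission from the user
}

const INTENT_POLICIES: Record<AudioIntent, IntentPolicy> = {
  foreground: {
    mutesIntents: [],
    pausesSystemMedia: false,
    playsWhenNotFrontmost: false,
    requiresUserGesture: false,
    requiresUserPermission: false,
  },
  background: {
    mutesIntents: ["background"],
    pausesSystemMedia: false,
    playsWhenNotFrontmost: false,
    requiresUserGesture: true,
    requiresUserPermission: false,
  },
  media: {
    mutesIntents: ["media", "background"],
    pausesSystemMedia: true,
    playsWhenNotFrontmost: true,
    requiresUserGesture: false,
    requiresUserPermission: true,
  },
};

// Assumed inputs a UA would know about the tab that owns the context.
interface TabState {
  isFrontmost: boolean;
  startedByUserGesture: boolean;
  userGrantedPermission: boolean;
}

// Returns true if a context with this intent should currently be heard.
function isAudible(intent: AudioIntent, state: TabState): boolean {
  const policy = INTENT_POLICIES[intent];
  if (!state.isFrontmost && !policy.playsWhenNotFrontmost) return false;
  if (policy.requiresUserGesture && !state.startedByUserGesture) return false;
  if (policy.requiresUserPermission && !state.userGrantedPermission) return false;
  return true;
}
```

With this encoding, a "media" context in a hidden tab stays audible once the user has granted permission, while "foreground" and "background" contexts in the same tab are muted.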

In the absence of an "intent" argument, the UA would just pick a default intent, probably the one with the fewest privileges and restrictions.
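
From the page author's side, the optional argument might look roughly like the sketch below. SketchAudioContext is a hypothetical stand-in class, not the real AudioContext, and choosing "foreground" as the default is only my assumption based on it carrying the fewest privileges in the table above.

```typescript
// Hypothetical sketch only: the real AudioContext takes no "intent" argument.
type AudioIntent = "foreground" | "background" | "media";

// Assumption: the least-privileged intent is the default.
const DEFAULT_INTENT: AudioIntent = "foreground";

// Stand-in for an AudioContext whose constructor accepts an optional intent.
class SketchAudioContext {
  readonly intent: AudioIntent;

  constructor(intent: AudioIntent = DEFAULT_INTENT) {
    this.intent = intent;
  }
}

// No argument: the UA falls back to the default intent.
const uiSounds = new SketchAudioContext();

// Explicit intent: the page asks for media-style playback.
const themeSong = new SketchAudioContext("media");
```

Existing pages that pass no argument would keep working and simply get the default behavior.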

What do people think?

-Jer

Received on Tuesday, 28 February 2012 23:26:40 UTC