Re: Background audio channels

On Mar 15, 2013, at 10:57 AM, Wesley Johnston wrote:

> In most situations, when the user puts a webpage in the background, any media being played by the page should be paused. Any attempts to play audio by a background page should also be prevented. However, for some sites (music or radio apps) the user would like to continue to hear the app while they do something else. These pages should be able to designate their audio as a type that should keep playing while in the background. The useragent should also attempt to avoid having the stream killed by the operating system if possible.

Why can't this just be handled by the UA?  MobileSafari, for instance, already supports playing audio while the app is "backgrounded".  It even supports playing and pausing <audio> elements with all the standard media playback controls. Were it to support this spec, it would break every page that does not explicitly opt into the "background" channel.

>  This is especially true on mobile devices, but the problem is also already prevalent on desktop.

What does "in the background" mean in a desktop context?  A non-frontmost window?  Minimized?  A non-topmost tab?

> I think semantically we need a way to describe to the useragent how to play a particular track. I'd suggest we add an optional attribute to media elements, "audiochannel", designating the output and priority of this audio. The channel attribute can potentially take on three different values: "normal", "background", and "telephony".
> "normal" channels are the default for all media elements. Using them doesn't require any special permissions. Audio playing with these channels is paused when the web page moves into the background. In addition, calling play on a media element with this channel while in the background will put the element into the paused for user interaction state (i.e. playback won't start until the webapp is brought to the foreground)?
> "background" channels will continue to play when the page is put into the background. Trying to play a background channel while in the background should also work. The ability to play audio on this channel may require requesting permission from the UA first (i.e. possibly a prompt when the audio is first played or when moving to the background). If the user doesn't grant permission, these should throw a MediaError (MEDIA_ERR_CHANNEL_PERMISSION_NOT_GRANTED?) so that the page can know what has happened and do something appropriate.

The "normal" channel will be incredibly frustrating, especially for mobile users.  For the overwhelming majority of <audio> use-cases, a user will be incredibly annoyed if audio pauses while switching tabs or switching to another app.  Every single page will have to update in order to opt into the "background" channel to get (what is currently the default) optimum experience.

If this spec is going to move forward, "background" should be the default.  "normal" should be opt-in, or removed entirely.
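
To make the objection concrete, here is a minimal sketch of the play() behavior as I read the proposal. Every name in it (AudioChannel, PlayRequest, resolvePlay) is hypothetical and exists only for illustration. It also assumes the UA always requires permission for non-"normal" channels, whereas the proposal only says it "may".

```typescript
// Hypothetical model of the proposed "audiochannel" semantics.
// None of these names come from a shipping API.
type AudioChannel = "normal" | "background" | "telephony";

type PlayOutcome =
  | "playing"
  | "paused-for-user-interaction"
  | "permission-error"; // stands in for the proposed MediaError

interface PlayRequest {
  channel: AudioChannel;
  pageInBackground: boolean;
  // Assumption: only consulted for "background" and "telephony".
  permissionGranted: boolean;
}

function resolvePlay(req: PlayRequest): PlayOutcome {
  if (req.channel === "normal") {
    // The default channel never needs permission, but a backgrounded page
    // cannot start (or continue) playback on it.
    return req.pageInBackground ? "paused-for-user-interaction" : "playing";
  }
  // "background" and "telephony" play regardless of visibility,
  // provided the user has granted permission.
  return req.permissionGranted ? "playing" : "permission-error";
}
```

Under this reading, a page that does nothing new gets "paused-for-user-interaction" the moment the user switches tabs, which is the regression described above.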

> "telephony" channels are similar to "background" channels and can play even if the page is in the background. Playing audio on a telephony channel may cause any audio playing on "normal" or "background" channels to be paused or have its volume severely decreased. They also, on devices where it's supported, will likely play over handset speakers rather than normal speakers. Similar to "background", these may require permission from the UA.

Users already have "permission UI" to allow apps use of the handset speakers: the mute switch. Throwing up another permission dialog when the user is trying to answer a webapp "telephone" call is going to suck. (Presumably that webapp will also need permission to use the microphone, so there will be multiple UA permission dialogs up.) And when some user accidentally grants a malicious site "telephony" permission, that site can now blare ads over their handset speakers, and the mute switch is powerless to stop it.

Without the "ignore the mute switch" behavior this channel seems identical to "background".
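
A rough sketch of that comparison follows. It assumes the mute switch silences the other two channels, and that "severely decreased" means scaling by some factor. The names (Channel, DeviceState, effectiveVolume) and the DUCK_FACTOR value of 0.1 are all made up for illustration.

```typescript
// Hypothetical model of how the three proposed channels might mix.
type Channel = "normal" | "background" | "telephony";

interface DeviceState {
  muteSwitchOn: boolean;
  telephonyActive: boolean; // some telephony-channel audio is playing
}

// Assumption: the proposal gives no number for "severely decreased".
const DUCK_FACTOR = 0.1;

function effectiveVolume(
  channel: Channel,
  requested: number,
  state: DeviceState
): number {
  if (channel === "telephony") {
    // The only behavior that separates this channel from "background":
    // it is not silenced by the mute switch and is never ducked.
    return requested;
  }
  if (state.muteSwitchOn) {
    return 0;
  }
  return state.telephonyActive ? requested * DUCK_FACTOR : requested;
}
```

Delete the first branch and "telephony" falls through to exactly the same code path as "background".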

> Note: This is all based rather loosely on the AudioChannels implementation written for B2G recently [1]. It includes a few other use-cases on its wiki page, along with definitions of additional channels to accommodate them. I've been trying to simplify it down to handle the most common use cases. Finding the correct terminology here is difficult though. For instance, it seems likely that games will see the background channel and think it's an appropriate place to play game background music, the exact type of audio you'd like to have paused when you leave the game. Ideas for better ways to describe it are welcome.

This mechanism may make sense for installed apps.  iOS has a similar concept of "Audio Session Categories" [1] which govern how audio is routed, how audio apps interact with one another, how interruptions are handled, and whether playback resumes after an interruption.  However, exposing these app-level concepts to websites, especially in such a coarse-grained way, seems ill-advised.



Received on Wednesday, 10 April 2013 17:24:35 UTC