Re: WebAudio Implementation feedback

Hi Jer, interesting questions.  I see nobody has chimed in yet, so I will!

On Tue, Feb 28, 2012 at 3:24 PM, Jer Noble <jer.noble@apple.com> wrote:

> Hi all,
>
> I would like to raise an issue with the WG which has come up during our
> (Apple's) implementation of the WebAudio API.
>
> We were trying to decide what the correct behavior of the API should be
> under various system scenarios. A few (entirely hypothetical) examples:
>
>
>    - Should the audio from an AudioContext continue to play when the
>    owning tab is not frontmost?
>
I think a specific user agent could implement this differently, but muting
audio on a hidden tab *might* be a reasonable default.  I guess this
corresponds to your "foreground" mode?  I would make the distinction here
between a hidden tab and a visible window which is not frontmost.  Perhaps
the Page Visibility API comes in here?  One possible answer to your
question is that it should be the author's decision whether to mute the
audio (by stopping playback or programmatically turning down gain) based on
events from the Page Visibility API, and that no hints in the AudioContext
constructor are needed.
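
To make that concrete, here's a minimal sketch of that author-controlled
approach, assuming all sources are routed through a master GainNode (the
names follow the current editor's draft; prefixed or older spellings such
as webkitAudioContext and createGainNode() may be needed in some browsers):

    // Route everything through a master GainNode and duck it when the
    // page is hidden, ramping instead of jumping to avoid an audible
    // click. (Names follow the current editor's draft; prefixed or
    // older spellings may be needed in some browsers.)
    var context = new AudioContext();
    var masterGain = context.createGain();
    masterGain.connect(context.destination);
    // ...connect every source to masterGain instead of context.destination...

    document.addEventListener('visibilitychange', function () {
      var target = document.hidden ? 0 : 1;
      masterGain.gain.setTargetAtTime(target, context.currentTime, 0.05);
    }, false);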



>
>    - Should an AudioContext be allowed to play without being initiated
>    with a user gesture?
>
I hope the answer is yes.  I think requiring a user gesture would be
unacceptably cumbersome in the common/default case.  In most user agents,
an <audio> element is allowed to play normally without user permission.  I
think most people would expect to be able to visit a page and hear audio in
a simple and straightforward manner.  For comparison, here's roughly the
ceremony a gesture-requiring UA would force on every page (just a sketch;
`playButton` is a hypothetical element, and `context` and the decoded
AudioBuffer `buffer` are assumed to already exist):

>
>    - Should audio from an AudioContext cause other audio on the system to
>    halt?
>
I don't think it should, at least not when running in a desktop browser.
Using Mac OS X as an example, each application is allowed to generate
audio, and it's up to the user to control playback in each.  For example,
iTunes may be playing music at the same time as a user is playing a game in
another application, with the audio being mixed.  I understand that you may
be approaching this from the iOS perspective, which is slightly different
from the desktop OS, so maybe there could be special hints to request
exclusive access as you suggest.  But, in general, running on a desktop OS,
it may be difficult to gain exclusive access to the audio hardware, and
even if possible it would be considered annoying - at least that's my
feeling.

So in short, it shouldn't be the default behavior, but maybe there could be
your special hint, requiring permission from the user, as you suggest.  I'm
not sure that the hint could be respected by all user agents.



>
>    - What should happen when AudioContexts from two separate pages decide
>    to play at the same time?
>
That's a very interesting question, considering that we're not talking
about hidden tabs, but about two windows side-by-side, both visible.  I
think the default behavior should be that both generate sound.  I've often
*intentionally* played around with two web audio pages playing
simultaneously, and there are definitely real-world uses for that.

On iOS Safari I don't know if there's a concept of multiple visible pages,
or only one visible page and hidden tabs.  If there's just one visible page
then I guess that's the case of your first bullet point (above).



>
> These are all UA decisions which will probably vary greatly from platform
> to platform, but given the wide breadth of use cases which this API will
> enable, we found it very difficult to decide what the correct default
> behaviors should be without more information about what use case is being
> attempted.  For some use cases, the behavior is pretty clear:
>
>
>    - If the use case is simulating UI sounds (*beeping* when the user
>    clicks a button), audio should mux with system audio and other in-page
>    audio.  Limiting audio to only visible tabs would be fine here.
>
Seems like a reasonable "default" case.
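
And it needs very little machinery; a sketch, assuming a shared
AudioContext named `context` (start()/stop() were noteOn()/noteOff() in
older drafts):

    // A short, quiet beep that simply mixes with system and other
    // in-page audio. An OscillatorNode is used for brevity; a short
    // pre-decoded buffer would work just as well.
    function beep() {
      var osc = context.createOscillator();
      var env = context.createGain();
      osc.frequency.value = 880;  // a brief A5 blip
      env.gain.setValueAtTime(0.2, context.currentTime);
      env.gain.exponentialRampToValueAtTime(0.001, context.currentTime + 0.1);
      osc.connect(env);
      env.connect(context.destination);
      osc.start(context.currentTime);
      osc.stop(context.currentTime + 0.1);
    }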



>
>    - For background music, the UA might want to halt or duck other system
>    level background audio (by pausing Windows Media Player or iTunes, for
>    example).  Perhaps the UA would want to limit background audio to only a
>    single instance at a time.  The UA might still want to mute this audio when
>    the user switches to another tab.
>    - For the podcast case, the limitations might be the same as above,
>    but might additionally want to relax the frontmost tab restriction.  In
>    return, the UA might insist that the AudioContext be created during a user
>    gesture, or by displaying a permission banner.
>
>
> So I'd like to propose the addition of an optional parameter to the
> AudioContext constructor, entitled, for lack of a better word,
> "intent".  I'll illustrate with a hypothetical Web Audio API use case, a
> web app which recreates the LCARS UI from STtNG
> <http://en.wikipedia.org/wiki/LCARS>, and the restrictions and
> privileges a hypothetical UA might impose and grant:
>
> Intent: "foreground"
> Privileges: None.
> Restrictions: Muted when not in frontmost tab/window.
> Use Case: Play a subtle "chime" when the user clicks or touches various
> UI elements.
>
> Intent: "background"
> Privileges: Mutes other "background" AudioContexts.
> Restrictions: Muted when not in frontmost tab/window. Must begin with a
> user interaction.
> Use Case: Play the gentle hum of a warp engine.
>
> Intent: "media"
> Privileges: Mutes other "media" and "background" AudioContexts. Pauses
> other system media playback. Plays when not frontmost.
> Restrictions: Must get explicit permission from the user.
> Use Case: Play the theme song of STtNG.
>
> In the absence of an "intent" argument, the UA would just pick a default
> intent, probably the one with the fewest privileges and restrictions.
>
> What do people think?
>
> -Jer
>
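
For concreteness, I'd picture the call sites under this proposal looking
something like the following.  This is entirely hypothetical - no shipping
constructor accepts such an argument, and the dictionary shape is just one
possible spelling of the optional parameter:

    // Entirely hypothetical call sites for the proposed "intent" hint.
    var uiSounds  = new AudioContext({ intent: 'foreground' });
    var engineHum = new AudioContext({ intent: 'background' });
    var theme     = new AudioContext({ intent: 'media' });

    // With no argument, the UA would pick a default intent - presumably
    // the one with the fewest privileges and restrictions.
    var plain = new AudioContext();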
