Re: WebAudio Implementation feedback from Chris Wilson on 2012-03-02 (public-audio@w3.org from January to March 2012)

From: Chris Wilson <cwilso@google.com>
Date: Fri, 2 Mar 2012 12:12:15 -0800
To: Chris Rogers <crogers@google.com>
Cc: Jer Noble <jer.noble@apple.com>, public-audio@w3.org
Message-ID: <CAJK2wqVSwskut4WLGB9V4GkOLWKFymx9O4n9u-8OJG6oWNciow@mail.gmail.com>
Yeesh.  Actually, I had a long back-and-forth with Jer in response to this,
and only realized after you sent this that I'd hit reply, not reply-all.
 Thread pasted for your amusement.

I said:

Not sure I understand your "use case" column.  :)

Interesting idea.  I'd suggest that the same restrictions should really
somehow apply to <audio>, et al, though - and they don't, historically.  It
seems a bit odd to put additional restrictions on Web Audio.

Additionally, the app developer can implement this use case with the Page
Visibility API (http://www.w3.org/TR/2011/WD-page-visibility-20110602/).
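
As a rough sketch of that approach (not part of the original exchange), the page itself can mute its output whenever it is hidden. The helper name `muteWhenHidden` and the two minimal interfaces are assumptions made for illustration. `hidden` and `visibilitychange` are the unprefixed names from the Page Visibility API; browsers of the time may have exposed them only behind vendor prefixes.

```typescript
// Minimal structural types so the sketch does not depend on browser typings.
interface VisibilitySource {
  hidden: boolean;
  addEventListener(type: "visibilitychange", listener: () => void): void;
}

interface GainLike {
  gain: { value: number };
}

// Hypothetical helper: silence `master` while the page is hidden and
// restore `audibleGain` when it becomes visible again.
function muteWhenHidden(
  doc: VisibilitySource,
  master: GainLike,
  audibleGain: number = 1
): void {
  const apply = (): void => {
    master.gain.value = doc.hidden ? 0 : audibleGain;
  };
  doc.addEventListener("visibilitychange", apply);
  apply(); // set the initial state without waiting for an event
}
```

In a browser, `doc` would presumably be `document` and `master` a gain node placed just before the context's destination.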


Jer said:

On Feb 28, 2012, at 3:45 PM, Chris Wilson <cwilso@google.com> wrote:

> Not sure I understand your "use case" column.  :)

Yeah, I should probably have mapped them to the official use cases from <
http://www.w3.org/2011/audio/wiki/Use_Cases_and_Requirements>.

> Interesting idea.  I'd suggest that the same restrictions should really
> somehow apply to <audio>, et al, though - and they don't, historically.  It
> seems a bit odd to put additional restrictions on Web Audio.

Keep in mind that there are some platforms which have these same
restrictions on <audio> elements, including the Android Chrome browser.

> Additionally, the app developer can implement this use case with the Page
> Visibility API (http://www.w3.org/TR/2011/WD-page-visibility-20110602/).

The idea would be that this is not up to the app developer, but rather to
the UA.


Then I said:

On Tue, Feb 28, 2012 at 3:49 PM, Jer Noble <jer.noble@apple.com> wrote:

> On Feb 28, 2012, at 3:45 PM, Chris Wilson <cwilso@google.com> wrote:
>
> > Not sure I understand your "use case" column.  :)
>
> Yeah, I should probably have mapped them to the official use cases from <
> http://www.w3.org/2011/audio/wiki/Use_Cases_and_Requirements>.
>

Not necessarily.  I just meant under no circumstances would I want
unmaskable STTNG theme playing - and it seemed like the hum of the engines
would mute all other background sounds?

> > Interesting idea.  I'd suggest that the same restrictions should really
> > somehow apply to <audio>, et al, though - and they don't, historically.  It
> > seems a bit odd to put additional restrictions on Web Audio.
>
> Keep in mind that there are some platforms which have these same
> restrictions on <audio> elements, including the Android Chrome browser.
>
> > Additionally, the app developer can implement this use case with the Page
> > Visibility API (http://www.w3.org/TR/2011/WD-page-visibility-20110602/).
>
> The idea would be that this is not up to the app developer, but rather to
> the UA.
>

Right, I get that this would put the UA in control, not the app developer.
 I didn't realize that the Android Chrome browser did this (and I expect
iOS does too?)  I guess what I was saying is that I could see this as a
"sound visibility API" - reporting back whether it's playing or not - but
if I read your suggestion correctly, this would mean if I was writing a web
app, and running in a desktop context, I would have to explicitly get
approval from the user in order to keep playing in the background.  That's
an additional constraint on desktop (and a common scenario - I have a
buried Pandora tab playing most of the time) - and would only be on the Web
Audio API, as <audio> wouldn't have to ask.


Then Jer said:

On Feb 28, 2012, at 3:58 PM, Chris Wilson <cwilso@google.com> wrote:

> On Tue, Feb 28, 2012 at 3:49 PM, Jer Noble <jer.noble@apple.com> wrote:
>
> > On Feb 28, 2012, at 3:45 PM, Chris Wilson <cwilso@google.com> wrote:
> >
> > > Not sure I understand your "use case" column.  :)
> >
> > Yeah, I should probably have mapped them to the official use cases from <
> > http://www.w3.org/2011/audio/wiki/Use_Cases_and_Requirements>.
>
> Not necessarily.  I just meant under no circumstances would I want
> unmaskable STTNG theme playing - and it seemed like the hum of the engines
> would mute all other background sounds?

Under this hypothetical scenario, you'd mux the engine sound in with the
theme song, if it were available.

> > > Interesting idea.  I'd suggest that the same restrictions should really
> > > somehow apply to <audio>, et al, though - and they don't, historically.  It
> > > seems a bit odd to put additional restrictions on Web Audio.
> >
> > Keep in mind that there are some platforms which have these same
> > restrictions on <audio> elements, including the Android Chrome browser.
> >
> > > Additionally, the app developer can implement this use case with the Page
> > > Visibility API (http://www.w3.org/TR/2011/WD-page-visibility-20110602/).
> >
> > The idea would be that this is not up to the app developer, but rather to
> > the UA.
>
> Right, I get that this would put the UA in control, not the app developer.
> I didn't realize that the Android Chrome browser did this (and I expect
> iOS does too?)  I guess what I was saying is that I could see this as a
> "sound visibility API" - reporting back whether it's playing or not - but
> if I read your suggestion correctly, this would mean if I was writing a web
> app, and running in a desktop context, I would have to explicitly get
> approval from the user in order to keep playing in the background.  That's
> an additional constraint on desktop (and a common scenario - I have a
> buried Pandora tab playing most of the time) - and would only be on the Web
> Audio API, as <audio> wouldn't have to ask.

Like I said, this is just hypothetical.  Getting explicit user approval
sucks and usually results in bad, annoying UI.  Usually, it's enough to
either imply permission (as in requiring a user gesture) or ask forgiveness
(tell the user what's happening and give them a chance to disable it).

And you're right that it'd be weird for there to be more restrictions on an
<audio> element than on the equivalent WebAudio context.  But the UA would
see to it that their restrictions were basically equivalent.

But I'm not suggesting that the spec mandate UA behavior (such as requiring
user interaction), but rather that the API provide enough information so
that the UA can make an informed decision about what behavior to apply.
Desktop UAs may choose not to apply any restrictions at all.

And then I said:

On Tue, Feb 28, 2012 at 4:07 PM, Jer Noble <jer.noble@apple.com> wrote:

> But I'm not suggesting that the spec mandate UA behavior (such as
> requiring user interaction), but rather that the API provide enough
> information so that the UA can make an informed decision about what
> behavior to apply.  Desktop UAs may choose not to apply any restrictions at
> all.
>

But if the API states an intent that the UA should fade sounds when I
switch apps, then the app developers are going to rely on that, rather than
doing it themselves.  A game app, for example, is just going to state the
intent of "foreground", and expect the UA to deal with turning off the loud
engine noise when I switch to another tab.

At the very least, I think you'd need this API to answer back to the app
developer what is ACTUALLY going to happen (a la the Page Visibility API),
not just have the app developer state their intent and leave it up to the
UA - or you are pretty much mandating UA behavior (and unevenly, across Web
Audio and other media APIs).
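
To make that concrete, here is a hedged sketch of an API that reports the effective behavior rather than only accepting an intent. The intent names and their defaults are taken from Jer's table; the `Policy` shape, `PROPOSED_DEFAULTS`, and `effectivePolicy` are hypothetical names, not anything in the Web Audio draft.

```typescript
type Intent = "foreground" | "background" | "media";

interface Policy {
  playsWhenNotFrontmost: boolean;
  mutesOtherIntents: Intent[];
  pausesSystemMedia: boolean;
  requires: "nothing" | "user-gesture" | "explicit-permission";
}

// Defaults transcribed from the proposed intent table.
const PROPOSED_DEFAULTS: Record<Intent, Policy> = {
  foreground: {
    playsWhenNotFrontmost: false,
    mutesOtherIntents: [],
    pausesSystemMedia: false,
    requires: "nothing",
  },
  background: {
    playsWhenNotFrontmost: false,
    mutesOtherIntents: ["background"],
    pausesSystemMedia: false,
    requires: "user-gesture",
  },
  media: {
    playsWhenNotFrontmost: true,
    mutesOtherIntents: ["media", "background"],
    pausesSystemMedia: true,
    requires: "explicit-permission",
  },
};

// Hypothetical query: a UA may deviate from the defaults, and the app
// reads back what will actually happen instead of assuming it.
function effectivePolicy(
  intent: Intent,
  uaOverrides: Partial<Policy> = {}
): Policy {
  return { ...PROPOSED_DEFAULTS[intent], ...uaOverrides };
}
```

A desktop UA that applies no tab restriction might answer `effectivePolicy("foreground", { playsWhenNotFrontmost: true })`, and a game could check that field before deciding whether it has to silence its own engine noise.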


And finally, Jer responded:

On Feb 28, 2012, at 4:14 PM, Chris Wilson <cwilso@google.com> wrote:

> But if the API states an intent that the UA should fade sounds when I
> switch apps, then the app developers are going to rely on that, rather than
> doing it themselves.  A game app, for example, is just going to state the
> intent of "foreground", and expect the UA to deal with turning off the loud
> engine noise when I switch to another tab.
>
> At the very least, I think you'd need this API to answer back to the app
> developer what is ACTUALLY going to happen (a la the Page Visibility API),
> not just have the app developer state their intent and leave it up to the
> UA - or you are pretty much mandating UA behavior (and unevenly, across Web
> Audio and other media APIs).

That's true.  Perhaps we need, in addition to the intent, a set of events
which app developers can listen for.
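
One possible shape for such events, sketched under assumptions: the event names `muted` and `resumed`, the `PlaybackNotifier` class, and the `reason` string are all invented for illustration. The UA would call `notify` when it changes what the context is allowed to do, and the app would react instead of guessing.

```typescript
type PlaybackEvent = "muted" | "resumed";
type PlaybackListener = (reason: string) => void;

class PlaybackNotifier {
  private listeners: Record<PlaybackEvent, PlaybackListener[]> = {
    muted: [],
    resumed: [],
  };

  // App side: register interest in a UA-driven change.
  on(event: PlaybackEvent, listener: PlaybackListener): void {
    this.listeners[event].push(listener);
  }

  // UA side: tell every registered listener what just happened.
  notify(event: PlaybackEvent, reason: string): void {
    for (const listener of this.listeners[event]) {
      listener(reason);
    }
  }
}
```

A game might register `notifier.on("muted", ...)` to pause its simulation when the UA silences it, which covers the case where the UA's decision differs from the stated intent.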



Sorry about that - need to switch my default to reply-all, I think.

-Chris

On Wed, Feb 29, 2012 at 5:46 PM, Chris Rogers <crogers@google.com> wrote:

> Hi Jer, interesting questions.  I see nobody has chimed in yet, so I will!
>
> On Tue, Feb 28, 2012 at 3:24 PM, Jer Noble <jer.noble@apple.com> wrote:
>
>> Hi all,
>>
>> I would like to raise an issue with the WG which has come up during our
>> (Apple's) implementation of the WebAudio API.
>>
>> We were trying to decide what the correct behavior of the API should be
>> under various system scenarios. A few (entirely hypothetical) examples:
>>
>>
>>    - Should the audio from an AudioContext continue to play when the
>>    owning tab is not frontmost?
>>
> I think a specific user agent could implement this differently, but
> muting audio on a hidden tab *might* be a reasonable default.  I guess this
> corresponds to your "foreground" mode?  I would make the distinction here
> between hidden tab and a visible window which is not front-most.  Perhaps
> the page visibility API comes in here?  One possible answer to your
> question is that it should be the author's decision whether to mute the
> audio (by stopping playback or programmatically turning down gain) based on
> events from the page visibility API, and that no hints in the AudioContext
> constructor are needed.
>
>
>
>>
>>    - Should an AudioContext be allowed to play without being initiated
>>    with a user gesture?
>>
> I hope the answer is yes.  I think it would be unacceptably cumbersome in
> the common/default case.  In most user agents, an <audio> element is
> allowed to play normally without user permission.  I think most people
> would expect to be able to visit a page and hear audio in a simple and
> straight-forward manner.  The one exception, I suppose, is the "exclusive
> access" mode which you describe next:
>
>>
>>    - Should audio from an AudioContext cause other audio on the system
>>    to halt?
>>
> I don't think it should, at least not when running in a desktop browser.
>  Using Mac OS X as an example, each application is allowed to generate
> audio, and it's up to the user to control playback in each.  For example,
> iTunes may be playing music at the same time as a user is playing a game in
> another application, with the audio being mixed.  I understand that you may
> be approaching this from the iOS perspective which is slightly different
> than the desktop OS, so maybe there could be special hints to request
> exclusive access as you suggest.  But, in general, running on a desktop OS,
> it may be difficult to gain exclusive access to the audio hardware, and
> even if possible would be considered annoying - at least that's my feeling.
>
> So in short, it shouldn't be the default behavior, but maybe have your
> special hint which requires permission from the user as you suggest. I'm
> not sure that the hint could be respected for all user agents.
>
>
>
>>
>>    - What should happen when AudioContexts from two separate pages
>>    decide to play at the same time?
>>
> That's a very interesting question, considering that we're not talking
> about hidden tabs, but two windows side-by-side and both visible. I think
> the default behavior should be that both generate sound.  I've often
> *intentionally* played around with two web audio pages playing
> simultaneously.  There are definitely real-world uses for that, and I think
> it should be the default behavior.
>
> On iOS Safari I don't know if there's a concept of multiple visible pages,
> or only one visible page and hidden tabs.  If there's just one visible page
> then I guess that's the case of your first bullet point (above).
>
>
>
>>
>> These are all UA decisions which will probably vary greatly from platform
>> to platform, but given the wide breadth of use cases which this API will
>> enable, we found it very difficult to decide what the correct default
>> behaviors should be without more information about what use case is being
>> attempted.  For some use cases, the behavior is pretty clear:
>>
>>
>>    - If the use case is simulating UI sounds (*beeping* when the user
>>    clicks a button), audio should mux with system audio and other in-page
>>    audio.  Limiting audio to only visible tabs would be fine here.
>>
> Seems like a reasonable "default" case.
>
>
>
>>
>>    - For background music, the UA might want to halt or duck other
>>    system level background audio (by pausing Windows Media Player or iTunes,
>>    for example).  Perhaps the UA would want to limit background audio to only
>>    a single instance at a time.  The UA might still want to mute this audio
>>    when the user switches to another tab.
>>    - For the podcast case, the limitations might be the same as above,
>>    but might additionally want to relax the frontmost tab restriction.  In
>>    return, the UA might insist that the AudioContext be created during a user
>>    gesture, or by displaying a permission banner.
>>
>>
>> So I'd like to propose the addition of an optional parameter to the
>> AudioContext constructor, entitled, for the lack of a better word,
>> "intent".  I'll illustrate with a hypothetical Web Audio API use case, a
>> web app which recreates the LCARS UI from STtNG <http://en.wikipedia.org/wiki/LCARS>,
>> and the restrictions and privileges a hypothetical UA might impose and
>> grant:
>>
>>   Intent: "foreground"
>>     Privileges:   None.
>>     Restrictions: Muted when not in frontmost tab/window.
>>     Use case:     Play a subtle "chime" when the user clicks or touches
>>                   various UI elements.
>>
>>   Intent: "background"
>>     Privileges:   Mutes other "background" AudioContexts.
>>     Restrictions: Muted when not in frontmost tab/window.
>>                   Must begin with a user interaction.
>>     Use case:     Play the gentle hum of a warp engine.
>>
>>   Intent: "media"
>>     Privileges:   Mutes other "media" and "background" AudioContexts.
>>                   Pauses other system media playback.
>>                   Plays when not frontmost.
>>     Restrictions: Must get explicit permission from the user.
>>     Use case:     Play the theme song of STtNG.
>>
>> In the absence of an "intent" argument, the UA would just pick a default
>> intent, probably the one with the fewest privileges and restrictions.
>>
>> What do people think?
>>
>> -Jer
>>
>
>
Received on Friday, 2 March 2012 20:12:45 UTC