Re: Review of Web Audio Processing: Use Cases and Requirements

Hi Doug - I appreciate the group's responsiveness. I've made a few
responses inline.

I'll top-post a general comment that I don't know the extent and limits
of the intended scope of this work. My feedback was based on what I
could see there. Some of the issues might need to be addressed outside
of the work this group is targeting (e.g., by user agent features around
the API support, or by content authors). However, in my experience, it's
best to consider and at least flag user agent implementation and Web
application best practices even in the design of low-level APIs. We need
to understand how a user agent or content author is expected to play a
role in solving a given problem in order to decide whether a feature is
needed at the API level to support that. So yes, much of what I wrote
about might be addressed by user agents and content authors, but we
shouldn't presume too early that the entire solution is there. Also, we
need to make sure there are flags for those issues so they don't get
overlooked later on in the process, so in my mind an "implementation
note" or something similar is appropriate in various places, even if the design
of the API is not itself changed because of a particular requirement.

Doug Schepers wrote:
> Hi, Michael-
>
> Thanks for sending this review.
>
> We will discuss this at the next telcon (and on this list).  In the
> meantime, I have a few personal comments inline...
>
> On 6/27/12 3:04 PM, Michael Cooper wrote:
>> Doug sent a request to Protocols and Formats Working Group to review Web
>> Audio Processing: Use Cases and Requirements
>> https://dvcs.w3.org/hg/audio/raw-file/tip/reqs/Overview.html. The PFWG
>> is not able to assemble a consensus group review in a quickish
>> turnaround right now, so I am sending my comments individually. I expect
>> other members of PFWG to submit additions later. The version of the
>> document I reviewed was accessed 25 June 2012.
>
> I think it would be good to have a joint telcon, just to get on the
> same page.
I'm sure we can arrange that. I've made a note to follow up on that off
list.
>
> My general comment, as in the IndieUI WG, is that instead of putting
> forth requirements, you should suggest Use Cases, from which we derive
> requirements.  I can help with that.
Ok, I'll accept your help. My first reaction is that I don't know how
most of what I suggested would be expressed as use cases. When I review
for accessibility, I'm usually looking for ways that we don't introduce
problems for people with disabilities. That seems to be more of a
requirement, i.e., "don't break X". The corresponding use case would be
something of an anti-use-case, like "person using AT does X and doesn't
have a problem because of Y".

As I said, I do see the need to write up a use case for a screen reader
user who is receiving audio from an external application as well as
potentially from the operating system, and also interacting with Web
page audio. I'm hoping someone in PF can help me write that one up; I
was just bookmarking it (my comments were originally sent to the PF list).

I suppose it would be possible to create use cases around some of my
other requirements, but I really don't know that they would improve the
clarity of the requirements.
>
>
>> My review focuses primarily on suggesting requirements for features that
>> would improve accessibility. I haven't proposed use cases but do think
>> we need to develop a use case that explains a user with a screen reader
>> and who depends on audio cues from the operating system, who is also
>> interacting with Web application audio as proposed in the other use
>> cases.
>>
>> Need a requirement to allow both per-track and global (with the Web
>> application) control of each audio channel / track / whatever for
>> volume, mute, pause / restart, and autoplay. I think this is implied by
>> the use cases but not spelled out in the requirements.
>
> Seems reasonable, though I think maybe more details would be helpful.
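To give a bit more detail: what I have in mind is roughly a graph
where every track passes through its own gain node and then through a
single shared "master" gain node, so volume and mute can be applied
either per track or globally. A rough sketch, with the caveat that the
method names below follow my reading of the current draft and may not
match any shipping implementation:

    // Sketch only: per-track gains feed one master gain, giving
    // both per-track and global volume/mute control points.
    const ctx = new AudioContext();
    const master = ctx.createGain();       // global control point
    master.connect(ctx.destination);

    function addTrack(buffer: AudioBuffer) {
      const src = ctx.createBufferSource();
      const gain = ctx.createGain();       // per-track control point
      src.buffer = buffer;
      src.connect(gain);
      gain.connect(master);
      return { src, gain };
    }

    // Global mute: master.gain.value = 0;
    // Per-track mute: track.gain.value = 0;

Pause/restart and autoplay would need control points of their own;
the sketch only covers volume and mute.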
>
>
>> Suggest a requirement to allow audio channels to be designated as
>> related to each other (e.g., a voice and instrumental overlay) so a
>> volume change or pause of one of them affects all of the related ones,
>> to allow them to stay in sync, yet allowing "unrelated" tracks (e.g.,
>> sound effects) to be treated independently.
>
> This is interesting; I don't know if this should be addressed at the
> webapp level, or the API level.
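My instinct is that the graph structure itself might express the
relation: related tracks share an intermediate gain node, and
unrelated tracks bypass it. Continuing the sketch above (again, the
names are illustrative, not normative):

    // "Related" tracks share a group gain node so one change keeps
    // them in sync; the effects track stays independent.
    const group = ctx.createGain();
    group.connect(master);

    const voiceGain = ctx.createGain();
    const instrumentalGain = ctx.createGain();
    const effectsGain = ctx.createGain();

    voiceGain.connect(group);
    instrumentalGain.connect(group);
    effectsGain.connect(master);   // independent of the group

    group.gain.value = 0.5;  // lowers voice + instrumental together

If that is already expressible, perhaps the requirement is simply that
the API not lose this property, plus authoring guidance on using it.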
>
>
>> Need a requirement that audio controls not affect system audio settings
>> i.e., should have impact just for audio under control of the Web
>> application. For instance, it would be highly problematic if a "mute" in
>> the Web application muted all system audio. If other requirements mean
>> system audio will be controllable from Web applications (this is not
>> clear to me one way or the other right now), then instead the
>> requirement is that users have an easy way to control whether to allow
>> system audio settings to be impacted by in-application changes (e.g.,
>> via a user preference option in user agent).
>
> This one is more challenging.  In a traditional browser, there is no
> way for the API to do that anyway, so there's no need to spell out the
> requirement. In a "system OS" browser (something like Boot2Gecko),
> there may not be any distinction between the browser and the OS, so
> this requirement proposal doesn't make sense there either.
I agree that *current* traditional browsers don't provide a way for APIs
to know about system audio. But the work of this group is to provide new
functionality, and it may turn out to be a core requirement that
implementing user agents have this ability. At the very least, I think
the deliverable needs to explicitly say whether or not access to system
audio is expected, and then if so, describe the scope, mechanisms, and
limitations of that. If that is something that ends up varying among
implementations, there could be major interoperability issues with how
audio-enabled Web pages work in various user agents.

Some of the accessibility requirements I proposed essentially require
that the user agent at least know when certain types of sounds are
playing, such as screen reader output and system alerts, so Web
application audio can avoid drowning it out. So from my perspective at
least some level of connection to system audio APIs is needed.
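
To make the shape of the hook concrete, here is a purely hypothetical
sketch -- none of these names exist in any draft; I am only
illustrating the kind of information a page would need:

    // Hypothetical API, invented for illustration only.
    interface SystemAudioStatus {
      screenReaderSpeaking: boolean;
      systemAlertPlaying: boolean;
      otherAudioPlaying: boolean;  // any non-page audio at all
    }
    declare const systemAudio: {
      query(): SystemAudioStatus;
      onchange: ((s: SystemAudioStatus) => void) | null;
    };

    // A page could "duck" its own output while the screen reader
    // talks, using the master gain node from the earlier sketch:
    systemAudio.onchange = (s) => {
      master.gain.value = s.screenReaderSpeaking ? 0.2 : 1.0;
    };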

To make a case that this is not just an accessibility requirement, let
me describe a very short use case: a user is listening to a CD or MP3 or
something on their computer, and surfing the Web. They happen across a
site that automatically streams music, perhaps as samples of recordings
the user might purchase. If the music just starts playing, it will
overlap with the already playing music and provide a poor experience.
The Web page should be able to detect that audio is playing, and instead
give a message to the user saying "I have a music clip to play for you,
but you'll need to pause your other audio first". And depending on the
level of connection with the system we decide to have, the message might
also provide a button, in the Web page, to pause the audio being played
by the desktop application, rather than requiring the user to switch
applications.
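
In code terms, reusing the invented systemAudio hook from above
(again: hypothetical names, only the shape matters):

    // Don't barge in over the user's existing audio; explain and
    // offer a choice instead.
    declare function showNotice(text: string): void;

    function offerClip(clip: HTMLAudioElement) {
      if (systemAudio.query().otherAudioPlaying) {
        showNotice("I have a music clip to play for you, but " +
                   "you'll need to pause your other audio first.");
      } else {
        clip.play();
      }
    }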
>
>
>> The description of audio sprites brings up an issue, though I'm not
>> sure of the exact requirement. Sprites that play in response to certain actions
>> could prove extremely distracting to some users or could be more likely
>> to cause momentary problems with comprehension of screen reader output
>> etc. Users should therefore have the ability to prevent audio from
>> playing, not just stop it after it's begun playing. This may be a
>> requirement on Web applications, not on the audio APIs, but it might be
>> that the APIs need to make this possible in some way. It would also be
>> helpful to come up with a small ontology of audio roles (for instance,
>> music, speech, sound effects, etc.) so users could easily prevent audio
>> of one type (e.g., sprites) from playing without preventing other types
>> (e.g., music) from playing. Perhaps also needed is a requirement on user
>> agents to offer a preference to users to allow audio to play
>> automatically or only on specific request, recognizing that setting this
>> preference could interfere greatly with the smooth function of some
>> types of applications.
>
> We've discussed the idea of a "global mute" of all Web Audio API
> sounds, and I agree this would be useful.
>
> To enable the scenario you lay out, though, the UA might provide a
> preference to do so, or even have its own volume control in the UI;
> but that is not a requirement on the API... it's a requirement on the
> UA itself.
Maybe so, but at minimum this need should be carried forward to
whatever document describes user agent implementation.
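
To record the "roles" idea in slightly more concrete form -- again
purely hypothetical, nothing like this exists in the draft:

    // Each source declares a role; playback is gated by role
    // according to user preferences held by the UA or a library.
    type AudioRole = "music" | "speech" | "effect" | "alert";

    const blockedRoles = new Set<AudioRole>(["effect"]); // user prefs

    function playIfAllowed(role: AudioRole, start: () => void) {
      if (!blockedRoles.has(role)) start();
    }

    // e.g. sprites registered as "effect" stay silent while
    // "music" sources still play.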
>
>
>> Are there issues with needing to provide a way to limit, e.g., total
>> volume when multiple tracks are layered, or is this handled by audio
>> equipment? We wouldn't want combinatorial effects to create excessively
>> loud spots. Consider users who have audio volume higher than usual
>> because of hearing impairment, but still we can't allow eardrum-damaging
>> levels to come out.
>>
>> Need a requirement to provide ways to avoid triggering audio-sensitive
>> epileptic seizures. The fact that sounds from a variety of sources might
>> be combined, including script-generated sounds and transformations that
>> could have unplanned artifacts, means the final sound output may be less
>> under the author's control than studio-edited sound. It is important to
>> find ways to reduce unexpected effects triggering audio-sensitive
>> epileptic seizures. To some extent this means warning authors to be
>> careful, but any features we can build into the technology, we should.
>> Unfortunately this is a new field to me and I don't know all the
>> specifics, so it will take research (which of course I volunteer to be
>> involved in, just looking for a placeholder for the issue now). A quick
>> scan online suggests that certain beat frequencies and reverberance
>> effects are known sources of problems. A set of user preferences
>> allowing users to disable or control certain Web application-generated
>> audio transformations might help with the latter issue.
>
> This one seems, on the surface, to be really challenging.
>
> I certainly acknowledge that this is an important issue, and that the
> implications are severe.
>
> I'm not sure that we can actually address this, though, or how we
> would do so.  I agree with you that if we can find someone
> knowledgeable about this, we should solicit their feedback on if there
> are ways to prevent it.  But just as JavaScript could be used to
> change background colors at a rate and combination that could cause
> seizures, I'm not certain we could control how the output of the Web
> Audio API might do something similar... it's a general-purpose piece
> of functionality.
Yes, I don't come to you with solutions, just a flag. It may be that
there is nothing to do at the API level on this, but I'm not ready to
assume that just yet. I'm quite sure part of the solution lies with
implementing user agents and content authors, and I think flags about
this need to be carried up the chain. (The same comment applies to a
number of my other points, so I won't repeat it.)
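
One more note on the earlier question about combined volume: if the
dynamics compressor node in the current draft survives, authors may
already have a partial answer, and the requirement could be phrased as
keeping something like this possible. A rough sketch (parameter names
per my reading of the draft):

    // A compressor used as a crude limiter between the master gain
    // and the destination caps combined loudness of layered tracks.
    const limiter = ctx.createDynamicsCompressor();
    limiter.threshold.value = -6;  // dB; clamp peaks above this
    limiter.ratio.value = 20;      // high ratio approximates limiting
    master.disconnect();
    master.connect(limiter);
    limiter.connect(ctx.destination);

That protects against eardrum-damaging combinatorial spikes without
forcing users with hearing impairments to lower their overall volume.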
>
>
>> Need a requirement that audio from the Web application not interfere
>> with system sounds (e.g., alerts), which may contain vital information
>> for the user. While it's probably not desirable to pause Web application
>> audio for system sound events, it's also not desirable to have system
>> sounds drowned out by Web application audio. User preferences may be
>> needed to indicate specifically how to handle the situation, but a way
>> for Web application audio to be aware of system sounds will be needed.
>
> Again, this would be a requirement on the UA, not on the Web Audio
> API.  A loud song or video in HTML could just as easily drown out
> system sounds, so this is a requirement at a different level of
> implementation than this API.
>
>
>> Operating systems have a feature called "ShowSounds", which triggers a
>> visual indication that an important sound like an alert has occurred.
>> Enabling certain types of sounds, like audio sprites, to take advantage of
>> this feature may be important. I expect someone else to provide more
>> details on this requirement but wanted to put a placeholder in this
>> message.
>
> On first reading, this seems like something the Web Notifications WG
> should be addressing. Or if you are suggesting a browser-based analog
> of this functionality, that should be a requirement at the content
> level, not the Web Audio API level.
This will need more input from other PFWG people. In addition to the
questions of connection with system audio APIs above, this is something
that might come from accessibility APIs rather than system audio APIs. I
just don't know, myself. Whether a feature is needed at the Web Audio
API layer, I also don't know. I suspect some sort of feature
is needed to inform a Web application that it shouldn't or can't play
audio at the moment for a particular reason, and to alert it when that
situation changes.
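
If it helps, the feature I am imagining is something like a UA-fired
event -- invented names again, shape only:

    // Hypothetical: the UA tells the page that audio is currently
    // unwelcome (ShowSounds on, system alert playing, ...) and again
    // when the restriction lifts.
    declare function pauseApplicationAudio(): void;
    declare function resumeApplicationAudio(): void;

    window.addEventListener("audiopolicychange", (e: any) => {
      if (e.canPlay) {
        resumeApplicationAudio();
      } else {
        pauseApplicationAudio();  // perhaps show visuals instead
      }
    });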

Michael
>
> Regards-
> -Doug
>

-- 

Michael Cooper
Web Accessibility Specialist
World Wide Web Consortium, Web Accessibility Initiative
E-mail cooper@w3.org
Information Page <http://www.w3.org/People/cooper/>
