- From: David MacDonald <david100@sympatico.ca>
- Date: Tue, 6 Oct 2015 12:27:12 -0400
- To: "public-mobile-a11y-tf@w3.org" <public-mobile-a11y-tf@w3.org>
- Message-ID: <CAAdDpDbjT_Nt6rTCWQ-vx7CBtv=kYZLrPdXRET5zDcV8Au1Uyw@mail.gmail.com>
Hi Team,

Please consider this request by Janina, the chair of the Protocols Working Group, regarding her experience as a blind person on an iPhone. It is really a user agent issue, but we may want to take up this cause on the user agent side of things.

---------- Forwarded message ----------
From: Janina Sajka <janina@rednote.net>
Date: Mon, Oct 5, 2015 at 8:10 PM
Subject: Screen Reader Audio on Mobile
To: David MacDonald <david100@sympatico.ca>

Hi, David:

In a telephone conversation some weeks ago, you asked that I provide you my architectural concerns about why TTS audio on mobile should be managed as a separate audio channel. My apologies for taking so long to provide this to you!

BACKGROUND:

There are multiple sources of audio on mobile devices. These sources include incoming audio in a telephone call, music that a user might have stored in nonvolatile memory on the device, audio for movies (and other video) stored on, or streamed to, a mobile device, sounds used as alarms (or other audible markers of system activity), and synthetic Text to Speech (TTS) engines commonly used with screen readers by users who are blind.

Combining these various sources into a single audio presentation that can be heard when a phone is held to one ear, or that can play in stereo through a pair of speakers on a tablet device, is called "mixing" by audio processing professionals. When and how this mixing occurs, however, affects the reliability and performance the user will experience. It is my contention here that the screen reader user is poorly served on today's devices as a result of inadequate audio design considerations.

THE PROBLEM:

Both iOS and Android lump TTS audio into the same "audio channel" as all other audio. This unfortunate architectural design decision creates functional problems for users who rely on TTS, including:

* It's harder to manage TTS volume independently of the volume of other audio events. This is also true for other audio characteristics such as EQ, panning, etc.

* It's impossible to independently direct audio output. If the user wants movie audio to go to an external Bluetooth soundbar, she must accept that the TTS will also now be heard via those same Bluetooth speakers. This makes no sense from a functionality perspective, inasmuch as the TTS is ostensibly part of a highly interactive user interface paradigm, whereas the movie audio is simply presentational. Lag times for TTS matter a lot, but for movie audio only synchronization with the video matters.

* It's impossible for TTS events to be properly prioritized when they're lumped together with other audio events this way. Because TTS is part of a highly interactive user interface, its system priority should always remain quite high and closely correlated with on-screen touch events. This breaks down when prioritization is driven by playback of presentational audio such as music or movie/video sound tracks. One result of such inappropriate prioritization is the poor performance of DTMF after the completion of a telephone call. Both iOS and Android are very poor at this, for reasons of inappropriate system event prioritization.

THE SOLUTION:

With today's more powerful, multi-core CPUs, and with today's independent audio management subsystems, it's perfectly reasonable to request a better architecture for TTS-dependent user interfaces. The TTS used by screen readers should be managed in its own audio channel until the final mixdown for speaker/headset presentation.

Thank you, David, for conveying this concern to the Mobile A11y TF on my behalf.
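The separate-channel architecture described above can be sketched in a few lines. This is a minimal illustration of the idea only, not any real platform API: each source keeps its own gain and output routing until the final mixdown, so TTS volume and destination stay independent of presentational audio. All names (`Channel`, `mixdown`, the sink strings) are invented for this sketch.

```python
# Sketch of the proposed architecture: each audio source keeps its own
# channel (independent gain and output routing) until the final mixdown.
# All names here are illustrative, not any real mobile-platform API.

from dataclasses import dataclass, field

@dataclass
class Channel:
    name: str
    gain: float = 1.0        # per-channel volume, adjustable independently
    sink: str = "speaker"    # per-channel output routing
    samples: list = field(default_factory=list)

def mixdown(channels, sink):
    """Sum only the channels routed to the given sink, applying each
    channel's own gain at the last possible moment."""
    routed = [c for c in channels if c.sink == sink]
    if not routed:
        return []
    length = max(len(c.samples) for c in routed)
    out = [0.0] * length
    for c in routed:
        for i, s in enumerate(c.samples):
            out[i] += c.gain * s
    return out

# A screen reader user can send movie audio to a Bluetooth soundbar
# while keeping TTS on the handset speaker, at its own volume:
tts = Channel("tts", gain=0.8, sink="speaker", samples=[1.0, 1.0])
movie = Channel("movie", gain=0.5, sink="bluetooth", samples=[1.0, 1.0, 1.0])

print(mixdown([tts, movie], "speaker"))    # TTS only: [0.8, 0.8]
print(mixdown([tts, movie], "bluetooth"))  # movie only: [0.5, 0.5, 0.5]
```

Because gain and routing are properties of the channel rather than of one shared stream, adjusting TTS volume or redirecting the movie never touches the other source, which is exactly the independence the bullets above call for.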
Janina

--
Janina Sajka, Phone: +1.443.300.2200 sip:janina@asterisk.rednote.net
Email: janina@rednote.net

Linux Foundation Fellow
Executive Chair, Accessibility Workgroup: http://a11y.org

The World Wide Web Consortium (W3C), Web Accessibility Initiative (WAI)
Chair, Protocols & Formats http://www.w3.org/wai/pf
Received on Tuesday, 6 October 2015 16:27:41 UTC