COGA Gap Analysis Issue Paper - Voice Menu Systems
Description of the technology or use case
Voice menu systems and Voice XML enable voice dialog systems and voice browsers. They are used for developing audio and voice response applications, such as telephone menu systems, banking systems, and automated customer service portals. Users interact with voice browsers via the public switched telephone network (PSTN). These systems are very similar to other automated telephone menu systems, and many issues overlap.
Many crucial services depend on this technology, such as emergency notification, healthcare, prescription refilling, and others. Full accessibility therefore needs to be supported.
See http://www.w3.org/TR/voicexml20/
An example use case may be as follows:
The user may be asked "For sports press 1, For weather press 2, For Stargazer astrophysics press 3." The system then waits for a response.
The Voice XML 2.0 specification discusses accessibility for people who are hard of hearing and cites the WCAG and WAI specifications as relevant (see http://www.w3.org/TR/voicexml20/#accessibility). Beyond that, it identifies no examples or concerns for cognitive accessibility.
Challenges for People with Cognitive Disabilities
Summary:
The general population tends to have difficulty with this technology, but it can be very problematic for our user groups.
The issues are broken up into the specific categories below:
Effect of memory impairments
A good working memory is essential for using these systems. The user needs to hold multiple pieces of transitory information in mind, such as the number being presented as an option, whilst processing the terms that follow.
A good short-term memory (lasting seconds) is essential so that the user can remember the number or the term.
Without these functions the user is likely to select the wrong number.
Executive function
Effect of impaired reasoning
The user may need to compare similar options such as "billing", "accounts", and "sales" and decide which is the service that is best suited to solve the issue at hand. Without strong reasoning skills the user is likely to select the incorrect extension.
Advertisements and additional, unrequested information also increase the amount of processing required.
Effect of attention-related limitations
The user needs to focus on the different options and select the correct one. A person with impaired attention may have difficulty maintaining the necessary focus for a long or multi-level menu. Advertising and additional, unrequested information also make it harder to retain attention.
Effect of impaired language and auditory perception related functions
The user needs to interpret the correct terms and match them to their needs. This involves speech perception where the sounds of language are heard, interpreted and understood within a given time.
Effect of reduced knowledge
The user needs to be familiar with the terms used in the menu, even if they are not related to the service options required.
Proposed solutions
- It must be easy to get through to a human. We suggest that the digit "0" be reserved, across all Voice XML systems, to get through to a human.
- Some users need more time. Extra time should be a user setting, covering both the speed of speech and the time allowed for input, so that the user can specify slower speech, more input time, etc.
- Pauses are important between phrases to allow processing time of language and options.
- The text of each option should be given before the digit to select, or the instruction to select that option. This means the user does not need to remember the digit or instruction whilst processing the term.
- The digit 0 (or other value) could be a reserved number for reaching a human operator. The digit 9 (or other value) could be a reserved number for going back to the last menu (error recovery). Setting 0 or 9 (assuming that they are the reserved digits) to mean anything else would give an error.
- The user should be able to extend or disable the time out as a system default on their device.
- Error recovery should be simple and should take the user to a human operator, preferably via a reserved digit. An error response should not throw the user off the line or send them to a more complex menu.
- A standard number used to repeat the current menu or the last 10 seconds of the menu would also be helpful to these user groups.
- Timed text should be adjustable (as with all accessible media).
- Advertisements and other unrequested information should not be read out, as they can confuse the user and make it harder to retain attention.
- Terms used should be as simple as possible.
- Examples and advice should be given on how to build a prompt that reduces the cognitive load.
- Example 1: Reducing cognitive load: The prompt "press 1 for the secretary" requires the user to remember the digit 1 while interpreting the term secretary. It is less usable than the prompt "for the secretary (pause): press 1" or "for the secretary (pause) or for more help (pause): press 1".
- Example 2: Setting a default digit to reach a human operator.
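The prompt ordering in Example 1 and the reserved digits in Example 2 can be sketched as a small Python helper. This is an illustrative sketch only and is not part of Voice XML or any standard: the function name build_prompt, the choice of 0 and 9 as the reserved digits, and the literal "(pause)" marker are all assumptions made for the example.

```python
# Illustrative sketch only. The reserved digits, their meanings, and the
# "(pause)" marker are assumptions, not part of any standard.
RESERVED_DIGITS = {"0": "a human operator", "9": "the previous menu"}
PAUSE = "(pause)"


def build_prompt(options):
    """Return a spoken prompt that names each option before its digit.

    `options` maps a digit (as a string) to the term spoken for it.
    Assigning a reserved digit to an ordinary option raises ValueError.
    """
    phrases = []
    for digit, term in options.items():
        if digit in RESERVED_DIGITS:
            raise ValueError(
                f"digit {digit} is reserved for {RESERVED_DIGITS[digit]}"
            )
        # Term first, then a pause, then the digit to press.
        phrases.append(f"For {term} {PAUSE}: press {digit}.")
    # Announce the reserved digits last, in the same term-first order.
    for digit, meaning in RESERVED_DIGITS.items():
        phrases.append(f"For {meaning} {PAUSE}: press {digit}.")
    # Separate phrases with a pause to allow processing time.
    return f" {PAUSE} ".join(phrases)


menu = {"1": "sports", "2": "weather", "3": "Stargazer astrophysics"}
prompt = build_prompt(menu)
print(prompt)
```

A real system would replace the "(pause)" marker with whatever pause mechanism its platform provides, and the length of that pause would ideally follow the user's own timing preference.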
Note. The above proposed solutions have not been tested. They are not supported in the standards.
Currently Voice XML does not seem to be used to improve accessibility for people with cognitive disabilities, and in many cases seems to decrease it. No author strategies are defined to improve this situation. Until this issue is addressed, many critical functions, such as emergency and medical support, will become less and less accessible to people with cognitive disabilities.