VoiceXML 2.0 Last Call - accessibility comments

status:  These comments reflect the sense of the meeting in a review of the Last Call draft by the Protocols and Formats Working Group on behalf of the WAI.  As with any comments, these are discussable and we would welcome the chance to work with you on arriving at a resolution that works for all.

Al

Global reference:  

http://www.w3.org/TR/2002/WD-voicexml20-20020424

** CHANGE REQUESTS

1.  Text equivalents for all recorded-speech prompts should be required as a validity condition of the format.  They make the difference between a dialog where access by text telephone, for example, is readily achievable and one where it is quite difficult to achieve.

Rationale:

Text telephones are widely used by people who are Deaf or Hard of Hearing to access the services that others access by voice telephony.

The dialog design of a voice application would work in a text-telephone delivery context, so long as the dialog elements are available as text.
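
As a sketch of one way the format already carries this information, the text equivalent could travel as the content of the audio element, which the platform falls back to when the recording cannot be played (the file name and wording here are ours, not from the draft):

      <prompt>
        <audio src="prompts/welcome.wav">
          Welcome to the National Bank automated voice system.
        </audio>
      </prompt>

The request is that such a text equivalent be required, not merely permitted.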

2. Voice is the center, not the limit, of the domain of application.

[reference: Appendix H]  

The appendix paints the applicability of this technology too narrowly.  Even where the dialog designer thinks only in terms of a voice dialog, the resulting dialogs have been thought through thoroughly and can be expected to be, for example,
- highly usable as transcoded into a text-telephone delivery context
- likewise as transcoded into a Braille environment for those who are both deaf and blind.

The idea that this technology is 'final form' should be eliminated and the language brought more in line with that in section 2.1 of the latest Member draft of the Speech Recognition Grammar Specification where it touches on this point.

3.  Completeness of key-access to function.  Control of the application through key presses, by way of DTMF catches, gives people whose speech is not reliably recognized access to the application.  Can the format specification enforce complete functionality in this mode of interaction?  If it can, the group should consider requiring this.

If it cannot, we should discuss a lower minimum requirement, including orientation to alternate modes of accessing the same operational service through another channel of communication (such as giving an 800 number on your website).
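
To make the intent concrete, here is a sketch of a field that carries a DTMF grammar in parallel with its voice grammar, so the same function is reachable by key press (the grammar URIs and wording are placeholders of ours):

      <field name="mainchoice">
        <prompt>
          Say balance or transfer, or press 1 for balance, 2 for transfer.
        </prompt>
        <!-- voice grammar -->
        <grammar mode="voice" type="application/srgs+xml"
                 src="mainchoice.grxml"/>
        <!-- parallel DTMF grammar: every voice command has a key-press
             equivalent -->
        <grammar mode="dtmf" type="application/srgs+xml"
                 src="mainchoice-dtmf.grxml"/>
      </field>

The question above is whether the specification can require that such a parallel path exist for every function, rather than leave it to author discretion.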

** DESIRABLE OUTCOME CLEAR, SPECIFICATION IMPACT LESS CLEAR

The following items discuss application-design concepts which contribute to the effectiveness of VoiceXML in the delivery contexts used by people with disabilities.  Some of these should perhaps be reinforced with specification provisions.  How much of this becomes technical requirements in the specification, usage advisories in the specification, and/or guidelines published outside the specification is less clear.

4.  Always define a global ENTER user-action.

[reference: 3.1 Grammars]  

The functionality intended here is that there is something the user can do that serves as an executeImmediate or justDoIt verb, comparable to the use of the ENTER key on desktop keyboards.  This could be 'Yes, please' in an English speech catch grammar, the hash (#) key in DTMF, or whatever you like.

This mode of operation is used by people with severe motor limitations in accessing computer applications.  The same "wait and act" mode of user action that they use there is adaptive here in the same personal circumstances and for the same reasons.

http://www.abilityhub.com/switch/

Key-bindings that might make sense to standardize in this specification include this function and zero (0) for Help.
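
A minimal sketch of what this could look like, as a document- or application-scope link (the event name and spoken phrase are illustrative, not proposed bindings):

      <!-- global "just do it" action: pound key or a spoken confirmation -->
      <link dtmf="#" event="org.example.enter">
        <grammar version="1.0" root="enter" mode="voice">
          <rule id="enter">yes please</rule>
        </grammar>
      </link>

      <!-- global help on zero -->
      <link dtmf="0" event="help"/>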

5. Regular, proven and familiar dialog structures.

[reference: 3.1 Grammars]  

The above specific device is a special case of a broader issue.  This has to do with creating simple, regular navigation structures that are highly usable and leverage learning across multiple applications that use the same best practices.

Compare with

 http://www.w3.org/TR/WCAG10/#gl-facilitate-navigation

In addition to the general concept set forth in this guideline, two reference designs are worth noting; both were developed for delivery contexts that stress the mnemonic appeal of the dialog flow:

Website design for those with severe learning disabilities:

 http://www.learningdisabilities.org.uk/html/content/webdesign.cfm

Note in particular the five global functions in this dialog design.

Navigation modes for the ANSI/NISO Z39.86-2002 Digital Talking Book standard.
Start at

 http://www.loc.gov/nls/niso/

Some preliminary experience with designing VoiceXML applications with this as the general operational model has been highly encouraging.

6.  Complete safety net.

[reference:
1.3.5 Events
1.5.4 Final Processing

5.2 Event Handling
	5.2.2 Catch
	5.2.4 Catch Element Selection
	5.2.5 Default Catch Elements]

Each element in which an event can occur SHOULD specify catch
elements, including one with a fail-soft or recovery functionality.
             
      Examples:
          <catch event="noinput">
            <reprompt/>
          </catch>

          <catch event="nomatch">
            <audio>
              I am sorry. I did not understand your command.
              Please re-enter your key choice.
            </audio>
            <reprompt/>
          </catch>

          <catch event="help">
            Please say visa, mastercard, or amex.
          </catch>
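
In addition, the count attribute on catch gives a natural place to put the fail-soft behavior: after repeated failures, stop reprompting and hand the caller a way out (the form name and wording are placeholders of ours):

          <catch event="nomatch noinput" count="3">
            <prompt>
              I am still having trouble understanding.
              Let me connect you to an operator.
            </prompt>
            <goto next="#to_operator"/>
          </catch>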

7. Timeouts

[reference:
Appendix D - Timing Properties
4.1.7 Timeout]

People with disabilities sometimes need a little extra time to respond or complete an input action.  Generous time allowances should be available.  Prompt the user that the timeout is about to expire, and give an option to extend the time.

Making extra time available as an ask-for option may be the most effective way to a) keep the application accessible to those who need it without b) impairing the functionality for others through excessive delays.
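
The simpler half of this, a generous default allowance plus a reassuring noinput handler, can be sketched with existing machinery (the property value, grammar URI, and wording are illustrative):

      <property name="timeout" value="10s"/>

      <field name="amount">
        <prompt>How much would you like to transfer?</prompt>
        <grammar mode="voice" type="application/srgs+xml" src="amount.grxml"/>
        <catch event="noinput">
          <prompt>Take your time. When you are ready, say the amount.</prompt>
          <reprompt/>
        </catch>
      </field>

The ask-for-more-time option would still need dialog design, and possibly specification support, beyond this.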

8.  Human or other option.

[reference:
1.5.2 Executing a Multi-Document Application
5.3.9 EXIT]

Advertise alternate modes through which comparable service is available.  This may involve transfer to a human operator, text telephone service through transfer or re-dial, etc.  This is particularly important during final processing, as a safeguard.

   Examples:
        Specify (or allow for) a "wait" or "help" for barge-in during
        final processing, with an option to <transfer> to a human operator.

            "Goodbye" (computer)
            "Wait....wait!" (human)
            "Would you like me to repeat your new account number?" (computer)
            "Yes....I didn't get it the first time" (human)
            "The new account number established during this call for
             Mary Jane Jones is 6652281. Does this answer all your
             questions?" (computer)
            "Yes" (human)
            "If you need to speak with an operator, just say 'Operator'.
             Would you like to transfer to an operator?" (computer)
            "No" (human)
            "Thank you for using the National Bank's automated voice
             system. Goodbye." (computer)
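
A sketch of that operator escape hatch in VoiceXML terms (the event name, form name, and telephone number are placeholders of ours):

        <link event="org.example.operator">
          <grammar version="1.0" root="op" mode="voice">
            <rule id="op">operator</rule>
          </grammar>
        </link>

        <catch event="org.example.operator">
          <goto next="#to_operator"/>
        </catch>

        <form id="to_operator">
          <transfer name="opcall" dest="tel:+18005551234" bridge="true">
            <prompt>Transferring you to an operator now.</prompt>
          </transfer>
        </form>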
 
** DON'T LOSE THESE KEY FEATURES

We couldn't resist the following affirmations.  Please take these as a "Not just yes, but He** yes!" response:  These features and design aspects will prove critical in disability access situations:

9.  Layered help.

[reference:
2.3 Form Items
2.5 Links
3.1.1.3 Grammar Weight
4.1.6 Prompt Selection
5.2 Event Handling
	5.2.2 Catch
	5.2.5 Default Catch Elements
	5.2.6 Event Types
5.3 Executable Content]

This is good.  Thank you.  Get people to use it.

  Example:
     aMenu[1].items[0].helptext =
         "If this is the entry you want, please press the pound key";
     aMenu[1].items[0].morehelptext = "Press 1 to start over";
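
The help shorthand with the count attribute is another way to express the layering directly in markup (the field name, grammar URI, and wording are ours):

     <field name="payment">
       <prompt>How would you like to pay?</prompt>
       <grammar mode="voice" type="application/srgs+xml" src="payment.grxml"/>
       <!-- first request for help: a short hint -->
       <help count="1">
         Please say visa, mastercard, or amex.
       </help>
       <!-- second and later requests: a fuller explanation -->
       <help count="2">
         We accept Visa, MasterCard, and American Express.
         Please say the name of the card you want to use,
         for example, say visa.
       </help>
     </field>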

10.  Application-scope grammars. 

Best to specify application-scope form grammars in the root of a
multi-document application.

IMPORTANT: The document body, as well as the "Clarifications" section, states why this is a good idea in general.  The advice goes redoubled for accessibility.
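
A sketch of the shape we mean (the file, form, and grammar names are ours): a grammar given document scope in the root document is active across the whole application, and every leaf document joins by naming the root in its application attribute.

    root.vxml:

      <vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
        <form id="globalcommands" scope="document">
          <!-- because this form lives in the root document, its
               document-scoped grammar is active application-wide -->
          <field name="command">
            <grammar mode="voice" type="application/srgs+xml"
                     src="globalcommands.grxml"/>
          </field>
        </form>
      </vxml>

    leaf.vxml:

      <vxml version="2.0" xmlns="http://www.w3.org/2001/vxml"
            application="root.vxml">
        <form id="main">
          <block>Welcome back.</block>
        </form>
      </vxml>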
  
