voice modes used for orientation (was Re: Raw minutes of 15 June UA Guidelines)

From: Gregory J. Rosmaita <unagi69@concentric.net>
Date: Sun, 18 Jun 2000 00:17:01 -0400
Message-Id: <4.3.1.2.20000617034520.00c9ae50@127.0.0.1>
To: Harvey Bingham <hbingham@ACM.org>
Cc: User Agent Guidelines Emailing List <w3c-wai-ua@w3.org>
aloha, harvey!

in response to my minuted comments:

quote
          GR: Frequently, there are about 6 different voices used
                 for orientation.
unquote

you asked,

quote
Gregory, I'd like that list of uses included, in the note, as recognizably
useful distinctions that voice characteristics can provide.
unquote

as a general rule, screen-readers allow users to set distinct vocal 
characteristics -- which are roughly akin to the "Appearance" property 
sheet of the "Display Properties" available to users via the Control Panel 
in the Windows environment -- as an orientational mechanism that is
capable of instantly communicating to the user the context in which he or
she is working and/or the source of the synthesized speech being spouted at
him or her...

one of the main uses of these differentiation mechanisms is to distinguish 
whether the application cursor or the speech cursor is active...  the 
speech cursor provides a gross navigational mechanism that not only allows 
the user to grope about available screen space in order to reconnoiter the 
application window, but which usually also serves to move the pointing 
device's point-of-regard, which is often necessary to activate or 
deactivate an object or discrete area of the screen in the absence of a 
keyboard equivalent, or when the sub-window fails to receive focus, isn't 
keyboard focusable, or is a custom control which neither the application
nor the screen reader recognizes as a control, but simply as a graphic...

each "voice mode" contains a range of vocal characteristics (including, but
not limited to, volume, rate, person, pitch and punctuation verbosity, the
last of which can usually be further sub-divided into "All", "Most", "Some",
or "None"), in order to provide as broad a range of individual tailoring as
possible -- some users, for example, prefer to only switch genders as an 
orientation mechanism, some switch only the "cutely named" synthesized 
voice, some solely the pitch or rate, but most use a combination of the 
configuration options available to them, so as to provide as instantaneous 
an orientational mechanism as possible...

the 6 most common voice modes are:

         Global
         Keyboard (i.e. keyboard input echo vocal characteristics)
         Application Cursor
         Speech Cursor
         Messages
         Prompts

note that some screen-readers treat "Messages" (such as announcing "Page 5 
of 15" when one moves across a page boundary in a word processor) and 
"Prompts" (labels attached to controls) as a single entity, while others
offer a wider range of flexibility...
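
to make the structure concrete, here is a minimal sketch in python of how the 6 voice modes and their characteristics might be modelled... every name in it (VoiceSettings, VoiceProfile, the 0-100 value ranges, the "follow Global unless customized" rule and the merge flag) is a hypothetical chosen for illustration, not the configuration format of any actual screen reader:

```python
from dataclasses import dataclass, replace
from enum import Enum


class Punctuation(Enum):
    """Punctuation verbosity levels named in the message."""
    ALL = "All"
    MOST = "Most"
    SOME = "Some"
    NONE = "None"


@dataclass(frozen=True)
class VoiceSettings:
    # Hypothetical fields and 0-100 ranges; real products differ.
    person: str = "default"
    volume: int = 50
    rate: int = 50
    pitch: int = 50
    punctuation: Punctuation = Punctuation.SOME


MODES = ("Global", "Keyboard", "Application Cursor",
         "Speech Cursor", "Messages", "Prompts")


class VoiceProfile:
    """Per-mode voice settings; uncustomized modes follow Global (an assumption)."""

    def __init__(self, global_settings=VoiceSettings(),
                 merge_messages_and_prompts=False):
        self._settings = {"Global": global_settings}
        self._merge = merge_messages_and_prompts

    def _key(self, mode):
        if mode not in MODES:
            raise ValueError(f"unknown voice mode: {mode!r}")
        # Some screen readers treat Messages and Prompts as one entity.
        if self._merge and mode == "Prompts":
            return "Messages"
        return mode

    def customize(self, mode, **changes):
        """Override selected characteristics for one mode."""
        key = self._key(mode)
        self._settings[key] = replace(self.settings_for(key), **changes)

    def settings_for(self, mode):
        """Return the effective settings for a mode."""
        return self._settings.get(self._key(mode), self._settings["Global"])


profile = VoiceProfile()
profile.customize("Speech Cursor", person="female", pitch=70)
profile.customize("Keyboard", rate=80)
```

a user who switches only the pitch, or only the person, simply passes fewer keyword arguments to customize()... constructing the profile with merge_messages_and_prompts=True models the screen readers that treat "Messages" and "Prompts" as a single entity.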

during the teleconference, i mentioned another vocal characteristic, 
Uppercase Indication, which, while (usually) not a discrete voice mode, is 
a voice characteristic which is often grouped with the voice modes listed 
above...  some synthesizers indicate uppercase only through an incremental
change in pitch, others issue earcons (usually in the form of a tone for a
capital letter or a double tone to indicate all caps), or say "cap" or
"all caps", or some combination of the 3...
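
the 3 uppercase indicators can be sketched as follows (python again)... this is an illustration under stated assumptions, not how any particular synthesizer works: the CapsStyle names, the event tuples, the pitch step of 10, and the rule that "all caps" means a token with more than one letter, all of them uppercase, are invented for the example:

```python
from enum import Enum


class CapsStyle(Enum):
    PITCH = "pitch"    # raise the pitch for the capitalized token
    EARCON = "earcon"  # one tone for a capital, a double tone for all caps
    SPOKEN = "spoken"  # say "cap" or "all caps" before the token


def announce(token, styles, pitch_step=10):
    """Return the list of (kind, value) speech events for one token.

    The event format is hypothetical; it only shows how the three
    indicators can be combined.
    """
    letters = [c for c in token if c.isalpha()]
    all_caps = len(letters) > 1 and all(c.isupper() for c in letters)
    initial_cap = not all_caps and token[:1].isupper()

    events = []
    if all_caps or initial_cap:
        if CapsStyle.EARCON in styles:
            events.append(("tone", 2 if all_caps else 1))
        if CapsStyle.SPOKEN in styles:
            events.append(("say", "all caps" if all_caps else "cap"))
        if CapsStyle.PITCH in styles:
            events.append(("pitch", pitch_step))
    events.append(("speak", token))
    return events


events = announce("NASA", {CapsStyle.EARCON, CapsStyle.SPOKEN})
```

passing a set of styles rather than a single one reflects the "some combination of the 3" case; an empty set leaves the token spoken with no uppercase indication at all.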

note: the information contained in this emessage is generalized from my 
personal and professional experience with screen readers, primarily in the 
Windows and DOS environments, although i did double-check my facts using 
the 4 Windows and 5 DOS-based screen readers which i have loaded on my 
laptop...  i have also been fortunate enough to use both emacspeak in 
real-life situations and to play around a bit with outSpoken on a 
mac...  while the outSpoken approach is similar to that employed by
Windows-based screen readers, both emacspeak and aster employ spatial 
effects as orientational vocal characteristics, whereas most other speech 
synthesizers which do support spatial effects do so mostly for novelty's 
sake (one offering a female voice "in a hall", "in space" and "in an 
auditorium")...

gregory
-------------------------------------------------------------------
ACCOUNTABILITY, n.  The mother of caution.
                         -- Ambrose Bierce, _The Devil's Dictionary_
-------------------------------------------------------------------
Gregory J. Rosmaita      <unagi69@concentric.net>
Camera Obscura           <http://www.hicom.net/~oedipus/index.html>
VICUG NYC                <http://www.hicom.net/~oedipus/vicug/>
Read 'Em & Speak         <http://www.hicom.net/~oedipus/books/>
-------------------------------------------------------------------
Received on Sunday, 18 June 2000 00:30:41 GMT