
maximum & minimum speech rates for software synthesizers

From: Gregory J. Rosmaita <unagi69@concentric.net>
Date: Thu, 15 Jun 2000 17:37:14 -0400
To: User Agent Guidelines Emailing List <w3c-wai-ua@w3.org>
Cc: Janina Sajka <janina@afb.net>, Peter Verhoeven <pav@oce.nl>, kerscher@mail.montana.com
i propose that we adopt the average maximum speech rate of four of the most 
commonly used software synthesizers, listed below, as the minimal 
requirement for the upper range of the synthesized speech rate:

Eloquence for JAWS (Eloquent Technologies)
         slowest: 100 words per minute (can increase by 1 wpm increments or 
by 70 wpm increments)
         fastest: 800 words per minute

ViaVoice Outloud (IBM)
         slowest: 70 words per minute
         fastest: 700 words per minute (increases in 10 wpm increments)

Orpheus (Dolphin Access)
         slowest: 10 words per minute
         fastest: 700 words per minute

Microsoft Speech Engine (Microsoft)
         slowest: 100 words per minute
         fastest: 510 words per minute

the average works out to be 677.5 words per minute, which is akin to having 
2.1 children, so i propose that we adopt 700 words per minute as the 
minimal upper range for synthesized speech
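
for anyone who wants to check the arithmetic, or re-run it once other 
engines are added to the list, here is a minimal sketch in Python.  the 
maximum rates are the four given above; the function names and the choice 
to round the average up to the next multiple of 100 are assumptions made 
for illustration, not part of any synthesizer's interface or of the 
guidelines themselves.

```python
import math

# Maximum speech rates in words per minute, as listed in this message.
MAX_WPM = {
    "Eloquence for JAWS": 800,
    "ViaVoice Outloud": 700,
    "Orpheus": 700,
    "Microsoft Speech Engine": 510,
}


def average_max_rate(rates):
    """Return the arithmetic mean of the maximum rates."""
    return sum(rates.values()) / len(rates)


def round_up_to_step(value, step=100):
    """Round value up to the next multiple of step.

    Assumption: a step of 100 wpm is used only to turn the raw
    average into a round figure; the message does not state a rule.
    """
    return math.ceil(value / step) * step


average = average_max_rate(MAX_WPM)
proposed = round_up_to_step(average)

# Engines from the list whose stated maximum reaches the proposed figure.
meets_proposal = [name for name, wpm in MAX_WPM.items() if wpm >= proposed]

print(f"average maximum: {average} wpm")
print(f"proposed minimal upper range: {proposed} wpm")
print(f"engines reaching it: {', '.join(meets_proposal)}")
```

adding another engine is a matter of adding one entry to MAX_WPM; the 
average and the rounded figure follow from it.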

Note: control over pitch has a significant effect on the intelligibility of 
faster speech rates; pronunciation rule bases play a role, as well -- 
because British English has more clipped pronunciation than does American 
English (at least as synthesized by software speech engines) i am able to 
listen to content at a far faster rate with a British English rule base 
than with an American English rule base when using a software 
synthesizer.  ultra-fast American English is best accomplished (at least, 
to my ears) by hardware synthesizers...

most people who regularly interact with computers in an exclusively aural 
manner tend to do so at accelerated rates, at least 425 to 575 words per 
minute in most situations, although many use discrete settings so that 
prompts, menus, and the keyboard echo are spoken at the maximum rate 
allowable...  of course, the nature of the content being reviewed also 
plays into how fast one sets one's synthesizer -- when i listen to the 
newspaper, i tend to go as high as 650 (or 675, if i'm just taking a 
cursory listen to the sports page!)  on the other hand, if i'm tired and 
listening to source code with punctuation turned on, i might dip down as 
low as 275, but listening to slower rates is really draining -- it's the 
aural equivalent of trying to read a marquee that is scrolling a syllable 
or a character at a time... a lot of speech users also set (or are able to 
set, due to the speech engine they are using) a discrete rate for the 
"edit" area, so that, for example, in a word processor, when you listen 
to your document, it is read at a reasonable rate, perhaps somewhat 
slower than average, perhaps significantly so

i'd be glad to factor into the equation the ranges of other commonly used 
software speech engines, particularly those available for linux, and adjust 
the minimal requirement accordingly...

i'd also be interested in the range of rates available to users of screen 
magnification programs that provide native support for supplemental speech, 
as well as the rates available for supplemental speech synthesis targeted 
at dyslexics who process information most efficaciously when it is 
presented aurally, which is something RFB&D might be able to assist us with...


ACCOUNTABILITY, n.  The mother of caution.
                         -- Ambrose Bierce, _The Devil's Dictionary_
Gregory J. Rosmaita      <unagi69@concentric.net>
Camera Obscura           <http://www.hicom.net/~oedipus/index.html>
VICUG NYC                <http://www.hicom.net/~oedipus/vicug/>
Read 'Em & Speak         <http://www.hicom.net/~oedipus/books/>
Received on Thursday, 15 June 2000 17:51:05 UTC
