- From: Gregory J. Rosmaita <unagi69@concentric.net>
- Date: Thu, 15 Jun 2000 17:37:14 -0400
- To: User Agent Guidelines Emailing List <w3c-wai-ua@w3.org>
- Cc: Janina Sajka <janina@afb.net>, Peter Verhoeven <pav@oce.nl>, kerscher@mail.montana.com
i propose that we adopt the average maximum speech rate of four of the most
commonly used software synthesizers, listed below, as the minimum
requirement for the upper limit of the synthesized speech rate range:

Eloquence for JAWS (Eloquent Technologies)
   slowest: 100 words per minute (can increase in 1 wpm increments or
            in 70 wpm increments)
   fastest: 800 words per minute

ViaVoice Outloud (IBM)
   slowest: 70 words per minute
   fastest: 700 words per minute (increases in 10 wpm increments)

Orpheus (Dolphin Access)
   slowest: 10 words per minute
   fastest: 700 words per minute

Microsoft Speech Engine (Microsoft)
   slowest: 100 words per minute
   fastest: 510 words per minute

the average of the four maximum rates works out to be 677.5 words per
minute, which is akin to having 2.1 children, so i propose that we adopt
700 words per minute as the minimum upper limit for synthesized speech
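
as a quick check of the arithmetic above, here is a minimal python sketch
that averages the four maximum rates listed in this message. the rounding
step (up to the next multiple of 100 wpm) is only an assumption about how
677.5 becomes 700; the variable names are illustrative, not part of any
synthesizer's interface.

```python
import math

# Maximum rates (words per minute) as listed in this message.
max_rates_wpm = {
    "Eloquence for JAWS": 800,
    "ViaVoice Outloud": 700,
    "Orpheus": 700,
    "Microsoft Speech Engine": 510,
}

# Arithmetic mean of the four maximum rates.
average_max = sum(max_rates_wpm.values()) / len(max_rates_wpm)

# Assumption: round up to the next multiple of 100 wpm to get a round figure.
ROUND_TO = 100
proposed_minimum_upper_rate = math.ceil(average_max / ROUND_TO) * ROUND_TO

print(average_max)                  # 677.5
print(proposed_minimum_upper_rate)  # 700
```

adding another engine to the dictionary recomputes both figures, which is
how the proposed figure could be adjusted if further synthesizers are
factored in.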
Note: control over pitch has a significant effect on the intelligibility of
faster speech rates; pronunciation rule bases play a role, as well --
because British English has a more clipped pronunciation than American
English does (at least as synthesized by software speech engines), i am able
to listen to content at a far faster rate with a British English rule base
than with an American English rule base when using a software
synthesizer. ultra-fast American English is best accomplished (at least,
to my ears) by hardware synthesizers...
most people who regularly interact with computers in an exclusively aural
manner tend to do so at accelerated rates, at least 425 to 575 words per
minute in most situations, although many use discrete settings so that
prompts, menus, and the keyboard echo are spoken at the maximum rate
allowed... of course, the nature of the content being reviewed also
plays into how fast one sets one's synthesizer -- when i listen to the
newspaper, i tend to go as high as 650 (or 675, if i'm just taking a
cursory listen to the sports page!) on the other hand, if i'm tired and
listening to source code with punctuation turned on, i might dip down as
low as 275, but listening to slower rates is really draining -- it's the
aural equivalent of trying to read a marquee that is scrolling a syllable
or a character at a time... a lot of speech users also set (or are able to
set, due to the speech engine they are using) a discrete rate for the
"edit" area, so that, for example, in a word processor, when you listened
to your document, it would be read at a reasonable rate, perhaps somewhat
slower than average, perhaps significantly so
i'd be glad to factor into the equation the ranges of other commonly used
software speech engines, particularly those available for linux, and adjust
the minimal requirement accordingly...
i'd also be interested in the range of rates available to users of screen
magnification programs that provide native support for supplemental speech,
as well as the rates available for supplemental speech synthesis targeted
at dyslexics who process information most efficaciously when it is
presented aurally, which is something RFB&D might be able to assist us with...
gregory.
-------------------------------------------------------------------
ACCOUNTABILITY, n. The mother of caution.
-- Ambrose Bierce, _The Devil's Dictionary_
-------------------------------------------------------------------
Gregory J. Rosmaita <unagi69@concentric.net>
Camera Obscura <http://www.hicom.net/~oedipus/index.html>
VICUG NYC <http://www.hicom.net/~oedipus/vicug/>
Read 'Em & Speak <http://www.hicom.net/~oedipus/books/>
-------------------------------------------------------------------
Received on Thursday, 15 June 2000 17:51:05 UTC