- From: Gregory J. Rosmaita <unagi69@concentric.net>
- Date: Thu, 15 Jun 2000 17:37:14 -0400
- To: User Agent Guidelines Emailing List <w3c-wai-ua@w3.org>
- Cc: Janina Sajka <janina@afb.net>, Peter Verhoeven <pav@oce.nl>, kerscher@mail.montana.com
i propose that we adopt the average maximum speech rate of four of the most commonly used software synthesizers, listed below, as the minimal requirement for the upper end of the synthesized speech rate range

Eloquence for JAWS (Eloquent Technologies)
   slowest: 100 words per minute
   fastest: 800 words per minute
   (can increase by 1 wpm increments or by 70 wpm increments)

ViaVoice Outloud (IBM)
   slowest: 70 words per minute
   fastest: 700 words per minute
   (increases in 10 wpm increments)

Orpheus (Dolphin Access)
   slowest: 10 words per minute
   fastest: 700 words per minute

Microsoft Speech Engine (Microsoft)
   slowest: 100 words per minute
   fastest: 510 words per minute

the average of the four maximum rates works out to be 677.5 words per minute, which is akin to having 2.1 children, so i propose that we adopt 700 words per minute as the minimal upper range for synthesized speech

Note: control over pitch has a significant effect on the intelligibility of faster speech rates; pronunciation rule bases play a role, as well -- because British English has a more clipped pronunciation than American English (at least as synthesized by software speech engines), i am able to listen to content at a far faster rate with a British English rule base than with an American English rule base when using a software synthesizer. ultra-fast American English is best accomplished (at least, to my ears) by hardware synthesizers...

most people who regularly interact with computers in an exclusively aural manner tend to do so at accelerated rates, at least 425 to 575 words per minute in most situations, although many use discrete settings so that prompts, menus, and the keyboard echo are spoken at the maximum rate allowable...

of course, the nature of the content being reviewed also plays into how fast one sets one's synthesizer -- when i listen to the newspaper, i tend to go as high as 650 (or 675, if i'm just taking a cursory listen to the sports page!)
on the other hand, if i'm tired and listening to source code with punctuation turned on, i might dip down as low as 275, but listening to slower rates is really draining -- it's the aural equivalent of trying to read a marquee that is scrolling a syllable or a character at a time...

a lot of speech users also set (or are able to set, depending on the speech engine they are using) a discrete rate for the "edit" area, so that, for example, in a word processor, when you listen to your document, it is read at a reasonable rate, perhaps somewhat slower than average, perhaps significantly so

i'd be glad to factor into the equation the ranges of other commonly used software speech engines, particularly those available for linux, and adjust the minimal requirement accordingly... i'd also be interested in the range of rates available to users of screen magnification programs that provide native support for supplemental speech, as well as the rates available for supplemental speech synthesis targeted at dyslexics who process information most efficaciously when it is presented aurally, which is something RFB&D might be able to assist us with...

gregory.

-------------------------------------------------------------------
ACCOUNTABILITY, n. The mother of caution.
   -- Ambrose Bierce, _The Devil's Dictionary_
-------------------------------------------------------------------
Gregory J. Rosmaita <unagi69@concentric.net>
Camera Obscura <http://www.hicom.net/~oedipus/index.html>
VICUG NYC <http://www.hicom.net/~oedipus/vicug/>
Read 'Em & Speak <http://www.hicom.net/~oedipus/books/>
-------------------------------------------------------------------
Received on Thursday, 15 June 2000 17:51:05 UTC
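
The arithmetic behind the proposed 700 wpm figure can be checked with a short sketch. It uses only the four maximum rates quoted in the message. The helper names and the choice to round up to the next multiple of 100 are assumptions made for illustration; the message itself only says that 677.5 is rounded to 700.

```python
import math

# Maximum speech rates (words per minute) as quoted in the message.
MAX_RATES_WPM = {
    "Eloquence for JAWS": 800,
    "ViaVoice Outloud": 700,
    "Orpheus": 700,
    "Microsoft Speech Engine": 510,
}


def average_max_rate(rates):
    """Return the arithmetic mean of the maximum rates."""
    return sum(rates.values()) / len(rates)


def round_up_to_step(value, step=100):
    """Round value up to the next multiple of step.

    The step of 100 is an assumption; the message does not state a rounding rule.
    """
    return math.ceil(value / step) * step


average = average_max_rate(MAX_RATES_WPM)
proposed = round_up_to_step(average)
print(f"average maximum rate: {average} wpm")
print(f"proposed minimal upper range: {proposed} wpm")
```

Adding another engine's maximum rate to the dictionary re-derives the figure, which matches the message's offer to factor in further speech engines and adjust the requirement accordingly.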