
Reading at variable speeds, NISO/DAISY

From: Harvey Bingham <hbingham@acm.org>
Date: Thu, 30 Nov 2000 00:36:49 -0500
To: w3c-wai-ua@w3.org
Extract from the joint NISO/DAISY Proposed Standard.

Digital Talking Books Standards Committee
Navigation Features List
NISO Digital Talking Book Standard on Navigation


Draft 4, December 29, 1999

[HB Comments.]

3. Reading at Variable Speeds

It should be possible to read the digital talking book at speeds that are 
faster than or slower than the normal listening rate. This variable speed 
feature is necessary to enable playback at a speed that is comfortable and 
efficient for a wide range of readers. Three times the normal "real-time" 
rate should be possible, and the slowest speed should be around 1/3 the 
real-time reading rate.

[HB This does not state what the normal rate is. UAAG 4.12 only places it
between 120 and 400 words per minute. My guess is about 180 words per minute.
Nor does this address how a user should be able to adjust that rate
dynamically.]
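
As a rough illustration of the range the draft asks for, the sketch below
computes the slowest and fastest rates from an assumed normal rate. The
180 words-per-minute figure is only the guess given in the comment above, not
a value from the draft, and the function names are hypothetical.

```python
# Illustrative sketch only: 180 wpm is a guessed normal rate,
# not a value fixed by the draft standard.
ASSUMED_NORMAL_WPM = 180.0


def speed_range_wpm(normal_wpm=ASSUMED_NORMAL_WPM):
    """Return (slowest, fastest): one third of and three times the normal rate."""
    return normal_wpm / 3.0, normal_wpm * 3.0


def clamp_rate_wpm(requested_wpm, normal_wpm=ASSUMED_NORMAL_WPM):
    """Limit a user-requested rate to the range the draft says should be possible."""
    slowest, fastest = speed_range_wpm(normal_wpm)
    return max(slowest, min(fastest, requested_wpm))


if __name__ == "__main__":
    slowest, fastest = speed_range_wpm()
    print(f"{slowest:.0f} to {fastest:.0f} words per minute")
```

Under that assumed normal rate, the range runs from 60 to 540 words per
minute. A player that lets the user step the rate up or down could pass each
new value through clamp_rate_wpm so it never leaves that range.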

The device should offer the user the option of "Time-Scale Modification" 
(TSM), that is, the capability to maintain constant pitch while the 
playback speed is varied. This feature should be optional, however, so that 
the user can choose to have the pitch change as the playback speed changes. 
The TSM system should not produce audible chopping, burble, or 
reverberation and should not skip over significant units of sound at high 
playback speeds.
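
The difference between the two modes can be sketched numerically. When a
recording is simply played faster or slower, every frequency in it scales by
the speed factor; an idealized TSM leaves the pitch where it was recorded.
The function name and the 120 Hz example fundamental below are assumptions for
illustration, and a real TSM implementation only approximates the ideal case.

```python
def playback_pitch_hz(recorded_f0_hz, speed, tsm_enabled):
    """Fundamental frequency heard at a given playback speed.

    speed is a multiple of real time (3.0 = three times faster).
    With TSM the pitch is held constant (idealized); without it,
    the pitch scales in proportion to the playback speed.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    if tsm_enabled:
        return recorded_f0_hz
    return recorded_f0_hz * speed


if __name__ == "__main__":
    f0 = 120.0  # assumed example fundamental, in Hz
    for speed in (1.0 / 3.0, 1.0, 3.0):
        with_tsm = playback_pitch_hz(f0, speed, tsm_enabled=True)
        without_tsm = playback_pitch_hz(f0, speed, tsm_enabled=False)
        print(f"speed {speed:.2f}x: TSM {with_tsm:.0f} Hz, no TSM {without_tsm:.0f} Hz")
```

At three times real time, the assumed 120 Hz voice would be heard at 360 Hz
without TSM, which is why the draft makes constant pitch available while
leaving the choice to the user.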

[HB UAAG places no requirement on TSM, that is, on maintaining a constant
fundamental voice pitch (while still allowing inflection). The texts above
seem appropriate for notes, properly linked. There is no accommodation for
how the usable TSM rate varies with fundamental voice pitch: a female voice
already has a higher frequency, so it cannot be sped up as much without loss
of the plosives that depend on those high frequencies, particularly for
someone like me with high-frequency hearing loss. A male voice takes longer
to get such plosives started, so sampling them may miss or otherwise distort
those plosives, even though they should fall within the high-frequency
cut-off of the hearing loss. I'm uncertain of these assertions, intuited
from physics.]

[HB: Mickey and Gregory, can you clarify please? Do you favor a male voice for
normal narrative, and use a female voice for injections of non-running text?
Do you choose a default rate depending on the gender (as a surrogate for the
fundamental frequency of the voice)?]

Regards/Harvey Bingham
Received on Thursday, 30 November 2000 01:52:45 UTC
