Re: Seeking speech synthesizer min/max capabilities

National Information Standards Organization
Digital Talking Book Standards Committee
Document Navigation Features List
Status of this Document: This document is in draft status. Please send any 
comments to Michael Moodie at mmoo@loc.gov.
Draft 4 -- December 29, 1999

Extract pertinent to our recent discussions.

http://www.loc.gov/nls/niso/navigation.htm

1. Basic Navigation
Many of the navigation features which should be available in a digital 
talking book of the advanced variety will of necessity correspond to the 
navigation features available in today's personal computers. Blind people 
who are sophisticated users of screen access technology, word processors, 
or book reading software have already been exposed to many of the 
navigation features discussed here. Moreover, for purposes of discussion, 
it is assumed that users of the advanced digital talking book text 
navigation features possess a high degree of technological sophistication.

1.1 Basic Movement Through Text
The advanced digital talking book should provide the ability for the user 
to move through text one character, word, line, sentence, paragraph, or 
page (corresponding to the printed page, if present) at a time. In 
addition, the user should be able to jump to a specific page in the book 
(e.g., go to print page 55) and any specific line or paragraph on that page.
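
As an illustration of unit-by-unit movement, here is a minimal Python sketch. It assumes the player already holds a sorted list of start offsets for each unit (character, word, sentence, page, and so on). The function name `move`, the `starts` table, and the sample offsets are hypothetical and not part of the draft.

```python
import bisect
from typing import Dict, List


def move(starts: Dict[str, List[int]], pos: int, unit: str, forward: bool = True) -> int:
    """Return the start offset of the next or previous unit relative to pos.

    If there is no further unit in the requested direction, pos is returned
    unchanged.
    """
    offsets = starts[unit]  # assumed sorted in ascending order
    if forward:
        i = bisect.bisect_right(offsets, pos)
        return offsets[i] if i < len(offsets) else pos
    i = bisect.bisect_left(offsets, pos) - 1
    return offsets[i] if i >= 0 else pos


# Hypothetical index for a short text: offsets where each unit begins.
STARTS = {
    "word": [0, 4, 10, 15],
    "sentence": [0, 10],
    "page": [0],
}

next_word = move(STARTS, 4, "word")                  # from the word at 4 to the word at 10
prev_word = move(STARTS, 4, "word", forward=False)   # back to the word at 0
```

A "go to print page 55" command would be a direct lookup in the page list rather than a relative step.
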
The user should be able to read the entire publication--from beginning to 
end--without having to jump up and down a hierarchical tree structure 
(e.g., moving in and out of the Table of Contents to go to the next chapter).
Another basic movement function that needs to be provided is movement by 
time. The user should be able to move back and forth through the book using 
either a small (e.g., ten seconds) or a large (e.g., ten minutes) time 
slice specified by the user.
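
Time-based movement might be sketched as follows in Python. The `Playhead` class and the two slice constants are illustrative names only. The ten-second and ten-minute values are the draft's own examples, and clamping at the start and end of the book is an assumption about desirable behavior.

```python
from dataclasses import dataclass


@dataclass
class Playhead:
    """Playback position in seconds within a book of known duration."""
    position: float
    duration: float

    def skip(self, seconds: float) -> float:
        """Move by a signed time slice, clamped to the bounds of the book."""
        self.position = min(max(self.position + seconds, 0.0), self.duration)
        return self.position


SMALL_SLICE = 10.0       # ten seconds, the draft's small example
LARGE_SLICE = 10 * 60.0  # ten minutes, the draft's large example

head = Playhead(position=95.0, duration=3600.0)
head.skip(-SMALL_SLICE)      # back ten seconds
head.skip(LARGE_SLICE)       # forward ten minutes
head.skip(-2 * LARGE_SLICE)  # would pass the beginning, so it stops at 0.0
```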

1.2 More Sophisticated Movement
The user needs to have the ability to "jump" to specific chapters, 
sections, headings, and other segments of the digital talking book. For 
example, there should be functions such as "Go to next chapter," "Go to 
next subheading," "Go to next section," "Go to Chapter 5, Section 1," etc. 
This feature may be linked to a hierarchical, collapsible "Navigation 
Control Center" (discussed later), but the user should also be able to 
jump directly to a specific part of the book if its number or title is 
already known.
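
The two kinds of jump described here, relative ("Go to next chapter") and direct ("Go to Chapter 5, Section 1"), can be sketched in Python over a flat list of headings. The `Heading` record, its level numbering, and the sample `BOOK` are assumptions made for illustration; the draft does not prescribe a data model.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Heading:
    level: int               # assumed: 1 = chapter, 2 = section, 3 = subheading
    number: Tuple[int, ...]  # e.g. (5, 1) for Chapter 5, Section 1
    title: str
    offset: int              # position in the text, in arbitrary units


def next_heading(headings: List[Heading], pos: int, level: int) -> Optional[Heading]:
    """Return the first heading of the given level located after pos."""
    for h in headings:  # headings are assumed sorted by offset
        if h.offset > pos and h.level == level:
            return h
    return None


def find_by_number(headings: List[Heading], number: Tuple[int, ...]) -> Optional[Heading]:
    """Direct jump: look a heading up by its number without walking the tree."""
    for h in headings:
        if h.number == number:
            return h
    return None


# Hypothetical book outline.
BOOK = [
    Heading(1, (4,), "Chapter 4", 0),
    Heading(2, (4, 1), "Section 4.1", 120),
    Heading(1, (5,), "Chapter 5", 900),
    Heading(2, (5, 1), "Section 5.1", 950),
]

target = find_by_number(BOOK, (5, 1))       # "Go to Chapter 5, Section 1"
following = next_heading(BOOK, 130, level=1)  # "Go to next chapter" from offset 130
```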

2. Fast Forward and Fast Reverse
It would be useful to have a simple tape-recorder-type navigation feature 
(cue and review function). For example, there could be a slider-like 
control or push buttons that would allow the user to fast-forward or 
fast-reverse through the book at a high speed. As the text is traversed, 
speech could be generated at a high speed using some form of time-scale 
modification. From this, readers can learn much about the structure of the 
text that is passing. For example, lists can be detected as a series of 
short, staccato bursts. Paragraphs, chapter headings, etc. could be 
indicated by strategically generated tones. Thus, an individual could just zip forward 
or backward through the book rather than typing commands to accomplish the 
same tasks. For some individuals, this interface would be much simpler and 
easier to use. It might also be much more useful in a document that is long 
and does not have particularly good titling or sectioning.
An alternative method of allowing the user to skim a document would be to 
have the playback device read the types of text elements that are passed. 
For example, the user might hear, "part, chapter, section, paragraph, 
paragraph,..., section, paragraph, paragraph,..., table, paragraph, 
paragraph,..., sidebar, etc."
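
The spoken skim described above can be sketched in Python as a function that announces only the type of each element passed. The `(type, text)` tuple model, the function name `skim`, and the sample document are hypothetical.

```python
from typing import List, Tuple

Element = Tuple[str, str]  # (element type, text content); an assumed model


def skim(elements: List[Element], start: int = 0, reverse: bool = False) -> str:
    """Return the element types passed while skimming from start, as one spoken line."""
    if reverse:
        passed = elements[start::-1]  # from start back to the first element
    else:
        passed = elements[start:]
    return ", ".join(kind for kind, _ in passed)


# Hypothetical document; the text bodies are irrelevant to the skim.
DOC = [
    ("part", "..."),
    ("chapter", "..."),
    ("section", "..."),
    ("paragraph", "..."),
    ("paragraph", "..."),
    ("table", "..."),
]

forward_line = skim(DOC)
backward_line = skim(DOC, start=2, reverse=True)
```
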
It is recommended that the fast forward and reverse feature allow the book 
to be traversed at anywhere from 10 to 25 times the normal, real-time 
reading speed.
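
To give a feel for what the recommended range means in practice, the following Python sketch computes how long it takes to traverse a book at a given multiple of real-time speed. The ten-hour book is an assumed example and the function name is illustrative.

```python
def traversal_minutes(book_hours: float, speed_factor: float) -> float:
    """Minutes needed to traverse a book of book_hours at speed_factor times real time."""
    if speed_factor <= 0:
        raise ValueError("speed_factor must be positive")
    return book_hours * 60.0 / speed_factor


# Assumed example: a book that takes ten hours to read at normal speed.
slow_end = traversal_minutes(10, 10)  # 60.0 minutes at 10 times real time
fast_end = traversal_minutes(10, 25)  # 24.0 minutes at 25 times real time
```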

3. Reading at Variable Speeds
It should be possible to read the digital talking book at speeds that are 
faster than or slower than the normal listening rate. This variable speed 
feature is necessary to enable playback at a speed that is comfortable and 
efficient for a wide range of readers. Three times the normal "real-time" 
rate should be possible, and the slowest speed should be around 1/3 the 
real-time reading rate.
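
A minimal Python sketch of the speed range follows. It treats three times and one-third of real time as hard limits, which is a stricter reading than the draft's "should be possible" and "around 1/3"; the names are illustrative only.

```python
MIN_RATE = 1 / 3  # slowest speed suggested by the draft (approximate)
MAX_RATE = 3.0    # fastest speed the draft says should be possible


def clamp_rate(requested: float) -> float:
    """Limit a requested playback rate to the supported range."""
    return min(max(requested, MIN_RATE), MAX_RATE)


def playback_seconds(recorded_seconds: float, rate: float) -> float:
    """Wall-clock time needed to play recorded_seconds of audio at the given rate."""
    return recorded_seconds / clamp_rate(rate)


one_minute_fast = playback_seconds(60.0, 3.0)  # 20.0 seconds
one_minute_slow = playback_seconds(60.0, 0.1)  # rate is clamped to 1/3, so about 180 seconds
```
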
The device should offer the user the option of "Time-Scale Modification" 
(TSM), that is, the capability to maintain constant pitch while the 
playback speed is varied. This feature should be optional, however, so that 
the user can choose to have the pitch change as the playback speed changes. 
The TSM system should not produce audible chopping, burble, or 
reverberation and should not skip over significant units of sound at high 
playback speeds.
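
The choice between TSM and plain speed change can be quantified. When audio is simply played faster or slower without TSM, every frequency is multiplied by the speed factor, which corresponds to a shift of 12 * log2(rate) semitones. The Python sketch below assumes an idealized TSM that holds pitch exactly constant; the function name is illustrative.

```python
import math


def pitch_shift_semitones(rate: float, tsm_enabled: bool) -> float:
    """Pitch change, in semitones, heard at the given playback rate.

    With (idealized) TSM the pitch is unchanged. Without it, pitch scales
    with the rate, i.e. 12 * log2(rate) semitones.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    if tsm_enabled:
        return 0.0
    return 12.0 * math.log2(rate)


double_speed = pitch_shift_semitones(2.0, tsm_enabled=False)  # 12.0, one octave up
with_tsm = pitch_shift_semitones(2.0, tsm_enabled=True)       # 0.0
```

At the draft's extremes of three times and one-third real time, the shift without TSM is roughly 19 semitones up or down.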


Regards/Harvey Bingham

Received on Friday, 7 July 2000 18:42:42 UTC