- From: Al Gilman <asgilman@access.digex.net>
- Date: Tue, 21 Oct 1997 20:15:52 -0400 (EDT)
- To: w3c-wai-ig@w3.org
Or Ben-Natan just filed this on the HC list. Give it a look if you
can respond by tomorrow.
-- Al Gilman
http://www.access.digex.net/%7Easgilman/web-access/ACSS/MSExtensions.htm
-------------------------------------------------------------------------------
Microsoft Corporation
Proposal For Aural HTML Extensions
Prepared By : Or Ben-Natan
Create Date : 6/6/97
Status : Draft
Version : 0.9
Filename : AuralHTMLProposal.doc
Copyright © by Microsoft Corporation
All Rights Reserved
Amendments
Version  Author        Date    Change
0.9      Or Ben-Natan  6/6/97  Initial version
1 Overview
2 Summary of requirements
2.1 Define user control over the rendering process
2.2 Alternate media
2.3 Navigation
2.4 Forms and Input fields
2.5 Error response
3 Proposal for additional style sheet fields
3.1 Define user control over the rendering process
3.1.1 'InterruptSpeech'
3.2 Offering Anchors and other input tags
4 Additional Attributes For Other HTML tags
4.1 Alternative content source
4.1.1 VoiceFile
4.2 Speech Recognition Grammar
4.2.1 'GRAMMAR'
5 Events
5.1 Error response
5.1.1 OnSelectionTimeout
5.1.2 OnSelectionError
1. Overview
This document presents Microsoft's comments on the aural style
sheet proposal, http://www.w3.org/Style/css/Speech/NOTE-ACSS,
together with other suggestions designed to improve the
accessibility of HTML to people with an inherent or functional
visual disability.
The aural style sheet proposal focuses on the production of sound
when text is rendered by a text-to-speech engine. While this is a
very important area of focus, we feel that it is not enough to
address the challenges confronting voice-based browsers.
Dealing only with output, the current offering falls short of real
communication between users and the browser. Users who cannot
see the text offered to them on the screen must be able to control
the manner in which content is rendered, select links, navigate
between pages, and provide input such as order entry.
While it is possible to create a voice browser today, using the
existing HTML definitions, the result will be less than optimal as
the browser designer must make many assumptions, leaving no
control to the author of the application, or forcing the author to
learn another, proprietary configuration language.
The majority of our proposal may be implemented as additions to
the aural style sheet and as additional attributes on existing
tags. No new tags are proposed. These additions allow the author
to control the manner in which users who have no visual access
to the content displayed by the browser may fully interact with
it. In addition, we propose the use of media-specific content
alternatives or, in simpler terms, the ability to provide a
voice file as an alternative to synthesized speech and the ability
to provide keyboard input definitions or spoken phrases as an
alternative to mouse selections.
2. Summary of requirements
2.1 Define user control over the rendering process
When the browser renders tags, it is important to define the level
of control the user will have over the process.
Users who frequently use the same Web site normally know everything
that is going to be said at the beginning of the page. This
information normally includes a welcome message and some
instructions on how to use the site. Invariably, users will at some
point want to skip those objects. The author of the application, on
the other hand, may want to limit this capability and require, or
at least recommend, that the user listen to the entire message, for
example because it contains new instructions or promotions. We need
a way for the author to specify whether to allow the user to skip
to the next tag.
2.2 Alternate media
While text to speech technology is improving in quality over time,
the general listener experience still leaves much to be desired.
An alternative media representation such as a voice file is
necessary if professional quality audio is to be included in the
HTML page.
2.3 Navigation
Users want to use something other than the mouse to navigate. It
is impossible for people who are blind, for example, to see what
the mouse points to and select it. They would rather touch a key
or say a word. This feature requires an HTML way of specifying the
input associated with each link. The input specified may be a
keyboard key or a phrase to be recognized by speech recognition
engines.
When speech recognition is used, the author may need some control
over the speech recognition parameters. This control includes
pointers to vocabulary, definition of sensitivity and more.
2.4 Forms and Input fields
We need a set of rules to define the way a browser renders forms.
We need a way to define how input for an input field is solicited
from users.
Graphical browsers place input fields in front of the users. The
user is compelled to provide the input by the very appearance of
the field on the display. Most people already know how to fill the
input fields - type in text, check a check box or select from a
selection list. In the case of voice browsers, the input or
navigation controls must be offered to the user.
Users of graphical browsers use the keyboard to provide input and
use the mouse to make selections. Voice browsers will have to
offer the user a selection method through the keyboard as well as
selection through speech.
Web authors must be able to specify what spoken phrases should be
used for the selection of links, radio buttons, check boxes, image
buttons, submit buttons, and selection lists. (Key access is
already provided by the accesskey attribute.)
2.5 Error response
When using graphical browsers users select links and input objects
with the mouse. There can be no selection error. The object is
selected by clicking on it.
In a voice-based browser it is easy for the user to enter
unexpected input, or simply to wait and enter no input at all. For
example, the browser may offer the selection of one of many anchor
tags using the keyboard, assigning a key to each anchor. Pressing
an unassigned key will be considered an error.
Authors must have control over the browser response to selection
errors and timeouts.
3. Proposal for additional style sheet fields
3.1 Define user control over the rendering process
3.1.1 'InterruptSpeech'
Value: Yes | No
Initial: Yes
Applies to: All elements
Inherited: Yes
InterruptSpeech controls the user's freedom to stop the speech
rendering process and move to the next input element. When set to No,
it informs the user agent that the user should not barge in and
interrupt a message with a voice or a keyboard selection.
Possible values for this property:
* Yes
* No
Browsers are free to define their behavior for InterruptSpeech. If
enabled, both keyboard and voice may be used to interrupt the speech.
Example:
<STYLE TYPE="text/css">
P { InterruptSpeech: No }
</STYLE>
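As an illustration only, the inheritance rule above (Inherited: Yes, Initial: Yes) could be resolved by a user agent roughly as follows. The Element class and the interrupt_speech function are hypothetical names invented for this sketch; the proposal does not define how a browser stores or looks up the property.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Element:
    """Hypothetical stand-in for a parsed tag with its style properties."""
    tag: str
    style: dict = field(default_factory=dict)
    parent: Optional["Element"] = None


def interrupt_speech(element: Element) -> bool:
    """Return True if the user may barge in while this element is spoken.

    Walks up the ancestors because the property is inherited, and falls
    back to the initial value (Yes) when no ancestor sets it.
    """
    node = element
    while node is not None:
        value = node.style.get("InterruptSpeech")
        if value is not None:
            return value.strip().lower() == "yes"
        node = node.parent
    return True  # initial value: Yes


body = Element("BODY")
paragraph = Element("P", {"InterruptSpeech": "No"}, body)
anchor = Element("A", {}, paragraph)  # inherits No from the paragraph
```

With these objects, interrupt_speech(body) is True and interrupt_speech(anchor) is False, so a browser following this sketch would discourage interruption anywhere inside the paragraph.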
3.2 Offering Anchors and other input tags
When relying on text-to-speech engines rather than on pre-recorded
voice files, the offering of anchors and other input tags may be done
using the text associated with the anchor or with the input tag (text
may be associated with input tags using the <label> tag).
For example:
<A href="driving.htm"> Driving instructions </A>
may be offered by the voice browser using the following words:
"For driving instructions, press 1"
The example shows how the phrases "For" and "press 1" were added to
the text embedded in the anchor tag.
At first glance it looks as if this 'wrapper' text should be left to
the voice browser, but on further examination one can find problems
with this approach. For example, how will you offer the following
anchor tag?
<A href="LeaveMessage.htm"> Leave us a message </A>
In English you would rather say "To leave us a message, press 1".
One may correctly assume that other languages will have even more
structures and special words, which apply to special cases.
There are several options for implementing this feature. One is to
assign it a property in a style sheet. This may be a good idea because
of the way cascading style sheets affect entire documents from one
place. It is assumed that in most cases one wrapper string will be
used, while a small number of offerings will sound different.
Another implementation idea is similar to the image map. In the case
of an image map, the same mapping scheme may be applied to many maps
in the document. Assuming the browser may use its own default for most
cases, the author of the document may point each one of the small
number of offerings requiring special treatment to a central location
with a different wrapper string definition.
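The wrapper idea can be sketched as a default template with a per-anchor override. This is only an illustration under stated assumptions: the template syntax, the DEFAULT_WRAPPER constant and the offer function are hypothetical and are not part of the proposal, which leaves the mechanism open.

```python
# Hypothetical default wrapper a voice browser might apply to most anchors.
DEFAULT_WRAPPER = "For {text}, press {key}"


def offer(text: str, key: str, wrapper: str = "") -> str:
    """Build the phrase spoken to offer one anchor or input tag.

    `wrapper` is an author-supplied override for the few offerings that
    do not read well with the default wrapper.
    """
    cleaned = text.strip()
    # Naive assumption: lower-case the first letter so the anchor text
    # reads naturally in mid-sentence. Proper nouns would need more care.
    cleaned = cleaned[:1].lower() + cleaned[1:]
    template = wrapper or DEFAULT_WRAPPER
    return template.format(text=cleaned, key=key)


default_phrase = offer(" Driving instructions ", "1")
special_phrase = offer(" Leave us a message ", "1", "To {text}, press {key}")
```

Here default_phrase is "For driving instructions, press 1" and special_phrase is "To leave us a message, press 1", matching the two spoken forms discussed above.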
4. Additional Attributes For Other HTML tags
4.1 Alternative content source
4.1.1 VoiceFile
Value: <URL>
Applies to: all elements
A voice file is an alternative source of content for the tag. For
example, a text paragraph may be rendered using a recording.
The value of VoiceFile is a URL pointing to the voice file.
A voice file may be used as an alternative to various elements
(e.g. an input field name, a table header (<th>), a table data
cell (<td>), etc.).
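A minimal sketch of how a voice browser might choose between a recording and synthesized speech for one element. The render_plan function and the ("play" / "speak") result tuples are hypothetical; only the VoiceFile attribute name comes from the proposal.

```python
def render_plan(text: str, attributes: dict) -> tuple:
    """Decide how to render one element aurally.

    Returns ("play", url) when the author supplied a VoiceFile, and
    ("speak", text) when the text-to-speech engine should be used.
    """
    url = attributes.get("VoiceFile")
    if url:
        return ("play", url)
    return ("speak", text)


with_recording = render_plan("Welcome", {"VoiceFile": "welcome.wav"})
without_recording = render_plan("Welcome", {})
```

The first call yields ("play", "welcome.wav") and the second ("speak", "Welcome"), so the text remains available as a fallback whenever no recording is given.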
4.2 Speech Recognition Grammar
4.2.1 'GRAMMAR'
Value: <string>
Applies to: All input related tags (<A>, <input> of all types
etc.)
The Select attribute allows a string to be used with the help of
speech recognition software for the selection of the input.
The Select string applies to anchors and to input tags of type
checkbox and radio.
The GRAMMAR attribute allows the inclusion of a grammar block
with an input tag. The grammar block allows a speech recognition
engine to analyze different types of speech more accurately. At
present, the proposal does not include the format of the block.
This will have to be defined in coordination with the speech
recognition industry.
For clarity, an example showing why the GRAMMAR attribute is
needed is provided here.
An HTML page may include a check box. The title of the check box
may be "Are you an American Citizen".
A voice-based user agent may ask the user, with the help of a
text-to-speech engine, "Are you an American Citizen?"
The possible answers may be "Yes" or "No", but they could also be
any other word used for a negative or positive response in the
caller's language. It could be "Ya," "you betcha," "sure," "of
course" and many other expressions. It is necessary to feed the
speech recognition engine many possibilities representing the
desired response.
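Since the proposal leaves the grammar block format undefined, the following is only a sketch of the idea: many recognized phrases map onto one logical answer. The dictionary layout, the CITIZEN_GRAMMAR name and the interpret function are assumptions made for this illustration.

```python
import re

# Hypothetical grammar for the check box example: each logical answer
# lists the spoken phrases that should be accepted for it.
CITIZEN_GRAMMAR = {
    True: {"yes", "ya", "sure", "of course"},
    False: {"no"},
}


def interpret(utterance: str, grammar: dict):
    """Map a recognized utterance onto a logical answer, or None."""
    # Drop punctuation, fold case and collapse whitespace before matching.
    words = re.sub(r"[^\w\s]", " ", utterance).lower().split()
    normalized = " ".join(words)
    for answer, phrases in grammar.items():
        if normalized in phrases:
            return answer
    return None  # not in the grammar: a selection error


checked = interpret("Of course!", CITIZEN_GRAMMAR)
unchecked = interpret("No.", CITIZEN_GRAMMAR)
unknown = interpret("maybe", CITIZEN_GRAMMAR)
```

In this sketch checked is True, unchecked is False and unknown is None; an utterance outside the grammar is the case the error events in the next section are meant to handle.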
5. Events
5.1 Error response
The action of error response is defined as an event in association
with a body of input or selection which includes the place where
the input is solicited from the user by means of talking to the
user and the place where the browser waits for the input.
Two types of error response are proposed: one for a situation
where no selection is made or no input is entered, and one for a
case where a selection is made for something which is not
offered.
5.1.1 OnSelectionTimeout
The browser may generate an OnSelectionTimeout event when the
user is asked to provide input of any kind, such as a
selection from a list of anchors or a text input box, and
fails to do so.
For example, the following block may be offered to the user for
navigation.
<P onselectiontimeout="browser.speakout('You have not entered any selection, please enter your selection now')">
<A href="Instructions.htm"> Directions </A>
<A href="Todo"> List of things to do </A>
</P>
The OnSelectionTimeout event is processed by the browser
according to the browser's own definition of timeout for
input entry or selection of anchor tags.
The OnSelectionTimeout event applies to all block tags as well
as form elements.
5.1.2 OnSelectionError
When the user selects an option not offered by the browser, the user
must be notified that an error occurred. The notification and the
resulting action are to be performed by a script associated with the
OnSelectionError event.
Example:
<P onselectionerror="browser.speakout('The selection you have entered is invalid, please enter your selection again now')">
<A href="Instructions.htm"> Directions </A>
<A href="Todo"> List of things to do </A>
</P>
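The two events can be pictured as callbacks in the browser's input loop. This is a sketch under assumptions, not the proposed behavior: collect_selection, read_key (returning None on timeout) and max_attempts are hypothetical names, and the proposal leaves timeout and retry policy to the browser.

```python
def collect_selection(read_key, offered, on_timeout, on_error, max_attempts=3):
    """Wait for a selection among the offered keys.

    read_key   -- callable returning the pressed key, or None on timeout
    offered    -- mapping of assigned key to link target
    on_timeout -- handler for the OnSelectionTimeout event
    on_error   -- handler for the OnSelectionError event
    """
    for _ in range(max_attempts):
        key = read_key()
        if key is None:
            on_timeout()          # no input entered in time
        elif key in offered:
            return offered[key]   # valid selection
        else:
            on_error()            # key not assigned to any anchor
    return None


spoken = []
keys = iter([None, "9", "1"])  # timeout, unassigned key, valid key
target = collect_selection(
    lambda: next(keys),
    {"1": "Instructions.htm", "2": "Todo"},
    lambda: spoken.append("timeout"),
    lambda: spoken.append("error"),
)
```

After this run target is "Instructions.htm" and spoken is ["timeout", "error"], showing each handler firing once before the valid selection is accepted.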
Received on Tuesday, 21 October 1997 20:16:10 UTC