W3C home > Mailing lists > Public > w3c-wai-ig@w3.org > October to December 1997

(unknown charset) Aural extensions

From: (unknown charset) Al Gilman <asgilman@access.digex.net>
Date: Tue, 21 Oct 1997 20:15:52 -0400 (EDT)
Message-Id: <199710220015.UAA28030@access1.digex.net>
To: (unknown charset) w3c-wai-ig@w3.org
Or Ben-Natan just filed this on the HC list.  Give it a look if you
can respond by tomorrow.

-- Al Gilman


                           Microsoft Corporation
                     Proposal For Aural HTML Extensions
   Prepared By : Or Ben-Natan
   Create Date : 6/6/97
   Status : Draft
   Version : 0.9
   Filename : AuralHTMLProposal.doc
                    Copyright  by Microsoft Corporation
                            All Rights Reserved
   Or Ben-Natan
   Initial version
   1 Overview *
   2 Summary of requirements *
   2.1 Propose user control over the rendering process *
       2.2 Alternate media *
       2.3 Navigation *
       2.4 Forms and Input fields *
       2.5 Error response *
   3 Proposal for additional style sheet fields *
   3.1 Define user control over the rendering process *
       3.1.1 'InterruptSpeech' *
       3.2 Offering Anchors and other input tags *
   4 Additional Attributes For Other HTML tags *
   4.1 Alternative content source *
       4.1.1 VoiceFile *
       4.2 Speech Recognition Grammar *
       4.2.1 'GRAMMER' *
   5 Events *
   5.1 Error response *
       5.1.1 OnSelectionTimeout *
       5.1.2 OnSelectionError *
   1 Overview 2
   2 Summary of requirements 2
   2.1 Define user control over the rendering process 2
       2.2 Navigation 2
       2.3 Forms and Input fields 2
       2.4 Error response 3
   3 Proposal for additional style sheet fields 3
   3.1 Define user control over the rendering process 3
       3.1.1 'InterruptSpeech' 3
       3.2 Offering Anchors and other input tags 3
   4 Additional Attributes For Other HTML tags 4
   4.1 Alternative content source 4
       4.1.1 VoiceFile 4
       4.2 Navigation 4
       4.2.1 'Select' 4
   5 Events 4
   5.1 Error response 4
       5.1.1 OnSelectionTimeout 5
       5.1.2 OnSelectionError 5
   1 Overview 2
   2 Summary of requirements 2
   2.1 Define user control over the rendering process 2
       2.2 Navigation 2
       2.3 Forms and Input fields 2
       2.4 Error response 3
   3 Proposal for additional style sheet fields 3
   3.1 Define user control over the rendering process 3
       3.1.1 'InterruptSpeech' 3
       3.2 Offering Anchors and other input tags 3
   4 Additional Attributes For Other HTML tags 4
   4.1 Alternative content source 4
       4.1.1 VoiceFile 4
       4.2 Navigation 4
       4.2.1 'Select' 4
   5 Events 4
   5.1 Error response 4
       5.1.1 OnSelectionTimeout 5
       5.1.2 OnSelectionError 5
    1. Overview
       This document brings forward the Microsoft comments on the aural
       style sheet, http://www.w3.org/Style/css/Speech/NOTE-ACSS together
       with some other suggestions designed to improve the accessibility
       of HTML to people with inherent of functional visual disability. .
       The aural style sheet proposal focuses on the production of sound
       when rendering text by text to speech engine. While this is a very
       important area of focus we feel that this is not enough to address
       the challenges of confronting voice based browser challenges.
       Dealing only with output the current offering falls short of real
       communication between the users and the browser. Users who cannot
       see the text offered to them on the screen must be able to control
       the manner by which content is rendered, select links and navigate
       between pages and provide input such as order entry etc.
       While it is possible to create a voice browser today, using the
       existing HTML definitions, the result will be less than optimal as
       the browser designer must make many assumptions, leaving no
       control to the author of the application, or forcing the author to
       learn another, proprietary configuration language.
       OThe majority of our proposal may be implemented as additions to
       the aural style sheet as well as additional attributes to existing
       tags. No new tags are proposed. Those additions allow the author
       to control the manner by which users who cannot have visual access
       to the content displayed by the browser may fully interact with
       it. In addition we propose the usage of media specific content
       alternatives, or in more simple terms, the ability to provide a
       voice file as an alternative to synthesized speech and the ability
       to provide keyboard input definitions or spoken phrases as an
       alternative to mouse selections.
     Summary of requirements
         1. ProposeDefine user control over the rendering process
       When the browser renders tags it is important to define the level
       of control the user will have over the process.
       Users who frequently use the same WEB site normally know all that
       is going to be said in the beginning of the page. This information
       normally includes a welcome information and some instructions on
       how to use the site. Invariably, users will want at some point to
       skip those objects. The author of the application, on the other
       hand, may want to limit this capability and force the user to
       listen to anrecommend to the user to listen to the entire message.
       The need to do this is in case there are new instructions,
       promotions etc. We need a way for the author to specify whether to
       allow the user to go to a next tag..
     Alternate media 
       While text to speech technology is improving in quality over time,
       the general listener experience still leaves much to be desired.
       An alternative media representation such as a voice file is
       necessary if professional quality audio is to be included in the
       HTML page.
       Users want to use something other then the mouse to navigate. It
       is impossible for people who are blind, for example, to see what
       the mouse point to and select it. Rather they would prefer to
       touch a key or say a word. This feature requires an HTML way of
       specifying the input associated with each link. The input
       specified may be a keyboard key or a phrase to be recognized by
       speech recognition engines.
       When speech recognition is used the author may need some over the
       speech recognition parameters. This control includes pointers to
       vocabulary, definition of sensitivity and more.
     Forms and Input fields 
       We need a set of rules to define the way a browser does form
       rendering. We need a way to define how input for input field is
       solicited from users.
       Graphical browsers place input fields in front of the users. The
       user is compelled to provide the input by the very appearance of
       the field on the display. Most people already know how to fill the
       input fields - type in text, check a check box or select from a
       selection list. In the case of voice browsers, the input or
       navigation controls must be offered to the user.
       Users of graphical browsers use the keyboard to provide input and
       use the mouse to make selections. Voice browsers will have to
       offer the user selection method through the keyboard as well as
       selection through speech.
       WEB authors must be able to specify what spoken phrases should be
       used for the selection of links, radio buttons, check boxes, image
       buttons, submit buttons, and selection lists. (Key access is
       already provided by the accesskey attribute.
     Error response
       When using graphical browsers users select links and input objects
       with the mouse. There can be no selection error. The object is
       selected by clicking on it.
       In a voice based browser it is easy for the user to enter
       unexpected input or just stay there and enter no input at all. For
       example, the browser may offer the selection of one of many anchor
       tags using the keyboard, assigning a key to each anchor. Pressing
       an unassigned key will be considered an error.
       Authors must have control over the browser response to selection
       errors and timeouts.
     Proposal for additional style sheet fields
         1. Define user control over the rendering process
              1. 'InterruptSpeech'
   Value: <String>| Yes|No
   Initial: YES
   Applies to: All elements
   Inherited: Yes
   InterruptSpeech controls the user's freedominform the user agent to
   recommend that the user will not barge in and interrupt a message with
   a voice or a keyboard selection. in stopping the speech rendering
   process and move to the next input element.
   Possible values to this attribute:
     * Yes
     * No
   Browsers are free to define their behavior for InterruptSpeech. If
   enabled, both keyboard and voice may be used to interrupt the speech.
   <STYLE TYPE="Text/css">
   P {InterruptSpeech : No}
         1. Offering Anchors and other input tags
   When relying on text to speech engines rather then on pre-recorded
   voice files, the offering of anchors and other input tags may be done
   using the text associated with the anchor or with input tag (text may
   be associated with input tags using the <label> tag).
   For example.
   <A href=driving.htm> Driving instruction </A>
   May be offered by the voice browser using the following words:
   "for driving instructions press D1"
   The example shows how the phrases "For" and "Press 1" were added to
   the text embedded in the anchor tag.
   On first glance it looks as if this 'wrapper' text should be left for
   the voice browser, but on further examination one can find problems
   with this approach. For example, how will you offer the following
   anchor tag?
   <A href=LeaveMessage.htm> Leave us a message </A>
   Speaking EnglishIn the English language you would rather say "To leave
   us a message, press M1"
   One may correctly assume that foreign languages will have even more
   structures and special words whichwords, which apply to special cases.
   There are several options to implement this feature. One is to assign
   it a property in a style sheet. This may be a good idea because of the
   way cascading style sheets effect entire documents from one place. It
   is assumed in most cases one wrapper string will be used while a small
   number of offering will sound different.
   Another implementation idea is similar to the image map. In the case
   of image map, the same mapping scheme may be applied to many maps in
   the document. Assuming the browser may use its own default for most
   cases, the author of the document may point each one of the small
   number of offerings requiring a special link to a central location
   with a different wrapper string definition.
    1. Additional Attributes For Other HTML tags
         1. Alternative content source
              1. VoiceFile
       Value: <URL>
       Applies to: all elements
       A voice file is an alternative source of content for the tag. For
       example, a text paragraph may be rendered using a recording.
       The value of VoiceFile is a URL pointing to the voice file.
       A voice file may be used as an alternative to various elements
       (e.g. an input field name, a table header (the <th> tag), a table
       data <td>, etc.).
     Speech Recognition GrammarNavigation
         1. 'GRAMMERSelect'
       Value: <ascii>?<string>
       Applies to: All input related tags (<A>, <input> of all types
       The Select attribute allows a string to be used with the help of
       speech recognition software for the selection of the input.
       The Select string applies to anchors and input tags (of type
       checkbox and radio buttons.)The GRAMMER attribute allows the
       inclusion of a grammar block with an input tag. The grammar block
       allows a speech recognition engine to analyze different type of
       speech in a better way. At the present, the proposal does not
       include the format of the block. This will have to be done in
       coordination with the speech recognition industry.
       For clarity purposes an example for the necessity of the GRAMMAR
       attribute is provided here.
       An HTML page may include a check box. The title of the check box
       may be Are you an American Citizen.
       A voice based user agent may ask the user, with the help of a text
       to speech engine, "Are you an American Citizen"
       The possible answers may be "Yes" or "No" but it could also be any
       other word used for negative or positive respond in the callers
       language. It could be "Ya," "you batchya," "sure," "of course" and
       many other expressions. It is necessary to feed to the speech
       recognition engine with many possibilities representing the
       desired response.
    2. Events
         1. Error response
       The action of error response is defined as an event in association
       with a body of input or selection which includes the place where
       the input is solicited from the user by means of talking to the
       user and the place where the browser waits for the input.
       Two types of error response are proposed. An error for a situation
       where no selection is made or no input is entered and an error for
       a case where a selection is made for something which is not
         1. OnSelectionTimeout
            The browser may generate OnSelectionTimeout event when the
            user is asked to provide input of any kind such as a
            selection from a list of anchors or an text input box and
            fails to do so.
            For example, the following block may be offered the user for
            <P onselecttimeout="browser.speakout ("You have not entered
            any selection, please enter your selection now")">
            <A href=Instructions.htm> Directions </A>
            <A href=Todo> List of things to do </A>
            The OnSelectionTimeout onselecttimeot event is processed by
            the browser according to the browser own definition of
            timeout for input entry or selection of anchor tags.
            The OnSelectTimeout event applies to all block tags as well
            as form elements.
         2. OnSelectionError
   When the user selects an option not offered by the browser the user
   must be notified that an error occurred. The notification and the
   resulting action is to be performed by a script associated with
   OnSelectionError event.
   <P onselectionerror="browser.speakout ("The selection you have entered
   is invalid, please enter your selection again now")">
   <A href=Instructions.htm> Directions </A>
   <A href=Todo> List of things to do </A>
Received on Tuesday, 21 October 1997 20:16:10 UTC

This archive was generated by hypermail 2.3.1 : Tuesday, 13 October 2015 16:20:59 UTC