- From: Al Gilman <asgilman@access.digex.net>
- Date: Tue, 21 Oct 1997 20:15:52 -0400 (EDT)
- To: w3c-wai-ig@w3.org
Or Ben-Natan just filed this on the HC list. Give it a look if you can respond by tomorrow.

-- Al Gilman

http://www.access.digex.net/%7Easgilman/web-access/ACSS/MSExtensions.htm

-------------------------------------------------------------------------------

Microsoft Corporation
Proposal For Aural HTML Extensions

Prepared By : Or Ben-Natan
Create Date : 6/6/97
Status      : Draft
Version     : 0.9
Filename    : AuralHTMLProposal.doc

Copyright © by Microsoft Corporation
All Rights Reserved

Amendments

Version  Author        Date    Change
0.9      Or Ben-Natan  6/6/97  Initial version

1 Overview
2 Summary of requirements
  2.1 Define user control over the rendering process
  2.2 Alternate media
  2.3 Navigation
  2.4 Forms and Input fields
  2.5 Error response
3 Proposal for additional style sheet fields
  3.1 Define user control over the rendering process
    3.1.1 'InterruptSpeech'
  3.2 Offering Anchors and other input tags
4 Additional Attributes For Other HTML tags
  4.1 Alternative content source
    4.1.1 VoiceFile
  4.2 Speech Recognition Grammar
    4.2.1 'GRAMMER'
5 Events
  5.1 Error response
    5.1.1 OnSelectionTimeout
    5.1.2 OnSelectionError

1 Overview

This document brings forward the Microsoft comments on the aural style sheet proposal, http://www.w3.org/Style/css/Speech/NOTE-ACSS, together with some other suggestions designed to improve the accessibility of HTML to people with an inherent or functional visual disability.

The aural style sheet proposal focuses on the production of sound when text is rendered by a text-to-speech engine. While this is a very important area of focus, we feel it is not enough to address the challenges confronting voice-based browsers. Because it deals only with output, the current offering falls short of real communication between the user and the browser. Users who cannot see the text offered to them on the screen must be able to control the manner in which content is rendered, select links, navigate between pages, and provide input such as order entry.

While it is possible to create a voice browser today using the existing HTML definitions, the result will be less than optimal: the browser designer must make many assumptions, leaving no control to the author of the application or forcing the author to learn another, proprietary configuration language.

The majority of our proposal may be implemented as additions to the aural style sheet and as additional attributes on existing tags. No new tags are proposed. These additions allow the author to control the manner in which users who have no visual access to the content displayed by the browser may fully interact with it.
In addition, we propose the use of media-specific content alternatives or, in simpler terms, the ability to provide a voice file as an alternative to synthesized speech and the ability to provide keyboard input definitions or spoken phrases as an alternative to mouse selections.

2 Summary of requirements

2.1 Define user control over the rendering process

When the browser renders tags, it is important to define the level of control the user will have over the process. Users who frequently use the same Web site normally know everything that will be said at the beginning of the page. This information normally includes a welcome message and some instructions on how to use the site. Invariably, users will at some point want to skip those objects. The author of the application, on the other hand, may want to limit this capability and require, or at least recommend, that the user listen to the entire message, for example because there are new instructions or promotions. We need a way for the author to specify whether to allow the user to go on to the next tag.

2.2 Alternate media

While text-to-speech technology is improving in quality over time, the general listener experience still leaves much to be desired. An alternative media representation such as a voice file is necessary if professional-quality audio is to be included in the HTML page.

2.3 Navigation

Users want to use something other than the mouse to navigate. It is impossible for people who are blind, for example, to see what the mouse points to and select it. They would rather touch a key or say a word. This feature requires an HTML way of specifying the input associated with each link. The input specified may be a keyboard key or a phrase to be recognized by a speech recognition engine. When speech recognition is used, the author may need some control over the speech recognition parameters. This control includes pointers to vocabulary, definition of sensitivity, and more.
2.4 Forms and Input fields

We need a set of rules to define the way a browser renders forms. We need a way to define how input for an input field is solicited from users. Graphical browsers place input fields in front of the users. The user is compelled to provide the input by the very appearance of the field on the display. Most people already know how to fill in input fields: type in text, check a check box, or select from a selection list. In the case of voice browsers, the input or navigation controls must be offered to the user. Users of graphical browsers use the keyboard to provide input and the mouse to make selections. Voice browsers will have to offer the user selection through the keyboard as well as selection through speech. Web authors must be able to specify what spoken phrases should be used for the selection of links, radio buttons, check boxes, image buttons, submit buttons, and selection lists. (Key access is already provided by the accesskey attribute.)

2.5 Error response

When using graphical browsers, users select links and input objects with the mouse. There can be no selection error: the object is selected by clicking on it. In a voice-based browser it is easy for the user to enter unexpected input, or to wait and enter no input at all. For example, the browser may offer the selection of one of many anchor tags using the keyboard, assigning a key to each anchor. Pressing an unassigned key will be considered an error. Authors must have control over the browser's response to selection errors and timeouts.

3 Proposal for additional style sheet fields

3.1 Define user control over the rendering process

3.1.1 'InterruptSpeech'

    Value:      <String> | Yes | No
    Initial:    Yes
    Applies to: All elements
    Inherited:  Yes

InterruptSpeech controls the user's freedom to stop the speech rendering process and move to the next input element. A value of No informs the user agent to recommend that the user not barge in and interrupt a message with a voice or keyboard selection.

Possible values for this attribute:

  * Yes
  * No

Browsers are free to define their behavior for InterruptSpeech. If enabled, both keyboard and voice may be used to interrupt the speech.

Example:

    <STYLE TYPE="text/css">
    P {InterruptSpeech : No}
    </STYLE>

3.2 Offering Anchors and other input tags

When relying on text-to-speech engines rather than on pre-recorded voice files, the offering of anchors and other input tags may be done using the text associated with the anchor or with the input tag (text may be associated with input tags using the <label> tag). For example,

    <A href=driving.htm> Driving instruction </A>

may be offered by the voice browser using the following words:

    "For driving instructions, press 1"

The example shows how the phrases "For" and "press 1" were added to the text embedded in the anchor tag. At first glance it looks as if this 'wrapper' text should be left to the voice browser, but on further examination one can find problems with this approach. For example, how would you offer the following anchor tag?

    <A href=LeaveMessage.htm> Leave us a message </A>

In the English language you would rather say:

    "To leave us a message, press 1"

One may correctly assume that foreign languages will have even more structures and special words, which apply to special cases.

There are several options for implementing this feature. One is to make it a property in a style sheet. This may be a good idea because of the way cascading style sheets affect entire documents from one place. It is assumed that in most cases one wrapper string will be used, while a small number of offerings will sound different.

Another implementation idea is similar to the image map. In the case of an image map, the same mapping scheme may be applied to many maps in the document.
Assuming the browser may use its own default for most cases, the author of the document may point each of the few offerings that need a special wrapper to a central location holding a different wrapper string definition.

4 Additional Attributes For Other HTML tags

4.1 Alternative content source

4.1.1 VoiceFile

    Value:      <URL>
    Applies to: All elements

A voice file is an alternative source of content for the tag. For example, a text paragraph may be rendered using a recording. The value of VoiceFile is a URL pointing to the voice file. A voice file may be used as an alternative to various elements (e.g. an input field name, a table header <th>, a table data cell <td>, etc.).

4.2 Speech Recognition Grammar and Navigation

4.2.1 'GRAMMER' and 'Select'

    Value:      <ascii>?<string>
    Applies to: All input related tags (<A>, <input> of all types, etc.)

The Select attribute allows a string to be used, with the help of speech recognition software, for the selection of the input. The Select string applies to anchors and to input tags of type checkbox and radio button.

The GRAMMER attribute allows the inclusion of a grammar block with an input tag. The grammar block allows a speech recognition engine to analyze different types of speech more accurately. At present, the proposal does not include the format of the block. This will have to be defined in coordination with the speech recognition industry.

For clarity, an example of why the GRAMMER attribute is necessary is provided here. An HTML page may include a check box titled "Are you an American Citizen". A voice-based user agent may ask the user, with the help of a text-to-speech engine, "Are you an American Citizen?" The possible answers may be "Yes" or "No", but they could also be any other word used for a negative or positive response in the caller's language: "Ya," "you betcha," "sure," "of course," and many other expressions.
It is necessary to feed the speech recognition engine with many possibilities representing the desired response.

5 Events

5.1 Error response

An error response is defined as an event associated with a body of input or selection. That body includes both the place where input is solicited from the user, by talking to the user, and the place where the browser waits for the input. Two types of error response are proposed: one for the situation where no selection is made or no input is entered, and one for the case where a selection is made for something which is not offered.

5.1.1 OnSelectionTimeout

The browser may generate an OnSelectionTimeout event when the user is asked to provide input of any kind, such as a selection from a list of anchors or a text input box, and fails to do so. For example, the following block may be offered to the user for navigation:

    <P onselectiontimeout="browser.speakout('You have not entered any selection, please enter your selection now')">
    <A href=Instructions.htm> Directions </A>
    <A href=Todo> List of things to do </A>
    </P>

The OnSelectionTimeout event is processed by the browser according to the browser's own definition of the timeout for input entry or selection of anchor tags. The OnSelectionTimeout event applies to all block tags as well as form elements.

5.1.2 OnSelectionError

When the user selects an option not offered by the browser, the user must be notified that an error occurred. The notification and the resulting action are to be performed by a script associated with the OnSelectionError event.

Example:

    <P onselectionerror="browser.speakout('The selection you have entered is invalid, please enter your selection again now')">
    <A href=Instructions.htm> Directions </A>
    <A href=Todo> List of things to do </A>
    </P>
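
The two error events described above might be dispatched by a voice browser roughly as follows. This is only an illustrative sketch, not part of the proposal: the function name solicit_selection, its parameters, and the retry limit are all assumptions, and the proposal itself leaves timeout handling to the browser's own definition.

```python
from typing import Callable, Dict, Optional

# Event names taken from the proposal; everything else here is hypothetical.
TIMEOUT = "OnSelectionTimeout"
ERROR = "OnSelectionError"


def solicit_selection(
    offered: Dict[str, str],
    read_input: Callable[[], Optional[str]],
    handlers: Dict[str, Callable[[], None]],
    max_attempts: int = 3,
) -> Optional[str]:
    """Return the target of the chosen link, or None if every attempt fails.

    offered maps an input (a key or a recognized phrase) to a link target.
    read_input returns what the user entered, or None when the browser's
    own timeout expired with no input.
    handlers maps an event name to the author's script for that event.
    """
    for _ in range(max_attempts):
        entered = read_input()
        if entered is None:
            # No selection was made before the timeout expired.
            event = TIMEOUT
        elif entered not in offered:
            # A selection was made for something that was not offered.
            event = ERROR
        else:
            return offered[entered]
        handler = handlers.get(event)
        if handler is not None:
            handler()
    return None


if __name__ == "__main__":
    # Simulated user: silence, then an unassigned key, then a valid key.
    inputs = iter([None, "9", "1"])
    target = solicit_selection(
        {"1": "Instructions.htm", "2": "Todo"},
        lambda: next(inputs),
        {
            TIMEOUT: lambda: print("You have not entered any selection"),
            ERROR: lambda: print("The selection you have entered is invalid"),
        },
    )
    print(target)
```

Keeping the timeout and the invalid-selection cases as separate events lets the author attach a different spoken response to each, which is the distinction the proposal draws between 5.1.1 and 5.1.2.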
Received on Tuesday, 21 October 1997 20:16:10 UTC