- From: Jim Allan <jimallan@tsbvi.edu>
- Date: Wed, 3 Feb 2010 12:35:06 -0600
- To: "'UAWG list'" <w3c-wai-ua@w3.org>
My thoughts and musings... I reviewed the comment at http://lists.w3.org/Archives/Public/public-uaag2-comments/2009Sep/0000.html. The text is restructured below to present meaningful topics for discussion. The original comment was a response to a request for comments on our working draft: "Are the synthesized speech configuration success criteria in Guideline 3.8 clear and provide adequate instruction to user agent developers?"

Respondents proposed the following additions to UAAG:

1. The ability for the User Agent to switch the speech synthesizer language:
   - automatically, based on the lang attribute in the content being read, or
   - manually, by providing controls in the User Agent (as opposed to externally, in the OS speech synthesizer).

Response: Switching the speech synthesizer language based on @lang is current behavior for commercially available screen readers. Switching occurs in the assistive technology. Where the screen reader is part of the OS (e.g. VoiceOver on the Macintosh), the language switch occurs in VoiceOver. In testing with a transcoding site [1] (a site that transforms the current webpage or browsing session to meet the needs of the user without the use of assistive technology), language switching did not occur. Transcoding sites are still relatively new and not fully formed.

Switching language is not currently in UAAG2. It should be added to GL 3.8:

<proposed>
3.8.a The user can set the default language of the speech synthesizer. (A)
3.8.b The speech synthesizer must switch languages as appropriate when encountering an author indication that the content being read is in a different language. (A)
</proposed>

The above is a bit wordy; I was trying to stay away from the HTML-specific @lang. There is also the problem of a speech synthesizer that has only one language and cannot switch, as is the case with VoiceOver, which only comes with English built in. Also, the Mac OS does not automatically switch to third-party voices (other languages installed by the user). (A rough sketch of how a user agent might implement this switching appears after item 3 below.)

I also found 5.3.x Appropriate Language: "If characteristics of your user agent involve producing an end user experience such as speech, you need to react appropriately to language changes." This seems to be misplaced; it would be a good substitute for 3.8.a.

2. The ability to change the synthesizer voice (when a choice of voices is available).

Response: Currently, voice switching happens in the screen reader. The user can select the default voice for all text spoken. Additionally, voice changes can occur dynamically depending on the type of element or attribute on the webpage (e.g. a different voice or pitch for heading levels, bold, links, etc.). In obsolete self-voicing browsers (e.g. pwWebSpeak, IBM Home Page Reader) this feature was also available. This is covered by 3.8.2 and 3.8.3.

3. The reading mode (words, spelling) that is used by the User Agent in different places. The User Agent can set up specific triggers (class, id) to switch between various voices and modes of speech.

Response: This is partially covered by 3.8.3 and 3.8.5. On considering... many of these seem to be screen reader behaviours, not synthesizer behaviours. The screen reader (or self-voicing browser) tells the synthesizer what to say. If the user moves the caret by character, a character is spoken; if the user moves the caret by word, the word is spoken. The same is true for line, paragraph, etc. The screen reader sends the appropriate information to the synthesizer for sound production. It is also the screen reader/browser that determines the voicing of <abbr>, <acronym>, etc., based on user settings.
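To make the discussion in items 1-3 concrete, here is a rough, non-normative sketch of how a browser-based user agent might implement the proposed 3.8.a/3.8.b language switching together with element-dependent voicing, using the Web Speech API's speechSynthesis interface. The voiceFor() helper, the DEFAULT_LANG value, and the heading pitch adjustment are illustrative assumptions on my part, not anything UAAG2 specifies or requires.

    // Illustrative sketch only -- not a UAAG2 requirement.
    const DEFAULT_LANG = "en-US";   // user-configured default language (proposed 3.8.a)

    // Find an installed voice for a language tag: prefer an exact match
    // ("fr-CA"), then a primary-subtag match ("fr"); otherwise return null
    // and let the synthesizer fall back to its default voice.
    function voiceFor(lang: string): SpeechSynthesisVoice | null {
      const voices = speechSynthesis.getVoices();
      return voices.find(v => v.lang === lang)
          ?? voices.find(v => v.lang.split("-")[0] === lang.split("-")[0])
          ?? null;
    }

    function speakElement(el: HTMLElement): void {
      const text = el.textContent?.trim();
      if (!text) return;

      // Switch language when the author has marked the content (proposed 3.8.b).
      const lang = el.closest("[lang]")?.getAttribute("lang") ?? DEFAULT_LANG;

      const utterance = new SpeechSynthesisUtterance(text);
      utterance.lang = lang;
      const voice = voiceFor(lang);
      if (voice) utterance.voice = voice;   // only switch if a matching voice is installed

      // Element-dependent speech characteristics (cf. 3.8.2/3.8.3),
      // e.g. raise the pitch slightly for headings.
      if (/^H[1-6]$/.test(el.tagName)) utterance.pitch = 1.3;

      speechSynthesis.speak(utterance);
    }

Note that getVoices() may return an empty list until the voiceschanged event fires, and, as noted above for VoiceOver, a voice matching the author's language may simply not be installed.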
Changing speech characteristics based on class, id, style attributes, etc. seems more a task for the author, who would write specific speech behaviours for those attributes. The problem is one of scale: there is a finite number of elements with which to trigger different speech behaviours, but the number of possible @id and @class values is infinite.

Additional comments:

4. It seems that the current UAAG guidelines are not simple to implement in User Agents with existing speech synthesizer APIs. It might help to have a technical review (state of the art) of existing speech synthesizer APIs and use these as a practical basis for implementations in User Agents (what fields and controls should be exposed).

Response: The guidelines in the current document were developed with input from developers familiar with speech synthesis APIs. The working group has solicited review of the document by screen reader developers and manufacturers. However, we have discussed problems with exception (acronym and abbreviation expansion) dictionaries. Synthesizers have these dictionaries, but they are unique to each synthesizer, and their documentation is not always readily available. Superimposed on the synthesizer dictionaries are the screen reader's exception dictionaries (also unique). If the user agent imposes further control (on/off or other exceptions), the permutations the average user would have to configure are a bit daunting.

5. Often, one of the missing features in speech synthesizer implementations is the ability to query the state and progress of the speech being synthesized. The User Agent could fulfill this role. Querying what the speech synthesizer is processing, with regard to the UA, yields important feedback for the user, such as: "the speech synthesizer is currently reading div id="sasl", paragraph 5, word number 10; there are still 500 words to read; it has been reading for 15 seconds and the ETA is 2 min; etc."

Response: Again, this seems more of a screen reader behavior. Commercially available screen readers have a where-am-I function to give the current speech caret position on the page. They do not, AFAIK, present the more elaborate information (@id, time to end of reading based on reading rate, etc.). This seems beyond AAA. Recommend non-inclusion in UAAG20.

References:
1. http://webanywhere.cs.washington.edu/wa.php

Jim Allan, Accessibility Coordinator & Webmaster
Texas School for the Blind and Visually Impaired
1100 W. 45th St., Austin, Texas 78756
voice 512.206.9315  fax: 512.206.9264
http://www.tsbvi.edu/
"We shape our tools and thereafter our tools shape us." McLuhan, 1964
Received on Wednesday, 3 February 2010 18:35:54 UTC