- From: Greg Lowney <gcl-0039@access-research.org>
- Date: Wed, 03 Feb 2010 21:01:40 -0800
- To: WAI-UA list <w3c-wai-ua@w3.org>
Some thoughts on David's and Jim's suggestions re ACTION-268:

0. I agree with David that if we're addressing speech output we should also address refreshable braille output.

1.1 A problem with the proposed wording of 3.8.b is that it fails to exempt the UA when the synthesizer does not support the specified language. This is especially problematic if it's Level A. It needs to be modified to address that.

1.2 A second problem with the proposed 3.8.b is that, since it gives author-specified language ultimate authority, any content which incorrectly identifies its language would be inaccessible to the speech-output user. As with the other SC in 3.8, the user should have the ability to override any author-specified settings, including language. (A rough sketch of the fallback and override behavior is included after point 5 below.)

1.3 Jim mentions that 5.3.x might be misplaced. To me the meaning of that SC is entirely unclear from its current wording. Does anyone else know what it's trying to say?

2. Re the suggestion that the user can change speech synthesizer voices, I disagree with the proposed response saying this is the responsibility of the screen reader, as all of Guideline 3.8 is geared towards user agents that self-voice. While self-voicing browsers may be out of style on the PC, they are certainly still the norm for telephone access to the Web and mail. If we think it important that the user of a self-voicing UA can change speed and pitch (3.8.1), it makes sense that they can change other speech synthesizer attributes, such as which voice profile is being used (e.g. Whispering Wendy vs. Doctor Dennis).

3.1 As Jim points out, it's completely reasonable for the self-voicing user agent to read a word at a time when the user navigates by words, a letter at a time when they navigate by characters, etc. However, when the user asks the browser to read a passage separate from navigation, it makes sense for them to be able to specify whether they want it read as letters, words, sentences, etc.

3.2 The suggestion that the user can program the UA with specific reading behaviors for any attributes is asking a lot, and probably too much. (I have to assume, too, that the original commenter meant that to be associated with more than just character vs. word reading mode.) Setting that up would take a lot of work for each site. One could do this using the Greasemonkey extension to Firefox, but I don't know that it would be implemented widely enough to warrant making it a low-priority SC, and it's certainly too hard to make high-priority.

5. I don't understand how the UA could fulfill the user's suggestion without changes to the underlying speech synthesizers. However, I disagree with the response that says it's only an issue for assistive technology, since any such issues would equally apply to self-voicing user agents.
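To make 1.1 and 1.2 concrete, here is a rough sketch of the intended fallback and override behavior, in illustrative TypeScript. All of the names (SynthesizerVoice, SpeechSettings, chooseLang) are invented for this example and do not come from any real UA or synthesizer API.

    // Illustrative sketch only; invented names, not a real UA or synthesizer API.
    interface SynthesizerVoice {
      name: string;
      lang: string;               // e.g. "en-US", "fr-FR"
    }

    interface SpeechSettings {
      defaultLang: string;        // 3.8.a: the user-chosen default language
      userLangOverride?: string;  // 1.2: user override of any author-specified language
      followAuthorLang: boolean;  // 3.8.b: honor the author's language markup at all?
    }

    function chooseLang(
      authorLang: string | undefined,
      settings: SpeechSettings,
      voices: SynthesizerVoice[]
    ): string {
      // 1.2: the user's explicit override always wins over the author's markup.
      if (settings.userLangOverride) {
        return settings.userLangOverride;
      }
      // 3.8.b: follow the author-specified language when present and enabled...
      if (settings.followAuthorLang && authorLang) {
        const wanted = authorLang.toLowerCase();
        const supported = voices.some(v => v.lang.toLowerCase().startsWith(wanted));
        // 1.1: ...but only if the synthesizer actually has a matching voice;
        // otherwise fall back to the user's default instead of failing.
        if (supported) {
          return authorLang;
        }
      }
      return settings.defaultLang;
    }

Under this sketch, content marked as French would only switch the synthesizer if a French-capable voice is actually installed; otherwise the user's default (or explicit override) is used.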
Thanks,
Greg

-------- Original Message --------
Subject: Re: ACTION-268 - Craft request for input on synthesized speech inclusion in the document
From: David Poehlman <poehlman1@comcast.net>
To: jimallan@tsbvi.edu
Cc: "'UAWG list'" <w3c-wai-ua@w3.org>
Date: 2/3/2010 10:53 AM

I think if we are going to do up speech, we also need to do up refreshable braille. For instance, it is possible that language may not be expressed in speech, but expressed in braille. Often, even if the screen reader changes the speech language, braille stays the same as far as I know. Lastly, it is definitely the AT that decides how to handle language changes. The AT in the case of VoiceOver is built into the UI and not part of the UA.

On Feb 3, 2010, at 1:35 PM, Jim Allan wrote:

My thoughts and musings...

Reviewed the comment at http://lists.w3.org/Archives/Public/public-uaag2-comments/2009Sep/0000.html. Text is restructured below to present meaningful topics for discussion. The original comment was a response to a request for comments on our working draft: "Are the synthesized speech configuration success criteria in Guideline 3.8 clear, and do they provide adequate instruction to user agent developers?"

Respondents proposed the following additions to UAAG:

1. the ability for the User Agent to switch the speech synthesizer language:
- automatically, based on the lang attribute in the content being read
- or manually, by providing controls in the User Agent (as opposed to externally, in the OS speech synthesizer)

Response: Switching speech synthesizer language based on @lang is current behavior for commercially available screen readers. Switching occurs in the assistive technology. Where the screen reader is part of the OS (e.g. VoiceOver on the Macintosh), the language switch occurs in VoiceOver. Testing with a transcoding site [1] (a site that transforms the current webpage or browsing session to meet the needs of the user without the use of assistive technology), language switching did not occur. Transcoding sites are still relatively new and not fully formed. Switching language is not currently in UAAG2. It should be added to GL 3.8.

<proposed>
3.8.a The user can set the default language of the speech synthesizer. (A)
3.8.b The speech synthesizer must switch languages as appropriate when encountering an author indication that the content being read is in a different language. (A)
</proposed>

The above is a bit wordy. I was trying to stay away from the HTML-specific @lang. There is also the problem of the speech synthesizer having only one language and not being able to switch, as is the case with VoiceOver, which comes with only English built in. Also, the Mac OS does not auto-switch to third-party voices (other languages installed by the user).

Also found 5.3.x Appropriate Language: "If characteristics of your user agent involve producing an end user experience such as speech, you need to react appropriately to language changes." This seems to be misplaced. It would be a good substitute for 3.8.a.

2. the ability to change the synthesizer voice (when a choice of voices is available)

Response: Currently voice switching happens in the screen reader. The user can select the default voice for all text spoken. Additionally, voice changes can occur dynamically depending on the type of element or attribute on the webpage (e.g. a different voice or pitch for heading levels, bold, links, etc.). In obsolete self-voicing browsers (e.g. pwWebSpeak, IBM Home Page Reader) this feature was also available. This is covered by 3.8.2 and 3.8.3.

3. the reading mode (words, spelling) that is used by the User Agent in different places. The User Agent can set up specific triggers (class, id) to switch between various voices and modes of speech.

Response: This is partially covered by 3.8.3 and 3.8.5. On consideration, many of these seem to be screen reader behaviours, not synthesizer behaviours. The screen reader (or self-voicing browser) tells the synthesizer what to say. When the user moves the caret by character, a character is spoken; when the user moves the caret by word, the word is spoken. The same is true for line, paragraph, etc. The screen reader sends the appropriate information to the synthesizer for sound production. It is also the screen reader/browser that determines the voicing of <abbr>, <acronym>, etc., based on user settings. (A rough sketch of this split appears below.)

Changing speech characteristics based on class, id, style attribute, etc. seems more a task for an author, writing specific speech behaviours for those attributes. The problem is one of scale: there is a finite number of elements with which to trigger different speech behaviours, but the number of @id and @class values is unbounded.
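A rough sketch of that screen reader/synthesizer split, in illustrative TypeScript. All names are invented for this example; this is not any real screen reader, browser, or synthesizer API.

    // Invented names for illustration only.
    type Granularity = "character" | "word" | "line" | "paragraph";

    interface ReadingPrefs {
      expandAbbreviations: boolean; // user setting for how <abbr>/<acronym> are voiced
    }

    // What the self-voicing UA hands to the synthesizer when the caret moves with
    // a given granularity. `text` holds the current line or paragraph; `caret` is
    // an offset into it.
    function chunkToSpeak(text: string, caret: number, granularity: Granularity): string {
      switch (granularity) {
        case "character":
          return text.charAt(caret);
        case "word": {
          // Very rough word extraction: the whitespace-delimited word at the caret.
          const start = text.lastIndexOf(" ", caret) + 1;
          const end = text.indexOf(" ", caret);
          return text.slice(start, end === -1 ? text.length : end);
        }
        default:
          // "line" and "paragraph": assume `text` already holds the whole unit.
          return text;
      }
    }

    // The UA/screen reader, not the synthesizer, decides how an abbreviation is
    // voiced, based on the user's setting and whatever expansion it knows about.
    function voiceAbbreviation(abbr: string, expansion: string | undefined, prefs: ReadingPrefs): string {
      return prefs.expandAbbreviations && expansion ? expansion : abbr;
    }

    // Example: chunkToSpeak("Sample text for speech", 7, "word") yields "text".

The point of the sketch is that the synthesizer only ever receives the chunk it is told to say; the decisions about chunk size and abbreviation expansion live in the screen reader or self-voicing UA.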
Additional comments:

4. It seems that the current UAAG guidelines are not simple to implement in User Agents with existing speech synthesizer APIs. It might help to have a technical review (state of the art) of existing speech synthesizer APIs and use these as a practical basis for implementations in User Agents (what fields and controls should be exposed).

Response: The guidelines in the current document were developed with input from developers familiar with speech synthesis APIs. The working group has solicited review of the document by screen reader developers and manufacturers. However, we have discussed problems with exception (or acronym and abbreviation expansion) dictionaries. Synthesizers have these dictionaries, and they are unique to each synthesizer. The documentation of these dictionaries is not always readily available. Superimposed on the synthesizer are the screen reader's exception dictionaries (also unique). If the user agent imposes control (on/off or other exceptions), the permutations the average user would have to configure are a bit daunting.

5. Often, one of the missing features in speech synthesizer implementations is the ability to query the state and progress of the speech being synthesized. The User Agent could fulfill this role. Querying what the speech synthesizer is processing, with regard to the UA, yields important feedback for the user, such as: "the speech synthesizer is currently reading div id="sasl", paragraph 5, word number 10; there are still 500 words to read; it has been reading for 15 seconds and the ETA is 2 min; etc."

Response: Again, this seems more a screen reader behavior. Commercially available screen readers have a where-am-I function to give the current speech caret position on the page. They do not, AFAIK, present the more elaborate @id, time-to-end-of-reading based on reading rate, etc. Seems beyond AAA. Recommend non-inclusion in UAAG20.
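For what it's worth, the rate arithmetic in the commenter's example is straightforward for a UA to approximate from text it has queued itself. An illustrative TypeScript sketch (invented names, not a real synthesizer or screen reader API) follows; it does not give the synthesizer-level state the commenter may ultimately want.

    // Invented names for illustration only.
    interface ReadingProgress {
      containerId?: string;        // e.g. the id of the enclosing div, if known
      wordsSpoken: number;
      wordsRemaining: number;
      elapsedSeconds: number;
      estimatedSecondsLeft: number;
    }

    function estimateProgress(
      fullText: string,
      spokenText: string,          // the portion already sent to the synthesizer
      elapsedSeconds: number,
      wordsPerMinute: number,      // the current speech-rate setting
      containerId?: string
    ): ReadingProgress {
      const countWords = (s: string) => (s.match(/\S+/g) ?? []).length;
      const total = countWords(fullText);
      const spoken = Math.min(countWords(spokenText), total);
      const remaining = total - spoken;
      return {
        containerId,
        wordsSpoken: spoken,
        wordsRemaining: remaining,
        elapsedSeconds,
        estimatedSecondsLeft: Math.round((remaining / wordsPerMinute) * 60),
      };
    }

    // Example: 500 words remaining at 250 words per minute gives an estimate of
    // 120 seconds, i.e. the "ETA is 2 min" in the comment above.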
References:
1. http://webanywhere.cs.washington.edu/wa.php

Jim Allan, Accessibility Coordinator & Webmaster
Texas School for the Blind and Visually Impaired
1100 W. 45th St., Austin, Texas 78756
voice: 512.206.9315  fax: 512.206.9264
http://www.tsbvi.edu/
"We shape our tools and thereafter our tools shape us." McLuhan, 1964

Received on Thursday, 4 February 2010 05:03:45 UTC