- From: Al Gilman <asgilman@access.digex.net>
- Date: Tue, 26 Aug 1997 09:33:58 -0400 (EDT)
- To: w3c-wai-wg@w3.org (WAI Working Group)
to follow up on what Geoff Freed said:

> Al's correct. I should have qualified the "automatically."
> However, is it necessary for there to be a spoken description of
> a sound effect caption? I'll illustrate with yet another
> parallel to broadcast captioning and audio description: if a
> program contains both closed captions and audio descriptions (as
> some do on PBS and home video), the description track does not
> reflect the fact that captions are being displayed on the
> television screen. In other words, the captions are not read as
> part of the descriptions. Conversely, the audio descriptions are
> not reflected in the closed captions.
>
> Now, apply this to the Web. If I were a deaf Web user, sound
> effects would be described to me visually, using a caption (or
> something like it). And if I were a blind user, I wouldn't need
> a sound effect described aurally to me because I could already
> hear it. Thus, you'd only need a sound effect *caption*, not a
> description. That would eliminate the problem Al describes
> below. Yes?
>
> [referring to...]
>
> > For those using synthetic speech to access text, there are
> > potential problems when the sound effect, and/or the spoken text
> > of a description of the sound effect, collides (in the audio
> > delivered to the user) with the presentation of spoken text
> > extracted from the page.

Can't say as how I anticipated a _description of the caption_. What I was talking about was that when there is a description of the sound effect, as there is sometimes a description of an image, some users would want to use text-to-speech to read the description of the effect.
Just as visually impaired (but not blind) users may wish to access both an image and its description, I suspect that users with an auditory impairment who are not deaf will be in a grey zone: they would want to access sounds and/or descriptions of sounds with a sound-by-sound navigation choice, as opposed to a long-standing preference choice. Those who are at the same time blind would, I suppose, access the text description by text-to-speech techniques.

A perhaps more important consideration is that on the Web, as opposed to transcribed broadcast content, there are sounds attached to text and graphics which are programmed to play asynchronously on mouse events, such as when the mouse cursor enters the graphic region used to present certain text. This is where I see the major source of destructive interference between programmed sounds and the sounds created by the text-to-speech transcription process. First, the mouse point and the reading point are only loosely connected, and things happening automatically on mouse motion without a button press may be confusing. Second, even if the sound effect does apply to the text currently being read, it may obscure the audibility of the synthesized speech. The overlaying of these two sounds is not what the author designed, and the user will need to be able to fix it when it interferes.

I think that is where Geoff was agreeing with me that on-event sounds should be convertible to on-selection sounds if the user needs this additional control to keep the sound effects from trampling on the reading process.

More generally, the notion of mouse events is too physical, too tied to the unique characteristics of the GUI, to be universal HTML that ports gracefully into non-visual browse modes. In a non-visual browse, text has no layout coordinates bound to it. It is just text that falls within some part of the document pursuant to the document structure encoded in the markup.
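To make the on-event-to-on-selection idea concrete, here is a minimal, purely hypothetical sketch of how a user agent might remap an author's automatic sound trigger under a user preference. None of these names (bindSound, soundsOnSelectionOnly, the stand-in element) come from any real API; they are illustrative only.

```javascript
// Hypothetical sketch: remapping an automatic on-event sound to an
// on-selection sound under a user preference. All names are invented
// for illustration, not drawn from any real browser API.

function bindSound(element, playSound, prefs) {
  // If the user has asked for manual control, an effect the author
  // attached to mere pointer motion ("mouseover") is instead bound
  // to an explicit selection ("click").
  const trigger = prefs.soundsOnSelectionOnly ? "click" : "mouseover";
  element.handlers[trigger] = playSound;
  return trigger;
}

// Minimal stand-in for a document element, so the sketch runs anywhere.
const el = { handlers: {} };
let played = 0;

const t = bindSound(el, () => { played++; }, { soundsOnSelectionOnly: true });

// The sound now fires only on an explicit selection, so it cannot
// collide with speech synthesized while the reading point merely
// passes near the element.
el.handlers["click"]();
```

The point of the remapping is exactly the one argued above: selection is a deliberate act the user controls, so the sound effect can no longer trample on the speech stream uninvited.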
The cursor location, as a point in a graphic canvas, doesn't exist. So events detected by monitoring the cursor location don't exist, and the control conditions for starting the sound are not defined. -- Al Gilman
Received on Tuesday, 26 August 1997 09:34:02 UTC