- From: Al Gilman <asgilman@access.digex.net>
- Date: Wed, 27 Aug 1997 09:41:56 -0400 (EDT)
- To: w3c-wai-wg@w3.org (WAI Working Group)
to follow up on what David Pawson said:

[David Pawson]
> Are we approaching a 'channelling' effect? The impact of personal
> choice would leave a user instructing the browser to selectively
> action visual and auditory output,

[Al Gilman]
Yes, you have got the basic idea very well. User control over how
streams of information from the source get directed to the sensory
channels of the user.

Text-to-speech gives us some crossover capability. The web page
author thinks of text as destined for the user's eyes, but the
eyes-free user redirects the text to his/her ears. If there is
already an audio track targeted at the ears, there is contention. In
the case of a movie description done at NCAM, you can mix it with the
sound track because it is synchronously designed and edited to be
overlaid in that way. On the web, the sound effects are designed
asynchronously, the collisions are less benign, and the user will
have to exercise more choices about whether to mix the sound streams
or break them apart, muting one or another of them at times.

[Snip]

[David Pawson]
> Channels would need to be defined for
>    Primary output visual
>    Primary output audio
>    [One of these may be defined as my preferred prime channel]
>    Secondary output visual
>    Secondary output audio

[Al Gilman]
Because we have some ability to shift content between user-sensation
channels in the user equipment, the content providers don't have to
provide separate data for every profile of user capability and
preference that will be served. There is not an end-to-end set of
parallel channels. The information flow has some redirection and
mixing capability on the user side. A combination of some redundancy
in the data bundle offered by the source (such as transcripts and
captions, which shadow sound in text), together with user control
over how the source-provided streams or components are presented,
gives us the maximum adaptability for the minimum cost.

[David Pawson]
> If we wanted to get exotic, the presence of a secondary channel
> output could lead to an event to which I may wish to respond by
> halting the main channel output to listen to, or look at, the
> secondary channel?

[Al Gilman]
Yes, I was imagining something that exotic. Consider a slide-show
presentation with a continuous audio track and a sequence of still
images. One can imagine the blind user playing the audio track in
near real time, skimming along with just the titles of the slides
automatically spliced into the audio at the points where the slides
change. Then, when the voice track doesn't convey a complete story,
the user could stop the playback, reset the play mode, and have the
text on the slides read aloud, and possibly an audio or textual
description of each slide, before proceeding with each frame's worth
of the sound track.

To be realistic, I think we have to talk about the audio, visual, and
tactile channels by which the information finally reaches the user a
little separately from the media types that carry the information
from the Web server to the Web client. HTML text with CSS styling is
a media type, living in the HTTP dialog, that has the dual capability
of being presented in sight or sound. With an ACSS style in the
library, the sound can be even better. But other content, like GIF
files, is not that flexible. For these we have to build in separate
data [the description] to make the message accessible in sound.
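To make that concrete, here is a rough sketch of what I have in mind
(the file names, the class name, and the "[D]" description link are
invented for illustration; the aural properties are the ones in the
ACSS draft):

   In the HTML, the caption text is dual-capable, while the GIF
   carries its spoken-channel data as separate pieces, the ALT text
   and a linked long description:

      <P CLASS="caption">The committee at work</P>
      <IMG SRC="committee.gif" ALT="Photo: the committee at work">
      <A HREF="committee-desc.html">[D]</A>

   In the visual style sheet (CSS1), the same text is styled for the
   eye:

      .caption { font-style: italic }

   In the aural style sheet (ACSS), it is restyled for the ear rather
   than re-authored:

      .caption { voice-family: female; stress: 60; richness: 80 }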
Sometimes alternate presentations of portions of the information will
be prepared at the source, and sometimes they will be generated at
the user's end. The author will not, in general, have thought through
all the combinations and conflicts that can arise, so the system has
to reserve some control to the user.

-- Al Gilman