Re: collated text definition from Ian Jacobs on 2000-03-08 (w3c-wai-ua@w3.org from January to March 2000)

From: Ian Jacobs <ij@w3.org>
Date: Wed, 08 Mar 2000 18:10:35 -0500
To: Marja-Riitta Koivunen <marja@w3.org>
CC: w3c-wai-ua@w3.org, ehansen@ets.org
Message-ID: <38C6DDEB.ED1E478@w3.org>

Marja-Riitta Koivunen wrote:
> 
> Below is the definition of collated text from the WCAG; it does not say
> anything about synchronization. I always thought it was just text without
> timestamps that you can read separately at your own pace to get an idea of
> the content of the video. To my understanding, synchronization makes
> sense only if you can't see the video (because the additional text explains
> what happens in the video). But if you can't see the video, you can't see
> the text either, so you eventually want it as audio (or maybe braille?),
> which makes the collated text effectively an audio description combined
> with the original audio of the presentation. And that is what the SMIL
> presentations I looked at usually had: synchronized captions and audio
> description that could be combined with the original video according to
> user preferences.
> 
> Did I misunderstand something?

One option in addition to Braille is that the full collated
text transcript may be used as the basis of an auditory equivalent.
You may argue that the synchronization cues would have to be modified
and we're back to that old discussion.

Eric, do you have any insight?

 - Ian
    
> WCAG definition:
> A text transcript is a text equivalent of audio information that includes
> spoken words and non-spoken sounds such as sound effects. A caption is a
> text transcript for the audio track of a video presentation that is
> synchronized with the video and audio tracks. Captions are generally
> rendered visually by being superimposed over the video, which benefits
> people who are deaf and hard-of-hearing, and anyone who cannot hear the
> audio (e.g., when in a crowded room). A collated text transcript combines
> (collates) captions with text descriptions of video information
> (descriptions of the actions, body language, graphics, and scene changes of
> the video track). These text equivalents make presentations accessible to
> people who are deaf-blind and to people who cannot play movies, animations,
> etc. It also makes the information available to search engines.
> One example of a non-text equivalent is an auditory description of the key
> visual elements of a presentation. The description is either a prerecorded
> human voice or a synthesized voice (recorded or generated on the fly). The
> auditory description is synchronized with the audio track of the
> presentation, usually during natural pauses in the audio track. Auditory
> descriptions include information about actions, body language, graphics,
> and scene changes.
> 
> Marja

-- 
Ian Jacobs (jacobs@w3.org)   http://www.w3.org/People/Jacobs
Cell:                        +1 917 450-8783

Received on Wednesday, 8 March 2000 18:10:47 UTC