collated text definition

From: Marja-Riitta Koivunen <marja@w3.org>
Date: Thu, 02 Mar 2000 16:13:20 -0500
To: w3c-wai-ua@w3.org
Below is the WCAG definition of collated text; it says nothing about
synchronization. I always thought it was just text without timestamps that
can be read separately, at your own pace, to get an idea of the content of
the video. As I understand it, synchronization makes sense only if you
cannot see the video (because the additional text explains what happens in
the video). But if you cannot see the video, you cannot see the text
either, so you eventually want it as audio (or maybe braille?), which
effectively turns the collated text into an audio description combined with
the original audio of the presentation. And that is what the SMIL
presentations I looked at usually had: synchronized captions and audio
description that could be combined with the original video according to
user preferences.

Did I misunderstand something?

WCAG definition:
A text transcript is a text equivalent of audio information that includes
spoken words and non-spoken sounds such as sound effects. A caption is a
text transcript for the audio track of a video presentation that is
synchronized with the video and audio tracks. Captions are generally
rendered visually by being superimposed over the video, which benefits
people who are deaf and hard-of-hearing, and anyone who cannot hear the
audio (e.g., when in a crowded room). A collated text transcript combines
(collates) captions with text descriptions of video information
(descriptions of the actions, body language, graphics, and scene changes of
the video track). These text equivalents make presentations accessible to
people who are deaf-blind and to people who cannot play movies, animations,
etc. It also makes the information available to search engines. 
One example of a non-text equivalent is an auditory description of the key
visual elements of a presentation. The description is either a prerecorded
human voice or a synthesized voice (recorded or generated on the fly). The
auditory description is synchronized with the audio track of the
presentation, usually during natural pauses in the audio track. Auditory
descriptions include information about actions, body language, graphics,
and scene changes. 
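
As a rough illustration of the collation step described in the definition
(not something WCAG itself specifies), the sketch below merges captions and
text descriptions of the video into one reading-order transcript. The Cue
type, the collate function, the keep_times flag, and the use of start times
in seconds are all assumptions made for this example. Start times are used
only to order the entries; by default they are dropped from the output,
which matches the "text without timestamps" reading above.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    start: float  # seconds from the start of the presentation (assumed unit)
    kind: str     # "caption" or "description" (hypothetical labels)
    text: str


def collate(captions, descriptions, keep_times=False):
    """Merge captions and video descriptions into one text transcript."""
    # Order all cues by start time; sorted() is stable, so a caption and a
    # description with the same start time keep captions-first order.
    cues = sorted(captions + descriptions, key=lambda cue: cue.start)
    lines = []
    for cue in cues:
        # Bracket descriptions so they can be told apart from spoken text.
        body = f"[{cue.text}]" if cue.kind == "description" else cue.text
        lines.append(f"{cue.start:06.1f} {body}" if keep_times else body)
    return "\n".join(lines)


captions = [
    Cue(2.0, "caption", "Hello."),
    Cue(5.0, "caption", "(door slams)"),
]
descriptions = [
    Cue(0.0, "description", "A woman enters the room."),
]
print(collate(captions, descriptions))
```

Passing keep_times=True keeps a start time on each line, which would be the
variant to use if the collated text is meant to be synchronized with the
presentation.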

Received on Thursday, 2 March 2000 16:17:52 UTC
