- From: Charles McCathieNevile <charles@w3.org>
- Date: Wed, 20 Oct 1999 18:37:59 -0400 (EDT)
- To: eric hansen <ehansen@ets.org>
- cc: w3c-wai-au@w3.org, jutta.treviranus@utoronto.ca
Thanks Eric.
I will incorporate changes from this and and some more from your last memo
into the draft tomorrow.
regards
Charles McCN
On Wed, 20 Oct 1999, eric hansen wrote:
Changes in Terms and Definitions
I think that a few changes are needed to make definitions more consistent
with WCAG 1.0. This message is a follow-up to my earlier message
(http://lists.w3.org/Archives/Public/w3c-wai-au/1999OctDec/0043.html)
"//Issue G-19: There is a cluster of terms that seems to be defined
imprecisely or are used and not defined. Among the defined terms that need
refinement are: accessibility information, alternative presentations,
alternative information, accessible, accessibility. Among the undefined
terms are: alternative representations, equivalent information, textual
alternatives, audio descriptions, should be auditory descriptions,
alternative information, accessibility information, descriptive text
equivalents, text equivalent, caption, auditory description, collated text
transcript, transcript, refreshable braille display(?), DTD,. Having so
many terms with partly (or completely) overlapping meanings causes lack of
clarity and usability of the document. These meanings need to be added, if
necessary, and then clarified. Whenever possible this document should use
terms already defined in the WCAG document.//"
I don't think that all these issues cited in that memo have been fully
addressed, but on my latest review, the following two stood out.
1. Video Captions
Problem:
The definition of "video captions" in the Terms section does not seem
accurate. Its emphasis on describing "action" as well as "dialog" seems
quite inaccurate. I have changed the text to correct those problems. I have
also inserted it into the definition of "Alternative Presentations and
Alternative Information" since the other, parallel terms ("collated text
transcript", "auditory description", etc.) do not seem to be broken out
separately.
Old (10/14/99):
Video Captions
A video caption is a textual message that is stored in the text track of a
video file. The video caption describes the action and dialog for the scene
in which it is displayed.
Proposed:
"Captions" are essential text equivalents for movie audio. Captions consist
of a text transcript of the audio track of the movie (or other video
presentation) that is synchronized with the video and audio tracks.
Captions are generally displayed visually and benefit people who can see
but are deaf, hard-of-hearing, or cannot hear the audio (e.g., in a noisy
room).
Note 1: For comparison, see the following from the definition of
"equivalent" in WCAG 1.0.: "A caption is a text transcript for the audio
track of a video presentation that is synchronized with the video and audio
tracks. Captions are generally rendered visually by being superimposed over
the video, which benefits people who are deaf and hard-of-hearing, and
anyone who cannot hear the audio (e.g., when in a crowded room)."
Note 2: If the individual definitions were broken out, it might be
something like the following: "Captions"[:]"Captions consist of a text
transcript of the auditory track of a video presentation (e.g., movie) that
is synchronized with the visual and auditory tracks. Captions are generally
displayed visually and benefit people who can see but are deaf,
hard-of-hearing, or cannot hear the audio (e.g., in a noisy room)."
====
2. Alternative Presentations and Alternative Information
Problem:
The definition of "Alternative Presentations and Alternative Information"
should be clarified to make it more consistent with the WCAG 1.0 document.
The phrase "captions of the audio" seems misleading since captions are
relevant for the auditory track of a movie but not for ordinary audio clips,
for which the term "text transcript" is used. The apparent reference to
human population growth must be fictional; I know of no geographical region
that has had such growth. Hence, I provided a different example that I
think might be more feasible. The reference to "textual representations of
each of these" is inaccurate since some of the examples are already
"textual representations."
Old (10/14/99):
Alternative Presentations and Alternative Information
Certain types of content may not be accessible to all users (e.g., images
or audio presentations), so alternative representations are used, such as
transcripts for audio, or short functionally equivalent text (e.g., "site
map link") and/or descriptive text equivalents (e.g., "Graph 2.5 shows that
the population has doubled approximately every ten years for the last fifty
years, increasing from about 10 million to 330 million in that time"). An
object may have several alternative representations, for example a video,
captions of the audio, audio description of the video, a series of still
images, and textual representations of each of these.
Proposed:
The following version brings the material into closer conformity to the
WCAG 1.0 document.
Alternative Presentations and Alternative Information
Certain types of content may not be accessible to all users, so alternative
representations are used. For example, "text equivalents" are provided for
all non-text elements because text can be flexibly output in ways that are
usable to diverse populations. Specifically, text can be output
_auditorially_ via synthesized speech for individuals who have visual or
learning disabilities, _tactually_ via braille for individuals who are
blind, or _visually_ via visually-displayed text for individuals who are
deaf or are non-disabled. Text equivalents for _still images_ can be short
("Site Map Link") or long (e.g., "Figure 4 shows that the population of
bacteria doubled approximately every twenty hours over the first one
hundred hours, increasing from about 1000 per milliliter to about 32,000
per milliliter."). Text equivalents for _audio clips_ are called "text
transcripts." "Captions" are essential text equivalents for movie audio.
Captions consist of a text transcript of the audio track of the movie (or
other video presentation) that is synchronized with the video and audio
tracks. Captions are generally displayed visually and benefit people who
can see but are deaf, hard-of-hearing, or cannot hear the audio (e.g., in a
noisy room). Another essential text equivalent for a movie is a "collated
text transcript," which combines (collates) caption text with text
descriptions of video information (descriptions of the actions, body
language, graphics, and scene changes of the video track). Collated text
transcripts are essential for individuals who are deaf-blind and rely on
braille for access to movies and other content. An essential non-text
equivalent for movies is "auditory description" of the key _visual_
elements of a presentation. The description is either a prerecorded human
voice or a synthesized voice (recorded or generated on the fly). The
auditory description is synchronized with the audio track of the
presentation, usually during natural pauses in the audio track. Auditory
descriptions include information about actions, body language, graphics,
and scene changes.
====
Appendix: Definition of "Equivalent" From WCAG 1.0
Equivalent Content is "equivalent" to other content when both fulfill
essentially the same function or purpose upon presentation to the user. In
the context of this document, the equivalent must fulfill essentially the
same function for the person with a disability (at least insofar as is
feasible, given the nature of the disability and the state of technology),
as the primary content does for the person without any disability. For
example, the text "The Full Moon" might convey the same information as an
image of a full moon when presented to users. Note that equivalent
information focuses on fulfilling the same function. If the image is part
of a link and understanding the image is crucial to guessing the link
target, an equivalent must also give users an idea of the link target.
Providing equivalent information for inaccessible content is one of the
primary ways authors can make their documents accessible to people with
disabilities. As part of fulfilling the same function of content an
equivalent may involve a description of that content (i.e., what the
content looks like or sounds like). For example, in order for users to
understand the information conveyed by a complex chart, authors should
describe the visual information in the chart. Since text content can be
presented to the user as synthesized speech, braille, and
visually-displayed text, these guidelines require text equivalents for
graphic and audio information. Text equivalents must be written so that
they convey all essential content. Non-text equivalents (e.g., an auditory
description of a visual presentation, a video of a person telling a story
using sign language as an equivalent for a written story, etc.) also
improve accessibility for people who cannot access visual information or
written text, including many individuals with blindness, cognitive
disabilities, learning disabilities, and deafness. Equivalent information
may be provided in a number of ways, including through attributes (e.g., a
text value for the "alt" attribute in HTML and SMIL), as part of element
content (e.g., the OBJECT in HTML), as part of the document's prose, or via
a linked document (e.g., designated by the "longdesc" attribute in HTML or
a description link). Depending on the complexity of the equivalent, it may
be necessary to combine techniques (e.g., use "alt" for an abbreviated
equivalent, useful to familiar readers, in addition to "longdesc" for a
link to more complete information, useful to first-time readers). The
details of how and when to provide equivalent information are part of the
Techniques Document ([TECHNIQUES]). A text transcript is a text equivalent
of audio information that includes spoken words and non-spoken sounds such
as sound effects. A caption is a text transcript for the audio track of a
video presentation that is synchronized with the video and audio tracks.
Captions are generally rendered visually by being superimposed over the
video, which benefits people who are deaf and hard-of-hearing, and anyone
who cannot hear the audio (e.g., when in a crowded room). A collated text
transcript combines (collates) captions with text descriptions of video
information (descriptions of the actions, body language, graphics, and
scene changes of the video track). These text equivalents make
presentations accessible to people who are deaf-blind and to people who
cannot play movies, animations, etc. It also makes the information
available to search engines. One example of a non-text equivalent is an
auditory description of the key visual elements of a presentation. The
description is either a prerecorded human voice or a synthesized voice
(recorded or generated on the fly). The auditory description is
synchronized with the audio track of the presentation, usually during
natural pauses in the audio track. Auditory descriptions include
information about actions, body language, graphics, and scene changes.
=============================
Eric G. Hansen, Ph.D.
Development Scientist
Educational Testing Service
ETS 12-R
Rosedale Road
Princeton, NJ 08541
(W) 609-734-5615
(Fax) 609-734-1090
E-mail: ehansen@ets.org
--Charles McCathieNevile mailto:charles@w3.org
phone: +1 617 258 0992 http://www.w3.org/People/Charles
W3C Web Accessibility Initiative http://www.w3.org/WAI
MIT/LCS - 545 Technology sq., Cambridge MA, 02139, USA
Received on Wednesday, 20 October 1999 18:38:03 UTC