Changes in Terms and Definitions from eric hansen on 1999-10-20 (w3c-wai-au@w3.org from October to December 1999)

From: eric hansen <ehansen@ets.org>
Date: Wed, 20 Oct 1999 16:00:39 -0400 (EDT)
To: w3c-wai-au@w3.org
Cc: charles@w3.org, jutta.treviranus@utoronto.ca
Message-id: <vines.Bh0E+bxV1sA@cips06.ets.org>
Changes in Terms and Definitions

I think that a few changes are needed to make definitions more consistent 
with WCAG 1.0. This message is a follow-up to my earlier message 
(http://lists.w3.org/Archives/Public/w3c-wai-au/1999OctDec/0043.html)

"//Issue G-19: There is a cluster of terms that seems to be defined 
imprecisely or are used and not defined. Among the defined terms that need 
refinement are: accessibility information, alternative presentations, 
alternative information, accessible, accessibility. Among the undefined 
terms are: alternative representations, equivalent information, textual 
alternatives, audio descriptions, should be auditory descriptions, 
alternative information, accessibility information, descriptive text 
equivalents, text equivalent, caption, auditory description, collated text 
transcript, transcript, refreshable braille display(?), DTD,. Having so 
many terms with partly (or completely) overlapping meanings causes lack of 
clarity and usability of the document. These meanings need to be added, if 
necessary, and then clarified. Whenever possible this document should use 
terms already defined in the WCAG document.//"

I don't think that all these issues cited in that memo have been fully 
addressed, but on my latest review, the following two stood out.

1. Video Captions

Problem:

The definition of "video captions" in the Terms section does not seem 
accurate. Its emphasis on describing "action" as well as "dialog" seems 
quite inaccurate. I have changed the text to correct those problems. I have 
also inserted it into the definition of "Alternative Presentations and 
Alternative Information" since the other, parallel terms ("collated text 
transcript", "auditory description", etc.) do not seem to be broken out 
separately.

Old (10/14/99):

Video Captions
A video caption is a textual message that is stored in the text track of a 
video file. The video caption describes the action and dialog for the scene 
in which it is displayed.

Proposed:
 
"Captions" are essential text equivalents for movie audio. Captions consist 
of a text transcript of the audio track of the movie (or other video 
presentation) that is synchronized with the video and audio tracks. 
Captions are generally displayed visually and benefit people who can see 
but are deaf, hard-of-hearing, or cannot hear the audio (e.g., in a noisy 
room).

Note 1: For comparison, see the following from the definition of 
"equivalent" in WCAG 1.0.: "A caption is a text transcript for the audio 
track of a video presentation that is synchronized with the video and audio 
tracks. Captions are generally rendered visually by being superimposed over 
the video, which benefits people who are deaf and hard-of-hearing, and 
anyone who cannot hear the audio (e.g., when in a crowded room)."

Note 2: If the individual definitions were broken out, it might be 
something like the following: "Captions"[:]"Captions consist of a text 
transcript of the auditory track of a video presentation (e.g., movie) that 
is synchronized with the visual and auditory tracks. Captions are generally 
displayed visually and benefit people who can see but are deaf, 
hard-of-hearing, or cannot hear the audio (e.g., in a noisy room)."

====

2. Alternative Presentations and Alternative Information

Problem:

The definition of  "Alternative Presentations and Alternative Information" 
should be clarified to make it more consistent with the WCAG 1.0 document. 
The phrase "captions of the audio" seems misleading since captions are 
relevant for the auditory track of a movie but not for ordinary audio clips,
 for which the term "text transcript" is used. The apparent reference to 
human population growth must be fictional; I know of no geographical region 
that has had such growth. Hence, I provided a different example that I 
think might be more feasible. The reference to "textual representations of 
each of these" is inaccurate since some of the examples are already 
"textual representations."

Old (10/14/99):

Alternative Presentations and Alternative Information
Certain types of content may not be accessible to all users (e.g., images 
or audio presentations), so alternative representations are used, such as 
transcripts for audio, or short functionally equivalent text (e.g., "site 
map link") and/or descriptive text equivalents (e.g., "Graph 2.5 shows that 
the population has doubled approximately every ten years for the last fifty 
years, increasing from about 10 million to 330 million in that time"). An 
object may have several alternative representations, for example a video, 
captions of the audio, audio description of the video, a series of still 
images, and textual representations of each of these.

Proposed:

The following version brings the material into closer conformity to the 
WCAG 1.0 document.

Alternative Presentations and Alternative Information
Certain types of content may not be accessible to all users, so alternative 
representations are used. For example, "text equivalents" are provided for 
all non-text elements because text can be flexibly output in ways that are 
usable to diverse populations. Specifically, text can be output 
_auditorially_ via synthesized speech for individuals who have visual or 
learning disabilities, _tactually_ via braille for individuals who are 
blind, or _visually_ via visually-displayed text for individuals who are 
deaf or are non-disabled. Text equivalents for _still images_ can be short 
("Site Map Link") or long (e.g., "Figure 4 shows that the population of 
bacteria doubled approximately every twenty hours over the first one 
hundred hours, increasing from about 1000 per milliliter to about 32,000 
per milliliter."). Text equivalents for _audio clips_ are called "text 
transcripts." "Captions" are essential text equivalents for movie audio. 
Captions consist of a text transcript of the audio track of the movie (or 
other video presentation) that is synchronized with the video and audio 
tracks. Captions are generally displayed visually and benefit people who 
can see but are deaf, hard-of-hearing, or cannot hear the audio (e.g., in a 
noisy room). Another essential text equivalent for a movie is a "collated 
text transcript," which combines (collates) caption text with text 
descriptions of video information (descriptions of the actions, body 
language, graphics, and scene changes of the video track). Collated text 
transcripts are essential for individuals who are deaf-blind and rely on 
braille for access to movies and other content. An essential non-text 
equivalent for movies is "auditory description" of the key _visual_ 
elements of a presentation. The description is either a prerecorded human 
voice or a synthesized voice (recorded or generated on the fly). The 
auditory description is synchronized with the audio track of the 
presentation, usually during natural pauses in the audio track. Auditory 
descriptions include information about actions, body language, graphics, 
and scene changes.

====
Appendix: Definition of "Equivalent" From WCAG 1.0 

Equivalent Content is "equivalent" to other content when both fulfill 
essentially the same function or purpose upon presentation to the user. In 
the context of this document, the equivalent must fulfill essentially the 
same function for the person with a disability (at least insofar as is 
feasible, given the nature of the disability and the state of technology), 
as the primary content does for the person without any disability. For 
example, the text "The Full Moon" might convey the same information as an 
image of a full moon when presented to users. Note that equivalent 
information focuses on fulfilling the same function. If the image is part 
of a link and understanding the image is crucial to guessing the link 
target, an equivalent must also give users an idea of the link target. 
Providing equivalent information for inaccessible content is one of the 
primary ways authors can make their documents accessible to people with 
disabilities. As part of fulfilling the same function of content an 
equivalent may involve a description of that content (i.e., what the 
content looks like or sounds like). For example, in order for users to 
understand the information conveyed by a complex chart, authors should 
describe the visual information in the chart. Since text content can be 
presented to the user as synthesized speech, braille, and 
visually-displayed text, these guidelines require text equivalents for 
graphic and audio information. Text equivalents must be written so that 
they convey all essential content. Non-text equivalents (e.g., an auditory 
description of a visual presentation, a video of a person telling a story 
using sign language as an equivalent for a written story, etc.) also 
improve accessibility for people who cannot access visual information or 
written text, including many individuals with blindness, cognitive 
disabilities, learning disabilities, and deafness. Equivalent information 
may be provided in a number of ways, including through attributes (e.g., a 
text value for the "alt" attribute in HTML and SMIL), as part of element 
content (e.g., the OBJECT in HTML), as part of the document's prose, or via 
a linked document (e.g., designated by the "longdesc" attribute in HTML or 
a description link). Depending on the complexity of the equivalent, it may 
be necessary to combine techniques (e.g., use "alt" for an abbreviated 
equivalent, useful to familiar readers, in addition to "longdesc" for a 
link to more complete information, useful to first-time readers). The 
details of how and when to provide equivalent information are part of the 
Techniques Document ([TECHNIQUES]). A text transcript is a text equivalent 
of audio information that includes spoken words and non-spoken sounds such 
as sound effects. A caption is a text transcript for the audio track of a 
video presentation that is synchronized with the video and audio tracks. 
Captions are generally rendered visually by being superimposed over the 
video, which benefits people who are deaf and hard-of-hearing, and anyone 
who cannot hear the audio (e.g., when in a crowded room). A collated text 
transcript combines (collates) captions with text descriptions of video 
information (descriptions of the actions, body language, graphics, and 
scene changes of the video track). These text equivalents make 
presentations accessible to people who are deaf-blind and to people who 
cannot play movies, animations, etc. It also makes the information 
available to search engines. One example of a non-text equivalent is an 
auditory description of the key visual elements of a presentation. The 
description is either a prerecorded human voice or a synthesized voice 
(recorded or generated on the fly). The auditory description is 
synchronized with the audio track of the presentation, usually during 
natural pauses in the audio track. Auditory descriptions include 
information about actions, body language, graphics, and scene changes. 


=============================
Eric G. Hansen, Ph.D.
Development Scientist
Educational Testing Service
ETS 12-R
Rosedale Road
Princeton, NJ 08541
(W) 609-734-5615
(Fax) 609-734-1090
E-mail: ehansen@ets.org
Received on Wednesday, 20 October 1999 16:15:56 UTC