- From: eric hansen <ehansen@ets.org>
- Date: Wed, 20 Oct 1999 16:00:39 -0400 (EDT)
- To: w3c-wai-au@w3.org
- Cc: charles@w3.org, jutta.treviranus@utoronto.ca
Changes in Terms and Definitions I think that a few changes are needed to make definitions more consistent with WCAG 1.0. This message is a follow-up to my earlier message (http://lists.w3.org/Archives/Public/w3c-wai-au/1999OctDec/0043.html) "//Issue G-19: There is a cluster of terms that seems to be defined imprecisely or are used and not defined. Among the defined terms that need refinement are: accessibility information, alternative presentations, alternative information, accessible, accessibility. Among the undefined terms are: alternative representations, equivalent information, textual alternatives, audio descriptions, should be auditory descriptions, alternative information, accessibility information, descriptive text equivalents, text equivalent, caption, auditory description, collated text transcript, transcript, refreshable braille display(?), DTD,. Having so many terms with partly (or completely) overlapping meanings causes lack of clarity and usability of the document. These meanings need to be added, if necessary, and then clarified. Whenever possible this document should use terms already defined in the WCAG document.//" I don't think that all these issues cited in that memo have been fully addressed, but on my latest review, the following two stood out. 1. Video Captions Problem: The definition of "video captions" in the Terms section does not seem accurate. Its emphasis on describing "action" as well as "dialog" seems quite inaccurate. I have changed the text to correct those problems. I have also inserted it into the definition of "Alternative Presentations and Alternative Information" since the other, parallel terms ("collated text transcript", "auditory description", etc.) do not seem to be broken out separately. Old (10/14/99): Video Captions A video caption is a textual message that is stored in the text track of a video file. The video caption describes the action and dialog for the scene in which it is displayed. Proposed: "Captions" are essential text equivalents for movie audio. Captions consist of a text transcript of the audio track of the movie (or other video presentation) that is synchronized with the video and audio tracks. Captions are generally displayed visually and benefit people who can see but are deaf, hard-of-hearing, or cannot hear the audio (e.g., in a noisy room). Note 1: For comparison, see the following from the definition of "equivalent" in WCAG 1.0.: "A caption is a text transcript for the audio track of a video presentation that is synchronized with the video and audio tracks. Captions are generally rendered visually by being superimposed over the video, which benefits people who are deaf and hard-of-hearing, and anyone who cannot hear the audio (e.g., when in a crowded room)." Note 2: If the individual definitions were broken out, it might be something like the following: "Captions"[:]"Captions consist of a text transcript of the auditory track of a video presentation (e.g., movie) that is synchronized with the visual and auditory tracks. Captions are generally displayed visually and benefit people who can see but are deaf, hard-of-hearing, or cannot hear the audio (e.g., in a noisy room)." ==== 2. Alternative Presentations and Alternative Information Problem: The definition of "Alternative Presentations and Alternative Information" should be clarified to make it more consistent with the WCAG 1.0 document. The phrase "captions of the audio" seems misleading since captions are relevant for the auditory track of a movie but not for ordinary audio clips, for which the term "text transcript" is used. The apparent reference to human population growth must be fictional; I know of no geographical region that has had such growth. Hence, I provided a different example that I think might be more feasible. The reference to "textual representations of each of these" is inaccurate since some of the examples are already "textual representations." Old (10/14/99): Alternative Presentations and Alternative Information Certain types of content may not be accessible to all users (e.g., images or audio presentations), so alternative representations are used, such as transcripts for audio, or short functionally equivalent text (e.g., "site map link") and/or descriptive text equivalents (e.g., "Graph 2.5 shows that the population has doubled approximately every ten years for the last fifty years, increasing from about 10 million to 330 million in that time"). An object may have several alternative representations, for example a video, captions of the audio, audio description of the video, a series of still images, and textual representations of each of these. Proposed: The following version brings the material into closer conformity to the WCAG 1.0 document. Alternative Presentations and Alternative Information Certain types of content may not be accessible to all users, so alternative representations are used. For example, "text equivalents" are provided for all non-text elements because text can be flexibly output in ways that are usable to diverse populations. Specifically, text can be output _auditorially_ via synthesized speech for individuals who have visual or learning disabilities, _tactually_ via braille for individuals who are blind, or _visually_ via visually-displayed text for individuals who are deaf or are non-disabled. Text equivalents for _still images_ can be short ("Site Map Link") or long (e.g., "Figure 4 shows that the population of bacteria doubled approximately every twenty hours over the first one hundred hours, increasing from about 1000 per milliliter to about 32,000 per milliliter."). Text equivalents for _audio clips_ are called "text transcripts." "Captions" are essential text equivalents for movie audio. Captions consist of a text transcript of the audio track of the movie (or other video presentation) that is synchronized with the video and audio tracks. Captions are generally displayed visually and benefit people who can see but are deaf, hard-of-hearing, or cannot hear the audio (e.g., in a noisy room). Another essential text equivalent for a movie is a "collated text transcript," which combines (collates) caption text with text descriptions of video information (descriptions of the actions, body language, graphics, and scene changes of the video track). Collated text transcripts are essential for individuals who are deaf-blind and rely on braille for access to movies and other content. An essential non-text equivalent for movies is "auditory description" of the key _visual_ elements of a presentation. The description is either a prerecorded human voice or a synthesized voice (recorded or generated on the fly). The auditory description is synchronized with the audio track of the presentation, usually during natural pauses in the audio track. Auditory descriptions include information about actions, body language, graphics, and scene changes. ==== Appendix: Definition of "Equivalent" From WCAG 1.0 Equivalent Content is "equivalent" to other content when both fulfill essentially the same function or purpose upon presentation to the user. In the context of this document, the equivalent must fulfill essentially the same function for the person with a disability (at least insofar as is feasible, given the nature of the disability and the state of technology), as the primary content does for the person without any disability. For example, the text "The Full Moon" might convey the same information as an image of a full moon when presented to users. Note that equivalent information focuses on fulfilling the same function. If the image is part of a link and understanding the image is crucial to guessing the link target, an equivalent must also give users an idea of the link target. Providing equivalent information for inaccessible content is one of the primary ways authors can make their documents accessible to people with disabilities. As part of fulfilling the same function of content an equivalent may involve a description of that content (i.e., what the content looks like or sounds like). For example, in order for users to understand the information conveyed by a complex chart, authors should describe the visual information in the chart. Since text content can be presented to the user as synthesized speech, braille, and visually-displayed text, these guidelines require text equivalents for graphic and audio information. Text equivalents must be written so that they convey all essential content. Non-text equivalents (e.g., an auditory description of a visual presentation, a video of a person telling a story using sign language as an equivalent for a written story, etc.) also improve accessibility for people who cannot access visual information or written text, including many individuals with blindness, cognitive disabilities, learning disabilities, and deafness. Equivalent information may be provided in a number of ways, including through attributes (e.g., a text value for the "alt" attribute in HTML and SMIL), as part of element content (e.g., the OBJECT in HTML), as part of the document's prose, or via a linked document (e.g., designated by the "longdesc" attribute in HTML or a description link). Depending on the complexity of the equivalent, it may be necessary to combine techniques (e.g., use "alt" for an abbreviated equivalent, useful to familiar readers, in addition to "longdesc" for a link to more complete information, useful to first-time readers). The details of how and when to provide equivalent information are part of the Techniques Document ([TECHNIQUES]). A text transcript is a text equivalent of audio information that includes spoken words and non-spoken sounds such as sound effects. A caption is a text transcript for the audio track of a video presentation that is synchronized with the video and audio tracks. Captions are generally rendered visually by being superimposed over the video, which benefits people who are deaf and hard-of-hearing, and anyone who cannot hear the audio (e.g., when in a crowded room). A collated text transcript combines (collates) captions with text descriptions of video information (descriptions of the actions, body language, graphics, and scene changes of the video track). These text equivalents make presentations accessible to people who are deaf-blind and to people who cannot play movies, animations, etc. It also makes the information available to search engines. One example of a non-text equivalent is an auditory description of the key visual elements of a presentation. The description is either a prerecorded human voice or a synthesized voice (recorded or generated on the fly). The auditory description is synchronized with the audio track of the presentation, usually during natural pauses in the audio track. Auditory descriptions include information about actions, body language, graphics, and scene changes. ============================= Eric G. Hansen, Ph.D. Development Scientist Educational Testing Service ETS 12-R Rosedale Road Princeton, NJ 08541 (W) 609-734-5615 (Fax) 609-734-1090 E-mail: ehansen@ets.org
Received on Wednesday, 20 October 1999 16:15:56 UTC