Re: Issues: Part 2 - #16 through #43 - Comments on Comments from ehansen@ets.org on 1999-11-24 (w3c-wai-ua@w3.org from October to December 1999)

From: <ehansen@ets.org>
Date: Wed, 24 Nov 1999 18:33:43 -0500 (EST)
To: w3c-wai-ua@w3.org
Message-id: <vines.Bh0E+LL5DsA@cips06.ets.org>
Ian Jacobs and Madeleine Rothberg had concerns and comments about my Issue 
#16 ("Eliminate the use of  "continuous equivalent track"."

I will try to address a few of these.

> Issue #16. Eliminate the use of  "continuous equivalent track".
> 
> To the best of my knowledge, this is a new term that is not necessary.
> Unless someone can point out why it is necessary, then I suggest that it 
be
> removed. It is used in very few locations.

Ian Jacobs wrote:
"We started using "continuous equivalent" in the SMIL access note [1]. We 
distinguish there between discrete equivalent (like "alt") and one that 
must be synchronized with another track.

[1] http://www.w3.org/TR/SMIL-access/
 
> I will point out just a few of the problems in the definition of
> "continuous equivalent track".
> 
> The old version reads:
> 
> "Continuous Equivalent Track"
> "A continuous equivalent track presents an equivalent alternative to
> another track (generally audio or video) and is synchronized with that
> track. Continuous equivalent tracks convey information about spoken words
> and non-spoken sounds such as sound effects. A continuous text track
> presents closed captions. Captions are generally rendered visually by 
being
> superimposed over a video track, which benefits people who are deaf and
> hard-of-hearing, and anyone who cannot hear the audio (e.g., when in a
> crowded room). A collated text transcript combines (collates) captions 
with
> text descriptions of video information (descriptions of the actions, body
> language, graphics, and scene changes of the video track). These text
> equivalents make presentations accessible to people who are deaf-blind 
and
> to people who cannot play movies, animations, etc."
> "One example of a non-text continuous equivalent track is an auditory
> description of the key visual elements of a presentation. The description
> is either a prerecorded human voice or a synthesized voice (recorded or
> generated on the fly). The auditory description is synchronized with the
> audio track of the presentation, usually during natural pauses in the 
audio
> track. Auditory descriptions include information about actions, body
> language, graphics, and scene changes."
> "A video track that shows sign language is another example of a 
continuous
> equivalent track."
> 
> Some of the problems:
> 
> a. One of the biggest problems is that the reference to collated text
> transcripts is totally out of place because it is not a "continuous
> equivalent track."

IJ: Agreed. Could be deleted.

> b. The above definition cites auditory description as an example of
> "non-text continuous equivalent track". Yet in this context, "non-text"
> seems misleading because it could be derived ("on the fly") from text.

Ian Jacobs wrote:

"But derived from text is an implementation detail. The end product must be 
audio."

Eric Hansen (24 Nov): Ok, though other reservations about continuous 
equivalent track as elaborated below.

> c. A "video track that shows sign language" is cited as an example of a
> continuous equivalent track, but it is not necessarily "equivalent" to
> anything (except itself) since it could be part of the "primary" content.
> It needs to be cited as equivalent of (or to or for) something else. 
Unless
> we want to nudge WCAG to requiring non-text equivalents (other than
> auditory descriptions), then I suggest removing reference to "video of 
sign
> language" in checkpoint 2.5. The Priority 1 rating of this implies that 
it
> must be done, yet there is no WCAG requirement for non-text equivalents 
for
> "primary content". The checkpoints themselves should only mention what we
> are asserting is required.

IJ: I don't agree. I think that the text clearly states that this is
an example of a continuous equivalent track.

MR: I agree with Ian. While it is true that a sign language video might be
primary content, it might also be a continuous equivalent. The same is
true for text: a text track might be primary content or it might be a 
closed
caption track. That is why SMIL has a system attribute for indicating when
a text track is a closed captions track. SMIL does not yet have an 
attribute
for indicating when video is continuous equivalent, such as a sign language
translation, so we are left with the decision of whether we can (priority 
1) require
that UAs allow users to turn on and off tracks that SMIL does not yet allow
to be delineated as continuous equivalents.  This also applies to Eric's 
Issue 17.

On a related note, the new SMIL public working draft [1] does propose a 
system
attribute for audio descriptions, so I will be submitting revised 
techniques that
tell UAs how to use that attribute to provide improved accessibility for 
blind and
visually impaired users. If the UA WG feels we can encourage UA developers 
to
stretch their features past the existing or recommended SMIL features, I 
would
personally be in favor of encouraging implementation of ways to choose
continuous equivalent video, such as sign language, in UAs.
[1] http://www.w3.org/TR/smil-boston/

EH (24 Nov): I still have serious concerns about referring to "continuous 
equivalents" unless the definition can be considerably refined. I 
personally don't think that it is necessary to the document and that if it 
is not well-defined, it could cause confusion. Keep in mind that 
equivalents can be text elements or non-text elements. Does a collated text 
transcript constitute one or a pair of "continuous equivalents"? One could 
argue that it does, but I don't think that was what was intended. Does 
collation count as synchronization? Do continuous equivalents apply to 
multiple written language? How far can a pair of equivalents get out of 
synch with each other and still be counted as "continuous" (I have argued 
that a spec for on the fly auditory descriptions must allow pausesþ)?

Regarding the reference to "sign language videos" as an example of 
continuous equivalent track. (By the way, I do not sign but have directed 
US Dept of Education projects related to sign language videos.) If one 
wants to simply include a reference to sign language videos, it needs to be 
clear they are _equivalent to something else_. For example, instead of 
saying: "A video track that shows sign language is another example of a 
continuous equivalent track.", it is tempting to say "A video track that 
shows sign language translation of a written text is another example of a 
continuous equivalent track.", but is it really? Doesn't the concept of 
continuous equivalent track suggest that the sign language video is 
_synchronized_ with the text. But it is just a blob of written text and is 
really, in this case, not synchronized at all with the sign language video.

In summary, I think that that definition of continuous equivalent as used 
in UAAG (5 Nov) is flawed and that the examples are problematic. I think 
that what you are trying to tackle with the construct of "continuous 
equivalent track" seems to be a combination of: (a) motion versus still 
rendering, (b) static versus time-based media, (c) synchronized versus 
independent tracks, (d) element types (e.g., text elements vs. non-text 
elements), (e) equivalency versus non-equivalency. Until there is greater 
consensus and clarity on the meaning of the term and its usefulness in the 
UAAG document, I strongly recommend eliminating it.

Regarding Madeleine's note:

"On a related note, the new SMIL public working draft [1] does propose a 
system attribute for audio descriptions, so I will be submitting revised 
techniques that tell UAs how to use that attribute to provide improved 
accessibility for blind and visually impaired users. If the UA WG feels we 
can encourage UA developers to stretch their features past the existing or 
recommended SMIL features, I would personally be in favor of encouraging 
implementation of ways to choose continuous equivalent video, such as sign 
language, in UAs. [1] http://www.w3.org/TR/smil-boston/"

EH (24 Nov): I am not sure if Madeleine is referring to prerecorded 
auditory descriptions or on the fly auditory descriptions or both. I think 
it is important that user agents support both.

EH wrote:

> I recommend that we use a term such as "synchronized alternatives" or
> "synchronized alternative tracks", both of which are similar to WCAG
> checkpoint 1.4, which reads: "For any time-based multimedia presentation
> (e.g., a movie or animation), synchronize equivalent alternatives (e.g.,
> captions or auditory descriptions of the visual track) with the
> presentation. [Priority 1]". 

Ian asked: 

"I have no objection to synchronized equivalent. Is there any reason why a 
continuous equivalent would not be synchronized?" 

EH (24 Nov 99): See earlier discussion of continuous equivalents. Please 
delete reference to "continuous equivalents".

====
***
Issue #37. Reconsider the use of the term "visual impairment".
> 
> In our organization, the term is considered insensitive (unfair). Use
> "visual disability". The preferred terms can change, but keeping up with
> the preferred terms is important.

IJ: Should we drop "impairment" everywhere?

MR: my organization does use the term visually impaired, but
if others feel it is out of date WAI could decide to drop it. 

MR: I can't locate the mail where this came up, and I can't remember in
which direction the current document comes down, but braille is not a
language. It is a way of writing English. I believe in the past we have 
talked
about style sheets for printed and refreshable braille as equivalents to 
style
sheets that differ for screen versus ink print versus small screen displays 
 --
 in all cases, still in English. ASL, however, is a natural language of its 
own.

EH (24 Nov 99): Evidently the term "disability" appears preferred for 
general use and appears in legislation such as the Americans with 
Disabilities Act. Nevertheless "impairment" is evidently not forbidden and 
one editor thought that it was OK if preceded with a specific (e.g., 
"hearing impaired").  I think that the term disability brings it into 
closer conformity to WCAG and ATAG. I think it would not hurt to change 
them all and it might help.
====
****
EH: > Issue #21.  Refine definition of "text transcript."
> 
> This definition needs to be clarified and corrected. I think that the
> reference to "continuous equivalent track" is erroneous and misleading
> because a text transcript, as currently defined, does not include
> synchronization information in the sense that captions do.
> 
> Old:
> 
> "Text transcript"
> "A text transcript is a text equivalent of audio information that 
includes
> spoken words and non-spoken sounds such as sound effects. Refer also to
> continuous equivalent track."
> 
> New:
> 
> "Text transcript"
> "In the context of this document, a text transcript is a text equivalent 
of
> an audio information (e.g., audio clip). It provides text for both speech
> and non-speech sounds."

IJ: Why only "in the context of this document"? How about:
"A text transcript is a text equivalent of auditory information (e.g.,
an audio clip). A text transcript documents both speech and non-spoken
sounds such as sound effects."

MR: I think the reason for the phrase "Refer also to continuous equivalent 
track."
was as a contrast, not because the text transcript is a continuous 
equivalent.
Perhaps change to "A text transcript may contain content which is similar
or identical to the content of closed captions, but it is generally 
available in
a single time-independent document rather than as a continuous equivalent."

MR: I can't locate the first mention of the "captions vs closed captions" 
issue.
I'd like to weigh in that I think the term "closed captions" is useful in
distinuishing between information intended to replace audio tracks, and 
typically intended for use by people who are deaf, and any other thing 
called
a caption, such as a photo caption or table caption. Because closed 
captioning
is quite well known, I think it is helpful to continue using that term in 
that
context. And since other types of captions can get confused, I think it is
helpful to always use the extra word to indicate the type of caption being
discussed. For example, when we discuss the usefulness of table captions
for web surfers who are blind, we don't want people saying "Why do blind
people need captions?"

EH (24 Nov 99): Again, the think that the definition of continuous 
equivalent has too many wrinkles to work out to have it included in the 
UAAG. For example, the proposed definition of text equivalent has a few 
problems (""A text transcript may contain content which is similar or 
identical to the content of closed captions, but it is generally available 
in a single time-independent document rather than as a continuous 
equivalent.")

a. "single time-independent document" is not defined, to best of my 
knowledge. Particularly, for example, the term "time-independent" document 
is not clear.
b. Our usage of the term "text transcript" is quite distinct from that of 
"(closed) captions", so to say that it is "similar or identical" does not 
seem accurate.

Its not so much that the idea of "continuous equivalent" cannot be a viable 
term, but I think that it is not needed and we ought not to invent new 
terms when old ones will do.

Regarding the term "captions"þ I can see why you want to say "closed 
captions" instead of just "captions" and it may be that the term "closed 
captions" is the best alternative, though I think that it unfortunate if 
that is the case. The term "closed captions" comes from TV and the name 
implies that it is invisible or hidden. It begs the question of what "open 
captions" are. Evidently one of the distinguishing features of closed 
captions is that they can be turned off (see 
http://www.robson.org/capfaq/overview.html#What). Yet the capability of 
turning something on or off is a separate issue in the WCAG/UAAG conceptual 
framework. I wish there were a better alternative.

Regarding Ian's question ("Why only "in the context of this document"? How 
about: "A text transcript is a text equivalent of auditory information 
(e.g., an audio clip). A text transcript documents both speech and 
non-spoken sounds such as sound effects."), I prefer to say "In the context 
of this document, a text transcriptþ", since I doubt that the world defines 
the term so narrowly. The phrase signals that this is specialized usage. I 
don't think we invented the term. Isn't the term used generally to refer to 
transcripts that are actually closer to "collated text transcripts".
====
> Issue #19. The UAAG document cannot be imported into MS Word 97 SR-1.
> 
> It seems to me that a while ago I didn't have trouble importing the WAI
> guideline documents into MS Word 97. Now the opening of the file fails 
and
> there an indication that the file is corrupted. Can anyone tell me how to
> overcome this problem?

IJ: "Are you importing the HTML? I'd be curious to know how to fix this
so we don't get lots of complaints after publication. (We've had
problems with people printing the PDF of WCAG, for example, and I'd
like to solve that one too)."

EH: Yes, importing ("opening").
====


=============================
Eric G. Hansen, Ph.D.
Development Scientist
Educational Testing Service
ETS 12-R
Rosedale Road
Princeton, NJ 08541
(W) 609-734-5615
(Fax) 609-734-1090
E-mail: ehansen@ets.org
Received on Wednesday, 24 November 1999 18:38:45 UTC