FW: [media] alt technologies for paused video (and using ARIA)

On Thu, May 12, 2011 at 12:35 AM, Silvia Pfeiffer
<silviapfeiffer1@gmail.com> wrote:

> Because if you have a black image sitting there the text alternative
> surely shouldn't say that it's an Apollo launch. That would be the
> wrong description of that image.

Exactly. We're not concerned about letting the user know that there's
a black image there. We're concerned about letting them know that it's
a video about the Apollo launch... which is what sighted users can
easily surmise from the visual presentation and context of that video.

>>> Certainly a short alternative presenting the
>>> content of what the video is would be useful for accessibility for
>>> screen reader users (sighted users can, after all, use the entire
>>> visual context to more likely determine the video's content).
> What is the entire visual context? If there is text given underneath,
> it is also accessible to the blind user, in particular if it is
> referenced through aria-describedby.
> We cannot assume anything about
> the rest of the page when we describe the paused video.

Here's the ambiguity again. Are we describing "the paused video" (the
thing that the user might play) or are we describing the poster frame
(the static image that might, but often doesn't, provide additional
information about the thing the user might play)? Or both? In this
case (a black frame), there is no poster frame content to describe,
but the paused video is still there and I believe it needs a short
alternative.

In one sentence you indicate that we should consider the context, but
the next sentence suggests we should ignore it. I believe context is
vital in determining the alternative for any non-text element. In this
case, if the visual context clearly presents to sighted users that the
video is about the Apollo launch, I would think it important to also
present this to a screen reader user.

> What if only
> the black frame is sitting there and nothing else? Would you still
> describe that as "Apollo launch"?

Yes, if it is presented visually in the surrounding context that the
video is about the Apollo launch.

This becomes even more significant for users that are navigating by
interactive elements. They would likely skip all descriptive text
content and jump directly to the video - which you would have present
no descriptive content until it is played. If there were multiple
videos, a screen reader might read "video, video, video". This would
be somewhat akin to "click here" which makes sense in its visual
context but requires screen reader users to explore the context to
determine what it is. And as noted before, this would bypass the WCAG
SC 1.1.1 requirements for descriptive identification of time-based
media. A short alternative to the video removes all these issues.
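
As a minimal sketch of such a short alternative: this assumes aria-label
is the mechanism used (nothing in this thread has settled on one), and
the file names and label text are hypothetical.

```html
<!-- Hypothetical page fragment; file names and label text are illustrative only. -->
<h2>Apollo 11 launch</h2>

<!-- Without a short alternative, a user navigating by interactive
     elements may hear only "video" for this element. -->
<video src="apollo11-launch.webm" poster="black-frame.png" controls
       aria-label="Apollo 11 launch"></video>
```

The label describes the thing the user might play, not the black poster
frame, which matches what sighted users infer from the heading.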

>>> Now consider that the poster frame (whether author defined, random, or
>>> first frame) is an image of the moon, though the video is primarily
>>> about the Apollo 11 launch. A short alternative of "The moon" (or
>>> similar) would be an appropriate alternative for the poster frame, but
>>> would provide little utility (and, in this case, false information)
>>> about what the content of the video actually is.
> No, it wouldn't. The sighted user doesn't get more information either.

Sure they do. They can see the entire context of the video to
determine what the video is about. Now a screen reader user could read
before or after (which one is a crap shoot), but with much more
effort. Would we ever omit @alt on an image on a page about the Apollo
mission based on the assumption that the screen reader user can figure
out what it is based on its context? Of course not! Then why would we
omit it for a video in the same place?

> You have to always assume there is nothing else on the page when you
> define text alternatives for an element.

I strongly disagree (http://webaim.org/techniques/alttext/#context).
The same non-text element may have very different alternatives
depending on its context. This is likely the crux of this issue.

There's more to a video than what is presented visually when it's not
playing - just like there's often more to alternative text than what
the image looks like.

>>> This then seems to call for up to 5 (yikes!) types of alternative:
>>> 1. Short alternative for the <video>
> That's not necessary, because we have @transcription, track and other
> page text for this (always assuming you mean the playing video here).

@transcription and track are certainly alternatives to the playing
video. But these wouldn't be available until the video is activated.
Again, if page context describes what the video is going to play to
sighted users, this information should also be presented to screen
reader users in a short alternative.

>>> 2. Long alternative for the <video> (if necessary)
> That's what @transcription (off-page) and @aria-describedby (on-page)


Of note is that aria-describedby does not currently (and probably
won't ever) support structured content or interactive elements. As
Steve kindly informed me, it's mapped to the accDescription property
which is a text string. This introduces some limitations for when long
description needs to provide structured content, which lends itself to
@transcription or @longdesc or... something.
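
To illustrate that limitation, here is a sketch with hypothetical ids,
file names, and text. The referenced description contains a list and a
link, but if it is exposed as a single text string, as described above,
the list structure and the operable link would not survive.

```html
<!-- Hypothetical markup; ids, file names, and text are illustrative only. -->
<video src="apollo11-launch.webm" controls
       aria-describedby="apollo-desc"></video>

<div id="apollo-desc">
  <ul>
    <li>Countdown and liftoff</li>
    <li><a href="transcript.html">Full transcript</a></li>
  </ul>
</div>
```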

>>> 3. Short alternative for the poster image (if necessary, when not
>>> identical to #1)
> Yes, that's what my use case number one is and what I suggested
> @aria-label for.

Except that it's currently ambiguous as to what should be described -
the video or the poster frame.

>>> 4. Long alternative for the poster image (if necessary, though I think
>>> this would be somewhat rare)
> That could easily be part of the long alternative for the <video>.

But you said previously that we only need short or long alternatives
for the poster frame, not for the <video>. See why I'm confused? :-)

This really is a great discussion. I agree that the issue is primarily
about terminology and explaining what should be described and how.

Jared Smith

Received on Thursday, 12 May 2011 17:40:14 UTC