RE: [media] alt technologies for paused video (and using ARIA) from John Foliot on 2011-05-18 (public-html-a11y@w3.org from May 2011)

From: John Foliot <jfoliot@stanford.edu>
Date: Wed, 18 May 2011 09:20:02 -0700 (PDT)
To: "'Silvia Pfeiffer'" <silviapfeiffer1@gmail.com>
Cc: "'David Singer'" <singer@apple.com>, "'HTML Accessibility Task Force'" <public-html-a11y@w3.org>, "'James Craig'" <jcraig@apple.com>, "'Michael Cooper'" <cooper@w3.org>
Message-ID: <01c401cc1577$7054f810$50fee830$@edu>
Silvia Pfeiffer wrote:
>
> On Thu, May 12, 2011 at 9:29 AM, John Foliot <jfoliot@stanford.edu>
> wrote:
> > David Singer wrote:
> >>
> >> I think the point is that the poster and the aria-label are both about
> >> the video (they are peers)
> >
> > Exactly.
>
> That's exactly the kind of impression that I am trying to avoid. There
> is no need to provide a label on the video - @id does that perfectly
> well.

I think you may be mistaken here.  <div id="navigation"> does not map to
any accessibility API, which is why we have <div id="navigation"
role="navigation"> and/or now in HTML5, <nav>. So while ID does in fact
label an element in the code (for scripting and styling), it does not map
it to the accessibility layer.  If there is a need to provide a label on
the <video> element, then you can use aria-label, and then the video
element will be mapped (named) to the a11y API: it is not however an
appropriate mechanism for providing textual alternatives, as ARIA first
and foremost is about the accessibility mapping and providing for
interoperability with assistive technologies.
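
A minimal sketch of that distinction follows. The file names are
placeholders assumed purely for illustration, and the "..." stands in for
whatever the containers hold:

```html
<!-- id only: available to scripts and CSS, but carries no role or name
     into the accessibility API -->
<div id="navigation">...</div>

<!-- ARIA landmark role: mapped to the accessibility API as navigation -->
<div id="navigation" role="navigation">...</div>

<!-- HTML5 element carrying the same semantics -->
<nav>...</nav>

<!-- aria-label names the video object for the a11y API; it says nothing
     about what the poster image looks like -->
<video controls poster="poster.jpg"
       aria-label="A Clockwork Orange movie trailer">
  <source src="trailer.mp4">
</video>
```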



> There is also no need to provide an alternative description of
> the video content: @alt is not there to provide a summary of the video
> content. Text alternatives are about providing text alternatives to
> what only a sighted user sees and we are talking about the pause
> situation here, so the video and the poster are identical in this
> situation.

Silvia, I am afraid that your understanding of textual alternatives is
somewhat simplistic. From "HTML5: Techniques for providing useful text
alternatives" (http://www.w3.org/TR/html-alt-techniques/):

"Text alternatives are a primary way of making visual information
accessible, because they can be rendered through any sensory modality (for
example, visual, auditory or tactile) to match the needs of the user.
Providing text alternatives allows the information to be rendered in a
variety of ways by a variety of user agents. For example, a person who
cannot see a picture can have the text alternative read aloud using
synthesized speech.

To determine appropriate text alternatives it is important to think about
why an image is being included in a document. What is its purpose?
Thinking like this will help you to understand what is important about the
image for the page's intended audience. Every image has a reason for being
on a page, because it provides useful information, performs a function, or
enhances aesthetics. Therefore, knowing what the image is for, makes
writing appropriate text alternatives easier.

Examples of scenarios where users benefit from text alternatives for
images
  * They have a very slow connection and are browsing with images
disabled.
  * They have a vision impairment and use text to speech software.
  * They have a cognitive impairment and use text to speech software.
  * They are using a text-only browser.
  * They are listening to the page being read out by a voice Web browser.
  * They have images disabled to save on download costs.

General Text Alternative Good Practices
  * Provide the same informational content as the image.
  * Where an image performs a specific function, such as a graphical link,
provide information about its functionality.
  * Be succinct as possible while still conveying equivalent values. Short
text that describes its purpose or gives an overview will often suffice.
  * Write suitable alt text according to context. The same image in a
different situation may need very different alt text.
  * Avoid redundant alt text. An example of this would be repeating the
same text in your document, as well as in the alt attribute, and is
unnecessary."


>
>
> >>  so it might be better to say
> >>
> >>  <video poster="media/ClockworkOrangetrailer.jpg" controls
> >>         aria-label="A Clockwork Orange movie trailer">
> >>    <source src="media/ClockworkOrangetrailer.mp4">
> >>    <source src="media/ClockworkOrangetrailer.webm">
> >>    <source src="media/ClockworkOrangetrailer.ogv">
> >>  </video>
> >
> > For accessibility API mapping, naming the <video> object as "A Clockwork
> > Orange movie trailer" would be acceptable,
>
> No, it's not acceptable - it provides more information than the
> placeholder frame provides, which is not what text alternatives are
> about.

For mapping the accessible name to the accessibility API with aria-label
then it is correct: that is what aria-label does. Aria-label does not
provide a function to deliver text alternatives to the end user for *any*
element today, and it could be argued, it shouldn't - you are seeking to
mix functions here, which I don't think is a good design decision.

Per the ARIA spec, aria-label and aria-labelledby serve essentially the
same function (with a preference for using aria-labelledby). Applying it to
some code:

	<h2 id="video1">Silvia's Video Example</h2>
	<video aria-labelledby="video1"></video>

...and

	<video aria-label="Silvia's Video Example"></video>

...are mapped identically to the Accessibility API layer.
http://www.w3.org/TR/2010/WD-wai-aria-20100916/states_and_properties#aria-label


>
> > The <video> element can take aria-label today with no change to the
> > specification - re-envisioning aria-label to provide alternative text
> > however is incorrect.
>
> Is that really the case? IIUC, neither aria-label nor alt are
> conforming attributes on media elements and screen readers ignore
> them. A test with ChromeVox confirms this, though I cannot test any
> other screenreader on my mac.

Silvia, with all due respect to Charles Chen and the Google folks,
ChromeVox is not a production class screen-reader today, it is a browser
plug-in (based, I am almost positive, on the Fire Vox plug-in for Firefox
- http://www.firevox.clcworld.net/). However, at this time it is true that
no screen readers will process any of the proposed solutions, as they are
not yet specified in any spec out there - we are creating something new here.

******************

Silvia Pfeiffer wrote:
>
> IMO looking at the "poster" as the background image of the play button
> (as though it was added through CSS) makes a lot of sense, since it is
> an interactive element and the image provides the reason why we should
> press the "play" button.

Herein lies the fundamental flaw in your reasoning: the "poster" is
*significantly* more than just some image set via CSS as a background: it
is a richly detailed graphic that plays to your visual senses and
cognitive emotions, thus providing you with "...the reason why we should
press the "play" button...". Your mental model of what we are dealing with
here is incorrect.

If you return to Leonie Watson's statement, she wrote:

	"When I arrive at a video (with my screen reader), I want to know
what that static image/frame contains. At that moment in time, in the
world according to me and my screen reader, that image exists entirely in
its own right. It might be a still from the video, it might be a separate
image. It might be related content, it might be a completely unrelated
corporate ident (for example).

	Wanting to know what that image contains doesn't prevent me from
wanting to know what the video contains. There may well be overlap, but
equally they could be worlds apart."


> In addition, several screenreaders already
> support reading out the aria-label on the video element.  If text-only
> browsers displayed the aria-label text now as well, that would solve
> this particular use case IIUC.

What about when images are disabled in GUI browsers (to speed download or
to save on the cost of bandwidth)? Silvia, you are now targeting specific
scenarios and behaviors rather than looking at a larger need and delivery
mechanism.

I have tried to underscore the differences between assigning an accessible
name to an element/object (via either aria-label or aria-labelledby) versus
providing alternative text to a visual element: one tells me what it is,
the other tells me what it looks like. Usually, both are required; however,
*if* I am using a dedicated element such as <img>, then the accessible
name (what) is already mapped to the Accessibility API (unless, as in
Victor's case, I want to re-name the element from "video (region)" to
"Yahoo! Video Player", at which point aria-label on the video element will
do that for him). I earlier suggested that aria-label could also be used
to provide an accessible name to the media asset (Clockwork Orange
trailer) which is blurring things a bit (as again it is actually an
attribute of an attribute, but I will stop being pedantic here), but it
should nonetheless not be confused with a textual description of what is
on screen.
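
As a rough sketch of Victor's case (the poster and source file names are
placeholders assumed for illustration, not anything taken from his actual
player):

```html
<!-- Without aria-label, the object is exposed under its generic role -->
<video controls poster="poster.jpg">
  <source src="clip.mp4">
</video>

<!-- aria-label supplies the accessible name: what the object is -->
<video controls poster="poster.jpg" aria-label="Yahoo! Video Player">
  <source src="clip.mp4">
</video>
```

Neither form says anything about what the poster image looks like.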

Saying that you are Silvia Pfeiffer does not describe how you look.

For non-sighted people who have met you personally, saying that it is a
picture of Silvia Pfeiffer will trigger recollections of who you are for
them, but it still does not tell them that you are (for example) blonde.
In some cases (Many? Most?) that information will not be relevant and so
in code <... alt="Photo of Silvia Pfeiffer"> is sufficient; when we need
to also convey the 'blondeness' we could either go <... alt="Photo of the
blonde Silvia Pfeiffer"> or <... alt="Photo of Silvia Pfeiffer"
longdesc="long_description_of_Silvia_including_mention_of_her_hair_color.html">
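
Written out in full on an <img>, the three options might look like the
sketch below. The src value is a placeholder assumed for illustration:

```html
<!-- Short alternative only: enough when hair color is not relevant -->
<img src="silvia.jpg" alt="Photo of Silvia Pfeiffer">

<!-- Short alternative that also conveys the relevant visual detail -->
<img src="silvia.jpg" alt="Photo of the blonde Silvia Pfeiffer">

<!-- Short alternative plus a pointer to a longer description -->
<img src="silvia.jpg" alt="Photo of Silvia Pfeiffer"
     longdesc="long_description_of_Silvia_including_mention_of_her_hair_color.html">
```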

Arguing that both are not needed is missing the fundamental point - in any
scenario you care to expose, we need both of these needs met: an
accessible name and an accessible description (and in the case of the
description, we need a mechanism to provide both a short and long form
description, the need to do so while still supporting internationalization
concerns, and the need to convey both text embedded in the image as well
as any appropriate prose describing key visuals in the image).

Arguing that the "still image" that is present when the movie is not
playing does not also meet these criteria is false and
unsubstantiated: we have multiple voices and users stating otherwise.

JF
Received on Wednesday, 18 May 2011 16:20:31 UTC