RE: [media] alt technologies for paused video (and using ARIA)

From: John Foliot <jfoliot@stanford.edu>
Date: Tue, 10 May 2011 19:56:59 -0700 (PDT)
To: "'HTML Accessibility Task Force'" <public-html-a11y@w3.org>
Cc: "James Craig" <jcraig@apple.com>, "Michael Cooper" <cooper@w3.org>
Message-ID: <027b01cc0f87$18a26800$49e73800$@edu>

Silvia Pfeiffer wrote:
> Hi all,

Hi Silvia, thanks for bringing this topic to the fore. I have copied the
Chairs of the ARIA WG on this response for their info and possible input
concerning ARIA usage.

> Over the last weeks I've been putting together ideas about what
> requirements we have for alt technologies on videos that are either
> paused by default or not displayed because of text-only displays.
> My current state of mind is that we need to solve three use cases:
> 1. a brief description that will give the casual "tab"-passer-by an
> impression as to what the video is about to help them make a
> play/noplay decision
> 2. longer descriptions that give a bit more detail and describe, e.g.
> the poster and give a summary of the content; this is often text
> already available elsewhere on the page
> 3. a possibility to link a full transcription of the video to the
> video and provide it in the context menu

One potential use case not captured here is the case where we have the
'better' structural navigation we've talked about (but not yet spec'd),
such as 'chapters' and/or sub-chapters that users could skip to - each of
those 'chapter points' could/would have a default 'still' that we should
address as well. We discussed this very briefly at the face-to-face in

> I've concretely suggested to introduce the following attributes on
> <video>:
> 1. To satisfy use case 1: @aria-label

  <video poster="media/ClockworkOrangetrailer.jpg" controls
         aria-label="A Clockwork Orange movie poster">
    <source src="media/ClockworkOrangetrailer.mp4">
    <source src="media/ClockworkOrangetrailer.webm">
    <source src="media/ClockworkOrangetrailer.ogv">
  </video>

This is a mistaken use of aria-label: this <video> (object) is not a
poster, it is the entire media offering - a multi-media resource that deaf
users, blind users, and deaf/blind users will consume differently based
upon the additional resources that the author provides.

The ARIA specification defines aria-label this way:
	"Defines a string value that labels the __current element__.
(Emphasis mine - JF) See related aria-labelledby.

The purpose of aria-label is the same as that of aria-labelledby. It
provides the user with a recognizable name of the object. The most common
accessibility API mapping for a label is the accessible name property."

In your code example, the element is <video> and the @poster is an
*attribute* of the <video> element (object). I have tried numerous times
to explain this to the sub-team, with apparently no success: attributes
cannot take on additional attributes; this is simply how the mechanics of
HTML work. When a screen reader (for example) announces aloud "A
Clockwork Orange movie poster" it is labeling something completely
different from the movie; it is inappropriate and confusing to suggest
otherwise, and contrary to what aria-label has been defined to express.
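
To make the distinction concrete, here is a minimal sketch of a label that
names the media resource itself rather than its poster. The label text is
an assumption for illustration only, not wording taken from the proposal:

```html
<!-- Sketch only: the aria-label text below is an assumed example. -->
<video poster="media/ClockworkOrangetrailer.jpg" controls
       aria-label="A Clockwork Orange: movie trailer">
  <source src="media/ClockworkOrangetrailer.mp4">
  <source src="media/ClockworkOrangetrailer.webm">
  <source src="media/ClockworkOrangetrailer.ogv">
</video>
```

With a label along these lines, what gets announced is a name for the
<video> as a whole, which is what aria-label is defined to provide.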

> 2. To satisfy use case 2: @aria-describedby

<video poster="media/ClockworkOrangetrailer.jpg" controls
       aria-describedby="summary more desc"
       aria-label="A Clockwork Orange movie poster">
  <source src="media/ClockworkOrangetrailer.mp4">
  <source src="media/ClockworkOrangetrailer.webm">
  <source src="media/ClockworkOrangetrailer.ogv">
</video>

<p id="summary">
  In future Britain, charismatic delinquent Alex DeLarge is jailed and
  volunteers for an experimental aversion therapy developed by the
  government in an effort to solve society's crime problem... but not all
  goes to plan.
</p>
<ul>
  <li>Director: Stanley Kubrick</li>
  <li>Writers: Stanley Kubrick (screenplay), Anthony Burgess</li>
  <li>Stars: Malcolm McDowell, Patrick Magee and Warren Clarke</li>
  <li id="more"><a href="http://www.imdb.com/title/tt0066921/">Details
  on IMDB</a></li>
</ul>

With regard to aria-describedby="summary more" I agree, this is good usage
of ARIA and meets the needs of the use-case.  I had previously suggested
that aria-describedby could meet this need:

	"(NOTE: At this time, I believe that adding @alt to the video
element is semantically weak and inappropriate: while I believe it is
important if not critical to provide a textual summation of the actual
video asset for accessibility considerations, attributes such as
@aria-labelledby, @aria-describedby, or (@longdesc*) applied to <video>,
or perhaps <summary> as a child element of <video>, would be more accurate
and useful to non-visual users.)"


As for aria-describedby="desc", Silvia, you are being tricked by your eyes
here (sorry).

Perhaps a re-examination of the code, with the poster initially removed
from the mix, will help (assuming a closed system in which Safari is the
only browser available):

<video	<!-- establishes the element -->
	src="media/ClockworkOrangetrailer.mp4"
		<!-- declares an attribute of the element: @src -->
	controls
		<!-- declares an attribute of the element: @controls -->
	aria-describedby="desc"
		<!-- declares an attribute of the element: @aria-describedby -->
>

The question now is, with the video object src (attribute) defined as an
.mp4 file, what is its description? (In other words, what are you
describing via aria-describedby?)

Is it:
 "(Summary:) In future Britain, charismatic delinquent Alex DeLarge is
jailed and volunteers for an experimental aversion therapy developed by
the government in an effort to solve society's crime problem... but not
all goes to plan."?

Or is it:
 "...a movie poster with the film's protagonist, Alex (played by Malcolm
McDowell) brandishing a knife while peering through a cutout of a stylized
"A" or inverted "V". An eyeball appears floating at his wrist. The poster
also reads "Being the adventures of a young man whose principal interests
are rape, ultra-violence and Beethoven", as well as bold psychedelic type
below the image which reads "Stanley Kubrick's Clockwork Orange..."?


Revisiting the same code, but this time with the @poster declaration:

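<video 	<!-- establishes the element -->
	src="media/ClockworkOrangetrailer.mp4"
		<!-- declares an attribute of the element: @src -->
	controls
		<!-- declares an attribute of the element: @controls -->
	aria-describedby="desc"
		<!-- declares an attribute of the element: @aria-describedby -->
	poster="http://www.iff2010.com/images/competitions/Film-can_details.png"
		<!-- declares an attribute of the element: @poster -->
>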

Here again, what are you describing? The same as the previous example?

But what of the imagery[*] at
http://www.iff2010.com/images/competitions/Film-can_details.png? Why would
the description text referenced by aria-describedby change significantly
simply because the author chooses to also include an author-selected
poster image?

[* For the benefit of some readers, I will describe the image: it shows
two film cans, one lying flat, the other standing on its edge, located
just behind the flat can. Both cans are decorated with an image of a
movie camera in the center, and ringed with a series of large black
circles to simulate the look of a movie reel.]

WCAG 2.0 states:
	"Guideline 1.1 Text Alternatives: Provide text alternatives for
any non-text content so that it can be changed into other forms people
need, such as large print, braille, speech, symbols or simpler language."

The .mp4 object is "non-text content".
The .png object is "non-text content".
They are *different* objects - equally related to the <video> element as
attributes, but discrete and unique nonetheless.

This is not a question of whether the image chosen is appropriate or not,
or whether authors should or shouldn't do this, as no matter what we
suggest in authoring guidance, the fact of the matter is that the code
demonstrated would be fully conformant and would render on screen. For
this reason, we are obligated to ensure that both non-text objects have a
means to be textually represented. The Guideline is clear: *any* non-text
content requires text alternatives.

> 3. To satisfy use case 3: a new attribute @transcription

This is interesting.

I am curious to know why you wouldn't consider the <track + @kind> pattern
here, as transcripts are essentially the same as captions minus the
time-stamping information. I am not overly concerned here, but more
curious. Is there an advantage of treating the transcript as a different
type of text file than other text files associated with the <video> element?

As well (at the risk of belaboring a point), an @transcription attribute
on <video> reinforces my assertion that attributes attached to elements
define properties of the element, and not of other sibling attributes.
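
For comparison, here is a rough sketch of what the <track + @kind> pattern
might look like for a transcript. The file names are hypothetical, and
"transcript" is not a defined value of @kind; it appears here only to
illustrate the question, not as a statement of how it should be specified:

```html
<video controls>
  <source src="media/ClockworkOrangetrailer.mp4">
  <!-- Existing pattern: a timed caption file attached via <track>.
       The file name is a hypothetical example. -->
  <track kind="captions" srclang="en" label="English captions"
         src="media/ClockworkOrangetrailer.captions.vtt">
  <!-- Hypothetical: "transcript" is NOT a defined @kind value. -->
  <track kind="transcript" srclang="en" label="Full transcript"
         src="media/ClockworkOrangetrailer.transcript.html">
</video>
```

Either way, the reference to the transcript hangs off the <video> element
(as a child element here, as an attribute in the @transcription proposal).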


> Example 2: video with text

You wrote:
An @aria-label attribute is added with a short description which
captures the core of the displayed video. The server makes sure to serve
text in the language that is in use on the Web page. If that language is
switched, the aria-label text will also switch language.

Screen readers and voice browsers would upon tabbing onto the video
element read out the aria-label text.

  <video poster="media/acessodigital.png" controls
         aria-label="Web accessibility: cost or benefit">
    <source src="media/acessodigital_en.mp4">
    <source src="media/acessodigital_en.webm">
    <source src="media/acessodigital_en.ogv">
  </video>

I have concerns here that you are expecting a server environment to be
able to detect incoming language preferences (this of course can be done,
but is really only present on large international sites), and that somehow
this detection will then re-write the web-page to change the value of the
aria-label attribute.
I know firsthand that here on campus I cannot reasonably expect IT to
provide this kind of language negotiation on the server(s), especially
given the sheer number of decentralized servers on campus. We require (I
believe) an author-based solution that addresses internationalization
issues. Directly indicating 'in the code' changes of language benefits the
majority of screen reader users, as most tools today can change language
profiles on the fly.

WCAG 2.0 states:
	"3.1.2 Language of Parts: The human language of each passage or
phrase in the content can be programmatically determined except for proper
names, technical terms, words of indeterminate language, and words or
phrases that have become part of the vernacular of the immediately
surrounding text. (Level AA)"

...while the associated Techniques for WCAG 2.0 state:
	"H58: Using language attributes to identify changes in the human
language

The objective of this technique is to clearly identify any changes in
language on a page by using the lang or xml:lang attribute, as appropriate
for the HTML or XHTML version you use."

In the example provided, the initial key frame offers text in three
languages, despite the fact that the source language of the document is
clearly (as well as programmatically) indicated as English:
	<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en"

However, every sighted user accessing the page can clearly see that the
text is offered in 3 languages, and so we are obligated to convey the same
information to non-sighted users as well. What *should* be conveyed to
screen reader users is essentially this:

	<span lang="pt-br">Acessibilidade Web: Custo ou Benefício?</span> Web
Accessibility: Cost or Benefit <span lang="es">Accesibilidad Web: Costo o
Beneficio?</span>


...and not 1/3 of that.

Returning to the definition of aria-label and aria-labelledby, the
Recommendation states that this attribute "Defines a string value that
labels the current element."

While I am not 100% certain, I believe that aria attributes that take
'string values' are like @alt, in that they cannot take additional block
level or inline elements (as this would I believe create a nesting error -
something forbidden today for @alt in HTML5), but I will request
clarification from the ARIA experts here.

So while the second example certainly remains true to the Draft "HTML5:
Techniques for providing useful text alternatives"
(http://www.w3.org/TR/html-alt-techniques/#img-of-text) by focusing on the
embedded text in that image, we need to also be sure that we can support
internationalization at the author level by using the @lang or @xml:lang
attributes. *IF* aria-label can support inline <span>s then this may be an
acceptable possibility (although it still does not provide for:
	"The placeholder images shows fingers on a keyboard titled 'Web
accessibility: cost or benefit' in Spanish, English and Portuguese.")
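
One possible direction, sketched here under stated assumptions: since an
attribute value such as aria-label holds only a plain string, the label
text could instead live in an element that carries @lang markup and be
referenced through aria-labelledby. The id "videotitle" is a hypothetical
name, only two of the three languages are shown, and whether assistive
technology preserves the language changes when it computes the name from
the referenced element is not something this sketch can promise:

```html
<!-- Sketch only: the id "videotitle" is a hypothetical name. -->
<p id="videotitle">
  <span lang="pt-br">Acessibilidade Web: Custo ou Benefício?</span>
  <span lang="en">Web Accessibility: Cost or Benefit</span>
</p>

<video poster="media/acessodigital.png" controls
       aria-labelledby="videotitle">
  <source src="media/acessodigital_en.mp4">
  <source src="media/acessodigital_en.webm">
  <source src="media/acessodigital_en.ogv">
</video>
```

This keeps the language markup in author-controlled content rather than
relying on server-side negotiation.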

Meanwhile the code example #3 shows:

<div id="posteralt" style="position:absolute; left:-10000px; width:1px;
height:1px; overflow:hidden;">The placeholder images shows fingers on a
keyboard titled 'Web accessibility: cost or benefit' in Spanish, English and
Portuguese.</div>

This feels very "hacky" - and is extremely reminiscent of discussions
around @longdesc: "hidden content" that may or may not be appropriate for
sighted users, 'discoverability', and other related head-bashing. As well,
it returns to the question of what is being described, the (sic) .mp4 or
the .png.

While David Singer indicated he was not keen on going there, both Silvia
and now I are interested in teasing this out further, but note that this
rests on discussions currently outside of our sub-group surrounding the
re-opening of Issue 31
> I would like to suggest a discussion of this proposal here on list and
> in the next media subgroup meeting.


Received on Wednesday, 11 May 2011 02:57:29 UTC
