RE: [media] alt technologies for paused video (and using ARIA) from John Foliot on 2011-05-11 (public-html-a11y@w3.org from May 2011)

From: John Foliot <jfoliot@stanford.edu>
Date: Wed, 11 May 2011 16:37:54 -0700 (PDT)
To: "'Silvia Pfeiffer'" <silviapfeiffer1@gmail.com>
Cc: "'HTML Accessibility Task Force'" <public-html-a11y@w3.org>, "'James Craig'" <jcraig@apple.com>, "'Michael Cooper'" <cooper@w3.org>, "'Jared Smith'" <jared@webaim.org>
Message-ID: <006b01cc1034$730430a0$590c91e0$@edu>

Silvia Pfeiffer wrote:
> 
> It is the recommended way for making content invisible at
> http://webaim.org/techniques/css/invisiblecontent/ and if anyone
> should know the best way it should be WebAIM IMO. I just followed
> their recommendation and I think in this case it is appropriate.

As it happens, Jared Smith is a long-time friend who, for a variety of
reasons not relevant to this discussion, is currently not able to
participate directly in the HTML5 WG.

However, I did manage to discuss this with him, and he helpfully responded
with a fairly detailed email, provided in its entirety after this comment.
Specific to this technique, however, he wrote:

"...regarding the off-screen content technique
(http://webaim.org/techniques/css/invisiblecontent/) for aria-describedby
elements. Here are several arguments against this as a recommended
solution:

- While the aria-describedby content would be read when the <video>
element that references it is accessed, it would ALSO be read in its
actual location within the page, thus introducing potential confusion for
screen reader users reading later in the page. In other words, if I bypass
the video (perhaps navigate by a heading), my screen reader may begin
reading a description for an object that I am unaware of and for which
there is no reverse programmatic association. While this is also an issue
if the content is NOT hidden off-screen, such content would almost
certainly be presented in proximity to or otherwise programmatically
associated to the video itself (maybe within the same heading level or
same <article>). This off-screen technique allows (and in some ways
encourages) the long alternative to be presented anywhere within the page,
thus allowing the alternative to be read to screen reader users twice
and/or in isolation and without association to the video.

- This may encourage the presentation of additional and verbose content
for only one set of users - screen reader users. Much of this information
would be best served if made available to all users (either visibly within
the page or via a separate page), particularly those with cognitive or
other disabilities.

- This solution (like the other presented solutions) does not allow for
content that necessitates or would be better served on a separate page.
Simply presenting a link within off-screen content is obviously not
suitable. This would limit not only content access, but also functionality
to only screen reader users. It would also violate the intention of
aria-describedby (the video is not described by an anchor element, but by
actual content). Something else (beyond @transcription) is necessary in
this case.

- This technique relies on current, yet somewhat arbitrary user agent
behavior. Nothing prescribes that screen readers should read off-screen
content (any more than anything prescribes that they should ignore content
with display:none). Additionally, this results in styling dictating
content functionality - something that both HTML5 and CSS strive to avoid.

For these reasons, an HTML5 technique that recommends or relies on such
styling makes me quite uncomfortable."

**********************

Jared's full response follows my sig.

JF

**********************

John, feel free to pass this on or quote as you'd like.


I noticed that a WebAIM technique was referenced in the mailing list
(http://lists.w3.org/Archives/Public/public-html-a11y/2011May/0291.html).
I wish to provide some thoughts on this technique as well as some broader
thoughts on media alternatives.

The alternative to the video and the alternative to a poster frame will
often not be the same. The editor's opinion
(http://lists.w3.org/Archives/Public/public-html/2011Mar/0690.html)
was that "this does not happen in practice" and the dialogue of this
thread generally indicates that video alt and poster alt are always the
same and should not be treated differently. However, I think several cases
have been presented that show this is not always the case. The proposed
solutions themselves
(http://www.w3.org/WAI/PF/HTML/wiki/Media_Alt_Technologies) reference and
describe these separately - strong evidence that they are distinct content
items that must be described distinctly. The examples (both aria-label and
aria-describedby) sometimes present alternatives for the <video>,
sometimes alternatives for the poster frame (tellingly, the referenced
elements have an id value of "posteralt"), and sometimes for both (not to
mention the times they reference other stuff, such as metadata or control
instructions).

This would cause significant confusion for authors. I, as an accessibility
'expert', read the full threads and examined all of the examples and it
still is not clear to me whether the alternative being presented by each
attribute should be for the video itself, for the poster frame, for the
controls of the video, or something else. Or somehow all of them combined?

The techniques that have been suggested (with the exception of off-screen
text) seem to work splendidly for alternatives to the video. At first
glance, it seems to me that <video alt> might be a suitable alternative to
@aria-label for short alternatives. For AT and accessibility APIs, they
would be presented and mapped identically.
@aria-describedby would often be an appropriate method when longer,
in-page alternatives are presented.
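
As a rough illustration of the two cases above, the markup might look
something like the following. This is only a sketch: the file names, id
value, and text are made up for the example and are not taken from the
wiki page.

```html
<!-- Short alternative: aria-label gives the video a brief accessible name. -->
<video src="lecture.webm" controls
       aria-label="Lecture: introduction to web accessibility"></video>

<!-- Longer, in-page alternative: aria-describedby points at visible
     content placed next to the video. -->
<video src="lecture.webm" controls aria-describedby="lecture-summary"></video>
<p id="lecture-summary">
  The speaker introduces the four WCAG 2.0 principles and gives one
  example of each.
</p>
```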

But if video alt and poster alt are distinct things (as they clearly often
are to me), providing descriptions of their content programmatically via a
single attribute or referenced element (or even multiple elements) seems
fundamentally wrong. If aria-describedby or aria-label is applied to the
<video> element, they are, by definition, labels and descriptions for that
element, not of some other characteristic, attribute, or state of that
element. In many cases additional poster alt would not be necessary
(usually in the case of random frames being presented and often in the
case of first frame being presented). But in the cases where the
alternative content for the poster alt is distinct, a distinct description
mechanism is necessary - perhaps @posteralt (yes, this is structurally
questionable, but no more so than <a hreflang> or <track srclang>, I
suppose) for short alternatives or (optimally) a child <poster> (or
similar) element for short and long description. I'm sure there are other
solutions that don't conflate video alt and poster alt into one thing.
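
To make the distinction concrete, here is a sketch of what the two ideas
above could look like. Neither @posteralt nor a child <poster> element is
part of HTML5; both are hypothetical, and the attribute names on <poster>,
the file names, and the text are assumptions for illustration only.

```html
<!-- Hypothetical @posteralt: a short poster alternative kept separate
     from the label for the video itself. -->
<video src="interview.webm" poster="interview-still.jpg" controls
       aria-label="Interview with the project lead"
       posteralt="The project lead standing beside a whiteboard"></video>

<!-- Hypothetical child <poster> element: carries both a short and a
     long description of the poster frame. -->
<video src="interview.webm" controls
       aria-label="Interview with the project lead">
  <poster src="interview-still.jpg"
          alt="The project lead standing beside a whiteboard">
    The whiteboard behind her lists the three release milestones.
  </poster>
</video>
```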

I do like the idea of associating full media descriptions (e.g.,
transcripts) via @transcription (I'd prefer @transcript). Some
wordsmithing on what should go on the @transcription page is necessary. Of
note is that WCAG 2.0 never uses the word "transcript", but uses
"alternative for time-based media"
(http://www.w3.org/TR/WCAG20/#alt-time-based-mediadef) as transcripts are
generally considered to be verbatim text versions of audio content whereas
WCAG requires additional descriptions where necessary.
@description or @longdesc would really be more accurate here, though these
introduce other types of confusion.

Finally, regarding the off-screen content technique
(http://webaim.org/techniques/css/invisiblecontent/) for aria-describedby
elements. Here are several arguments against this as a recommended
solution:

- While the aria-describedby content would be read when the <video>
element that references it is accessed, it would ALSO be read in its
actual location within the page, thus introducing potential confusion for
screen reader users reading later in the page. In other words, if I bypass
the video (perhaps navigate by a heading), my screen reader may begin
reading a description for an object that I am unaware of and for which
there is no reverse programmatic association. While this is also an issue
if the content is NOT hidden off-screen, such content would almost
certainly be presented in proximity to or otherwise programmatically
associated to the video itself (maybe within the same heading level or
same <article>). This off-screen technique allows (and in some ways
encourages) the long alternative to be presented anywhere within the page,
thus allowing the alternative to be read to screen reader users twice
and/or in isolation and without association to the video.

- This may encourage the presentation of additional and verbose content
for only one set of users - screen reader users. Much of this information
would be best served if made available to all users (either visibly within
the page or via a separate page), particularly those with cognitive or
other disabilities.

- This solution (like the other presented solutions) does not allow for
content that necessitates or would be better served on a separate page.
Simply presenting a link within off-screen content is obviously not
suitable. This would limit not only content access, but also functionality
to only screen reader users. It would also violate the intention of
aria-describedby (the video is not described by an anchor element, but by
actual content). Something else (beyond @transcription) is necessary in
this case.

- This technique relies on current, yet somewhat arbitrary user agent
behavior. Nothing prescribes that screen readers should read off-screen
content (any more than anything prescribes that they should ignore content
with display:none). Additionally, this results in styling dictating
content functionality - something that both HTML5 and CSS strive to avoid.

For these reasons, an HTML5 technique that recommends or relies on such
styling makes me quite uncomfortable.
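
For reference, the pattern being argued against looks roughly like this.
It is a sketch only: the class name, id value, file name, and text are
assumptions for illustration, and the CSS follows the general approach of
the WebAIM off-screen technique rather than quoting it.

```html
<style>
  /* Moves the element out of view; screen readers typically still read it. */
  .offscreen {
    position: absolute;
    left: -10000px;
    top: auto;
    width: 1px;
    height: 1px;
    overflow: hidden;
  }
</style>

<video src="demo.webm" controls aria-describedby="demo-desc"></video>

<h2>Other news</h2>
<p>Unrelated content that follows the video.</p>

<!-- Referenced by the video, but also read here in document order,
     far from the video and with no pointer back to it. -->
<p id="demo-desc" class="offscreen">
  A long description of the video content.
</p>
```

A user who skips the video by jumping to the heading would still reach
the hidden paragraph later, with nothing to say what it describes.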

Jared Smith
WebAIM.org
Received on Wednesday, 11 May 2011 23:40:48 UTC