Re: Adopting the media accessibility requirements

On Mon, 01 Nov 2010 14:32:49 +0100, Henri Sivonen <hsivonen@iki.fi> wrote:

> When evaluating accessibility, it seems to me that it should be a  
> no-brainer that if failure to satisfy a requirement doesn't deny access  
> to people with a disability, then it's not an accessibility  
> *requirement*. If failure to satisfy the "requirement" made access less  
> smooth to people with a disability while not affecting people without  
> that disability (compared to satisfying the requirement), then the  
> "requirement" is an accessibility-enhancing feature, but not a  
> *requirement*. If the failure to address the "requirement" doesn't  
> materially change the experience of users with any disability any more  
> than it'd change the experience of users without disabilities, then the  
> item is neither an accessibility requirement nor an accessibility  
> feature and shouldn't be represented to be an accessibility feature or  
> requirement.
>
> I suggest evaluating each proposed requirement and asking the question:  
> "If this requirement were removed, people with which disability would be  
> denied access?" If the answer is "None", don't call it a requirement and  
> ask the follow-up question: "If this requirement were removed, people  
> with which disability would have the convenience of their user  
> experience degraded significantly more than users without disabilities?"  
> If the answer is "None", get rid of the "requirement".
>
> I believe that subjecting "copyright metadata" (or codec optimization)  
> to this two-step litmus test would lead to the conclusion of taking it  
> off the requirement list.

This isn't a popularity contest, but +1 to the above. Actually, it should  
be a three-step litmus test, as the last question would be (paraphrasing  
you) "Does this requirement materially change the experience of users with  
any disability any more than it'd change the experience of users without  
disabilities?" If not, then it's not an accessibility requirement even  
though the answer isn't necessarily "None" to the first two questions.

Enough theory; I went through the section on captions [1] to give some  
more detailed feedback:

(CC-1) Render text in a time-synchronized manner, using the media resource  
as the timebase master.
(CC-2) Allow the author to specify erasures, i.e., times when no text is  
displayed on the screen (no text cues are active).
(CC-3) Allow the author to assign timestamps so that one caption/subtitle  
follows another, with no perceivable gap in between.
(CC-7) Display multiple rows of text when rendered as text in a  
right-to-left or left-to-right language.
(CC-8) Allow the author to specify line breaks.
(CC-9) Permit a range of font faces and sizes.
(CC-11) Render text in a range of colors.
(CC-13) Where a background is used, it is preferable to keep the caption  
background visible even in times where no text is displayed, such that it  
minimises distraction. However, where captions are infrequent the  
background should be allowed to disappear to enable the user to see as  
much of the underlying video as possible.
(CC-16) Use conventions that include inserting left-to-right and  
right-to-left segments within a vertical run (e.g. Tate-chu-yoko in  
Japanese), when rendered as text in a top-to-bottom oriented language.
(CC-17) Represent content of different natural languages. In some cases a  
few foreign words form part of the original soundtrack, and thus need to  
be in the same caption resource. Also allow for separate  
caption files for different languages and on-the-fly switching between  
them. This is also a requirement for subtitles.
(CC-18) Represent content of at least those specific natural languages  
that may be represented with [Unicode 3.2], including common typographical  
conventions of that language (e.g., through the use of furigana and other  
forms of ruby text).
(CC-19) Present the full range of typographical glyphs, layout and  
punctuation marks normally associated with the natural language's  
print-writing system.
(CC-20) Permit in-line mark-up for foreign words or phrases.
(CC-22) Support captions that are provided inside media resources as  
tracks, or in external files.
(CC-23) Ascertain that captions are displayed in sync with the media  
resource.
(CC-24) Support user activation/deactivation of caption tracks.
(CC-25) Support edited and verbatim captions, if available.
(CC-26) Support multiple tracks of foreign-language subtitles in different  
languages.

Remove these, because failure to do any of these would be a nuisance to  
everyone, not just users with disabilities. Several of these are clearly  
internationalization issues, not really accessibility issues. If user  
groups need any particular method for discovery/activation/deactivation of  
caption tracks, it would be great if the document spelled it out. (Also,  
CC-23 is a dupe of CC-1.)
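For what it's worth, several of the timing items above (CC-2, CC-3, CC-8)
are already expressible in a plain cue format. A WebVTT-style sketch
(illustrative only, not a proposal; the format was still in flux as WebSRT
at the time of writing):

```
WEBVTT

00:00:01.000 --> 00:00:04.000
First caption, with an
author-specified line break (CC-8).

00:00:04.000 --> 00:00:07.000
This cue starts exactly when the
previous one ends, so there is no gap (CC-3).

00:00:12.000 --> 00:00:15.000
The span from 0:07 to 0:12 has no active
cue, i.e. an erasure (CC-2).
```

None of this needs to be called an accessibility requirement for it to end
up in the format; it falls out of having author-controlled cue timing at
all.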

(CC-5) Support positioning in all parts of the screen, both inside the  
media viewport and possibly in a determined space next to the media  
viewport. This is particularly important when multiple captions are on  
screen at the same time and relate to different speakers, or when  
in-picture text is avoided.

This requirement is unclear to me and not written with the web in mind. It  
sounds like the requirement is that it be possible to have captions and  
additional video tracks rendered outside of the video element rather than  
on top of it. However, I doubt that this is a hard requirement since it's  
not possible at all on TV or DVD, except in the bars that you sometimes  
get on widescreen content. Thus, it would be an accessibility enhancing  
feature, or more likely a feature that is useful to anyone using  
subtitles/captions. (For what it's worth, I think we could do this by  
directing the rendering of captions to a separate element, which would  
probably have to be a block-level element.)
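A rough sketch of that idea, assuming the text track API that HTML5 drafts
are growing (the `textTracks`, `activeCues`, and `cuechange` names are
taken from those drafts and may change; the element ids are mine):

```html
<video id="v" src="clip.webm" controls></video>
<!-- Captions rendered in a block-level element next to the video,
     rather than composited on top of it. -->
<div id="captions" role="region" aria-live="polite"></div>
<script>
  var video = document.getElementById('v');
  var target = document.getElementById('captions');
  var track = video.textTracks && video.textTracks[0];
  if (track) {
    track.oncuechange = function () {
      // Re-render whichever cues are currently active.
      target.textContent = '';
      for (var i = 0; i < track.activeCues.length; i++) {
        var line = document.createElement('div');
        line.textContent = track.activeCues[i].text;
        target.appendChild(line);
      }
    };
  }
</script>
```

The point being that once cue data is exposed to script, "outside the
viewport" rendering is a page-layout decision rather than something the
caption format itself has to require.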

(CC-14) Allow the use of mixed display styles (e.g., mixing paint-on  
captions with pop-on captions) within a single caption cue or in the  
caption stream as a whole. Pop-on captions are usually one or two lines of  
captions that appear on screen and remain visible for one to several  
seconds before they disappear. Paint-on captions are individual characters  
that are "painted on" from left to right, not popped onto the screen all  
at once, and usually are verbatim. Another often-used caption style in  
live captioning is roll-up: here, cue text follows double chevrons  
("greater than" symbols), which are used to indicate different speakers.  
Each sentence "rolls up" to about three lines. The top  
line of the three disappears as a new bottom line is added, allowing the  
continuous rolling up of new lines of captions.

This "requirement" gives examples of different styles, but doesn't  
actually require anything specific. It should probably be merged with  
"(CC-27) Support live-captioning functionality." and state clearly what is  
actually required.
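For instance, the roll-up style described in CC-14 is easy to pin down
precisely; a minimal sketch of the buffer logic (function and variable
names are mine, purely illustrative):

```javascript
// Roll-up captions (CC-14): keep only the most recent lines in a
// fixed-height buffer; the top line scrolls off as new lines arrive.
function rollUp(buffer, newLine, maxLines) {
  maxLines = maxLines || 3;
  var next = buffer.concat([newLine]);
  // Drop lines from the top until the buffer fits.
  return next.slice(Math.max(0, next.length - maxLines));
}

var lines = [];
lines = rollUp(lines, ">> ALICE: Good evening.");
lines = rollUp(lines, "Tonight's top story.");
lines = rollUp(lines, ">> BOB: Thanks, Alice.");
lines = rollUp(lines, "Here is the weather.");
// lines now holds only the three most recent lines; the first one
// has scrolled off the top.
```

Stated that way, the actual requirement is something like "cue text must be
appendable to an already-active cue", which is testable; the rest is
presentation.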

(CC-21) Permit the distinction between different speakers.

This is something I'd really like to see explained further. Would  
appending "Speaker: " to all cues address this, or is it actually a  
requirement to have semantic markup of speakers like TTML does? Which  
users miss out if the speaker isn't semantically marked up? I could guess  
the answer, but would like the document to spell it out for me :)
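For comparison, a "Speaker: " prefix bakes the name into the cue text,
while voice markup keeps it semantic so that a user agent or AT could
restyle or announce it. A WebVTT-style sketch (the `<v>` voice span is
taken from current drafts, not from this requirements document):

```
00:01:00.000 --> 00:01:03.000
<v Mary>Did you see that?

00:01:03.000 --> 00:01:05.000
<v John>I did.
```

If the requirement is the semantic variant, the document should say which
user groups depend on it.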

Purging the document of non-accessibility and non-user requirements would  
reduce the number of requirements dramatically. Perhaps the requirements  
that are removed can be scanned for things that aren't already supported  
in HTML5 and put on a TODO-list to follow up on after the  
accessibility-related issues have been dealt with to everyone's  
satisfaction.

[1]  
http://www.w3.org/WAI/PF/HTML/wiki/Media_Accessibility_Requirements#Captioning

-- 
Philip Jägenstedt
Core Developer
Opera Software

Received on Monday, 1 November 2010 16:26:36 UTC