
Comments on Media Accessibility Requirements

From: Greg Lowney <gcl-0039@access-research.org>
Date: Tue, 13 Jul 2010 12:45:15 -0800
Message-ID: <4C3CD05B.3040501@access-research.org>
To: WAI-UA list <w3c-wai-ua@w3.org>

As requested, here are some comments on the Media Accessibility 
Requirements document, for those working on it. Most are minor editorial 
issues but there are a few significant ones.

Comments on Media Accessibility Requirements


Last modified 2010-06-16

(I've used an asterisk to mark comments that seem wide in scope, rather 
than being merely about one particular item.)


   1. I noticed that in two places the document skips heading levels,
      and that the navigation links at the bottom are headings, which
      doesn't seem appropriate. (The Firefox extension HeadingsMap
      highlights these discrepancies.)
   2. Link to definitions for screen reader, AT, etc.?
   3. * Making things available to AT is explicitly required in a very
      few instances (e.g. Transcripts), which doesn't seem an
      intentional choice. The two sections devoted to AT compatibility
      also call out a few requirements, making it all rather confusing.
      I suggest making it a more general requirement applying to almost
      everything (e.g. AT access to caption text and its formatting,
      hyperlinks, etc.).
   4. * In general it doesn't distinguish expected steps (such as
      keyboard access and customizable color) from steps that would be
      going above and beyond core expectations (such as most of the
      steps listed for Autism). This could really mislead and turn off
      readers who interpret these as unrealistic expectations for most
      media. Are all listed requirements really deserving of core-level
   5. * In general it could benefit from forward references to related
      sections further down in the document.
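
As an illustration of comment 1, here is a minimal Python sketch of the
kind of check a tool like HeadingsMap performs: it reports places where
an HTML heading level jumps by more than one. The names
HeadingCollector and skipped_heading_levels are hypothetical, and the
sample markup is invented for illustration, not taken from the
requirements document.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collects the level (1-6) of each h1..h6 start tag in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so "h1".."h6" is all we need to match.
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def skipped_heading_levels(html):
    """Return (previous, current) pairs where the level increases by more than one."""
    collector = HeadingCollector()
    collector.feed(html)
    skips = []
    previous = 0  # assumption: the document start counts as level 0
    for level in collector.levels:
        if level > previous + 1:
            skips.append((previous, level))
        previous = level
    return skips


# Invented sample: the jump from h1 to h3 is the only skip.
sample = "<h1>Title</h1><h3>Skips h2</h3><h4>Fine</h4><h2>Back up</h2>"
print(skipped_heading_levels(sample))
```

Moving back up to a shallower level (h4 to h2 in the sample) is not
treated as a skip; only a jump to a deeper level that omits one is.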

    Accessible media requirements by type of disability

   6. Re dyslexia, mention synchronized highlighting of phrases in text
      with audio.
   7. Why capitalize "Communication, Social Interaction, and Repetitive
   8. Incomplete sentence: "Since individuals on the autism spectrum can
      be quite visual and learn effectively from social stories."
   9. In "Dexterity / Mobility impairment" should note that many users
      rely on AT such as on-screen keyboards or speech recognition.
  10. In "Accessible media requirements by type of disability" I'd add a
      section explaining that many people have multiple disabilities,
      and that while deaf-blind is one category, there are others not
      specifically called out here. Users may, for example, have low
      vision and difficulty typing.
  11. The term "Sensory disability users" isn't used very much, and
      might be considered less politically correct than "users with
      sensory disabilities".

    Audio Description: Voiced, Texted, and Extended

  12. * Consider grouping AD, TAD, and EAD in a single section on audio
      descriptions, because they have a lot of overlap, are closely
      related, and having three headings disproportionately emphasizes
      them over other technologies that get a single section. Same with
      captioning and extended captioning, and the two sections on AT
  13. "They are written to convey objective information (e.g., a yellow
      flower) rather than subjective judgments (e.g., a beautiful
      flower)" may be correct but seems odd to me. I'm sure the script
      called for a beautiful flower, rather than merely a yellow one,
      and the beauty is what the writer and director were trying to
      convey, so it seems strange to actively avoid conveying it.
  14. These two bullet items seem redundant: "Closed descriptions can be
      recorded as a separate track containing descriptions only, timed
      to play at specific spots in the timeline and played in parallel
      with the program-audio track."; "Some audio descriptions can be
      given as a separate audio channel mixed in at the player." Are
      "track" and "channel" used here as technical terms for different
      things, or is it just a linguistic choice?
  15. "Audio description is available...; however regulation in the U.S.
      and Europe is increasingly focusing on description..." I think you
      mean, "and" rather than "however".
  16. The term "audio/video descriptions" seems misleading as (to me at
      least) it sounds like it's discussing both audio descriptions of
      visual content (e.g. a second audio track) and visual descriptions
      of visual content (e.g. displayed text).
  17. Re list introductions like "Systems supporting audio/video
      descriptions that are not open must", do you nowhere say that
      systems that provide audio are required to support audio/video
  18. "(AD-1) Provide an indication that descriptions are available, and
      are active/non-active." seems useful but not necessarily a core
      requirement. I believe that most television viewers who try closed
      captions are used to just turning them on and waiting to see
      whether any captions are actually displayed, which is actually
      more convenient than requesting a display telling them whether
      there are captions and only then turning on the caption display.
  19. "The degree and speed of volume change should be under provider
      control": what is meant by "provider" in this case? The term hasn't
      been used in the discussion thus far.
  20. "(AD-8) Allow the author to provide fade and pan controls to be
      accurately synchronized with the original soundtrack." is not
      really enough information for novices like me. You might want to
      elaborate on the goal. Is it to have the description sound like
      the narrator is standing in the same location as the object being
      described? Also, one difference between AD-8 and AD-13 is the
      former is all about author control, whereas the latter gives
      control to both, but still fails to specify that the user
      preference should override author preference.
  21. Is there supposed to be another document or section that would go
      into more details on these requirements? Quite a few of them seem
      too high-level to be useful; for example, "(TAD-1) Support
      presentation of texted audio description through a screen-reader
      or braille device with playback speed control and voice control
      and synchronization points with the video."
  22. "(AD-10) Allow the user to select from among different languages
      of descriptions, if available, even if they are different from the
      language of the main soundtrack." I'd add "or from the general
      system language setting.", for example choosing audio descriptions
      in your native Farsi even if you're using English for your
      operating system's primary language and listening to a film with
      Japanese audio.
  23. I suggest adding something early on letting readers know that
      additional, advanced features are discussed in separate sections
      below. For example, when I first read this I noted that it lacked
      allowing the audio description track (speech or text) to pause and
      resume the media with which it's synchronized. (For example, for
      video with audio being watched when all viewers want the
      descriptions, the user might choose a descriptive track that
      pauses the normal content in order to insert more detailed
      descriptions than could fit in the main content's normal gaps.) I
      wrote a comment about it, only later to find it was included in a
      separate section.
  24. Shouldn't the audio description requirements (or recommendations)
      include the user ability to omit the video altogether, leaving
      only the normal audio and descriptions?
  25. "Texted audio descriptions are provided as text files with a start
      time for a description cue." It would help to mention any
      standardized formats used for this purpose.
  26. Compare and contrast "(TAD-3) Where possible, support to present a
      text or separate audio track privately to those that need it in a
      mixed-viewing situation, e.g. through headphones." vs. "(AD-11)
      Support the simultaneous playback of both the described and
      non-described audio tracks so that one may be directed at separate
      outputs (e.g., a speaker and headphones)." The key differences
      aren't conveyed clearly.
  27. "(TAD-4) Where possible, support for different options for authors
      & users to deal with the overflow case: continue reading, stop
      reading, and pause the video. Pause the primary audio and video.
      The preferred solution from a user POV is to pause the video and
      finish reading out the TAD." Consider rephrasing as "pause the
      /primary audio and video/ until the TAD catches up."
  28. In the discussion of texted audio description, might want to
      clarify that every time you say "video" you of course mean both
      the primary video and audio content.
  29. Reading top to bottom, I kept thinking that the document
      overlooked variations until I encountered them further down. I
      would recommend that the introduction to audio descriptions
      mention that subsequent sections will discuss basic audio
      descriptions, texted audio descriptions, and extended audio
      descriptions. Similarly, the discussions of AD and TAD might
      allude to the fact that sometimes the descriptions are too long
      for pauses, and refer the reader to the section on extended audio
      descriptions below.
  30. EAD-2 (automatically pausing) would be impractical without EAD-3
      (automatically resuming), so you might just combine them.
  31. TAD-4 and EAD-1 blur the boundary between TAD and EAD. If a system
      supports TAD-4 it supports EAD, so might take out TAD-4 and refer
      the reader to the EAD section.
  32. EAD section might explicitly say it applies to both AD and TAD.
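
The three overflow options quoted from TAD-4 in comment 27 can be
sketched as a small policy function. This is a Python sketch under
stated assumptions: the names OverflowPolicy and handle_overflow are
hypothetical, times are in seconds, and "overflow" is taken to mean
that reading a description aloud would run past the gap available
before the next dialogue or cue.

```python
from enum import Enum


class OverflowPolicy(Enum):
    CONTINUE_READING = "continue"  # keep speaking while the media plays on
    STOP_READING = "stop"          # cut the description off when the gap ends
    PAUSE_MEDIA = "pause"          # pause primary audio and video until the TAD catches up


def handle_overflow(gap_seconds, speech_seconds, policy):
    """Return (spoken_seconds, media_pause_seconds) for one description cue.

    gap_seconds: time available before the next dialogue or cue starts.
    speech_seconds: time the speech output needs to read the description.
    """
    overflow = max(0.0, speech_seconds - gap_seconds)
    if overflow == 0.0:
        return speech_seconds, 0.0       # fits in the gap; nothing to resolve
    if policy is OverflowPolicy.STOP_READING:
        return gap_seconds, 0.0          # description is truncated
    if policy is OverflowPolicy.PAUSE_MEDIA:
        return speech_seconds, overflow  # media waits for the remaining speech
    return speech_seconds, 0.0           # CONTINUE_READING: speech overlaps the media


# A 6-second description in a 4-second gap, under each policy.
for policy in OverflowPolicy:
    print(policy.name, handle_overflow(4.0, 6.0, policy))
```

Under PAUSE_MEDIA the media pause equals the overflow, which is the
same behaviour the EAD section describes; that overlap is the point of
comment 31.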

    Clear Audio (CA)

  33. "(CA-4) Potentially support pre-emphasis filers" I think you meant
      "filters".

    Content Navigation by Content Structure (CN)

  34. "Short music selections tend to have versus and repeating
      choruses" I think you meant "verses".
  35. In the section on structured navigation, your discussion of h1
      isn't what I would have expected. In HTML documents, h1 is
      normally the title of the current document, regardless of the
      scope of that document. For example, an online book would
      typically be divided into multiple pages and the h1 for the main
      page would be the title of the book, while the h1 for a chapter
      would be the title of the chapter, and if you can delve more
      deeply and reach a page for a section its h1 would be the title of
      that section. Thus, where you say "In a news broadcast, the most
      global level (analogous to <h1>) might be 'News, Weather, and
      Sports.'" I would have expected the h1 equivalent would be more like
      "KIRO 7 Eyewitness News at 5PM".
  36. "Audio productions of 'The Divine Comedy' may well include
      reproductions of famous frescoes or paintings interspersed
      throughout the text", did you mean video or multimedia
      productions? I don't expect many audio productions to reproduce
      the frescoes and paintings :-)
  37. "Nowadays, these programs are based on the ANSI/NISO Z39.86
      specifications." You might say "ANSI/NISO Z39.86 (DAISY)
      specifications" in order to reference its commonly-used friendly name.
  38. In the introduction to structured navigation, the final two
      paragraphs (UAAG references) seem entirely out of place.
  39. In some places the document interleaves requirements for authoring
      tools (e.g. CN-1) with requirements for content players (e.g.
      CN-2), which is a little confusing.
  40. I think I could figure out what "transport bar" means, but then
      two paragraphs later "navigation track" comes along and I'm not
      sure what the difference would be.
  41. Shouldn't structural navigation requirements include providing the
      user with a navigable table of contents?
  42. "(CN-1) Generally, provide accessible keyboard controls for
      navigating a media resource in lieu of clicking on the transport
      bar need to be available, e.g. 5sec forward/back, 30sec
      forward/back, beginning, end" is in the h3 section titled "Content
      Navigation by Content Structure" but isn't about navigating by
      structure, nor does it fit in the larger h2 section "Alternative
      Content Technologies".
  43. If you were going to include CN-1 saying that content navigation
      controls need to be keyboard accessible, that would imply that all
      sections discussing user input need to have a similar requirement
      for keyboard access. Seems better just to refer readers to the
      section on keyboard access which requires it for /everything/, and
      perhaps provide a non-exhaustive list of instances you think they
      might overlook.
  44. * Seems odd that there are a lot of things here that I don't
      believe are in UAAG. For example, CN-9 requires the user be able
      to skip or filter out ancillary content such as sidebars, but I
      don't believe UAAG20 requires that Web browsers allow the user to
      exclude such things from the keyboard navigation or voicing order.
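
For readers unfamiliar with the time-based controls that CN-1 lists
(comment 42), the following Python sketch shows them as keyboard
commands. The key names are hypothetical, since CN-1 names the step
sizes but not the keys, and positions are assumed to be seconds from
the start of the media resource.

```python
# Hypothetical key bindings for the steps named in CN-1.
SEEK_STEPS = {
    "Left": -5.0,
    "Right": 5.0,
    "Shift+Left": -30.0,
    "Shift+Right": 30.0,
}


def seek(position, duration, key):
    """Return the new playback position in seconds after a key press."""
    if key == "Home":
        return 0.0           # jump to the beginning
    if key == "End":
        return duration      # jump to the end
    step = SEEK_STEPS.get(key, 0.0)  # unknown keys leave the position unchanged
    # Clamp so that stepping near either end never leaves the timeline.
    return min(max(position + step, 0.0), duration)


print(seek(10.0, 100.0, "Right"))
print(seek(3.0, 100.0, "Left"))
```

Nothing here depends on the content's structure (chapters, sections,
verses), which is why comment 42 questions placing CN-1 under
navigation by content structure.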

    Captioning (CC)

  45. "Captions are always written in the same language as the main
      audio track." And yet, I've not seen DVD or set-top boxes
      distinguish between same-language and different-language captions.
      Also, you should discuss here the use of foreign language
      captions, rather than only mentioning them in the lead-in sentence
      for the requirements. Also, CC-26 explicitly acknowledges that
      there can be multiple tracks of captions in different languages.
  46. "Closed captions are transmitted as data along with the video...",
      wouldn't the category of closed captions also include captions
      that are pulled down only on demand, possibly from another source
      entirely, rather than transmitted with the video? Or is there
      another term for that?
  47. "...turn them on, usually by invoking an on-screen control or menu
      selection" or a dedicated physical button such as on a remote
  48. "Open captions are always visible; they have been merged with the
      video track and cannot be turned off." Except by selecting a
      different video track.
  49. Interesting to note that while users of closed captions may prefer
      verbatim text, operas are usually supertitled using shortened
      versions of the libretto, to make it easier for readers to follow
      along without spending too much time reading each line. This is
      true even of same-language supertitles.
  50. As noted above it's confusing to first mention subtitles and
      foreign language subtitles in the lead-in to the requirements,
      without introducing the concepts or clarifying that they'd use the
      same technologies as same language captions.
  51. * "(CC-10) Render a background in a range of colors, supporting a
      full range of opacities." With this and several similar
      requirements, do you want to clarify that the caption author
      should be able to specify a background color, or do you feel it
      would be acceptable for the player to choose what it considers a
      background appropriate for the text color and video background?
      Should the user be able to override caption attributes such as these?
  52. There are several requirements for horizontal languages without
      corresponding requirements for vertical languages. For example,
      should CC-15 or a parallel equivalent require that captions can be
      positioned at least a minimum distance from the side of the screen?
  53. "(CC-21) Permit the distinction between different speakers." An
      example of one that requires more detail. For example, any system
      would allow one to prefix strings with the name of the speaker,
      and you already require the author to be able to put strings in
      different locations. Do you mean markup so the rendering agent can
      apply automatic, distinct formatting styles, or so that assistive
      technology examining the captions can convey the distinctions to
      users through other means?
  54. The lists titled "Formats for captions, subtitles or
      foreign-language subtitles must" and "Further, systems that
      support captions must" should probably use parallel construction,
      as I assume they both relate to all types of captions, including
      same language and foreign language, and regardless of whether
      they're formatted as subtitles or otherwise.
  55. A number of items in the list "Formats for captions, subtitles or
      foreign-language subtitles must" seem to be discussing the systems
      that display the captions rather than the formats for specifying
      them. It may just be a matter of rewording a number of the items,
      such as changing "(CC-1) Render text in a time-synchronized
      manner, using the audio track as the timebase master." To "(CC-1)
      Allow the author to specify the time and duration at which text is
      displayed, using the audio track as the timebase master." and
      "(CC-11) Render text in a range of colors." to "(CC-11) Allow the
      author to specify colors for ranges of text."
  56. The list titled "Further, systems that support captions must"
      should probably include one or more requirements to support the
      wide range of author-specified markup that caption formats are
      required to support. For example, having a caption format that
      allows the author to specify text color is wasted when a player
      ignores those settings.
  57. Why is captioning the only section to distinguish requirements for
      data formats from requirements for rendering systems? Wouldn't
      that distinction apply just as much (or little) for audio
      description, sign language, etc.?
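
The format-versus-player distinction raised in comments 55 and 56 can
be made concrete with a small Python sketch. The Cue class stands in
for what a caption format lets the author specify (start time,
duration, text, color), and active_cues stands in for what a player
does with it. All names are hypothetical and the data is invented; this
is not any particular caption format.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    start: float           # seconds on the audio track's timeline (the timebase master)
    duration: float        # seconds the text stays displayed
    text: str
    color: str = "white"   # author-specified color, as in the suggested CC-11 wording


def active_cues(cues, audio_time):
    """Return the cues whose interval [start, start + duration) contains audio_time."""
    return [c for c in cues if c.start <= audio_time < c.start + c.duration]


cues = [
    Cue(start=0.0, duration=2.5, text="Good evening."),
    Cue(start=2.5, duration=3.0, text="Here is the news.", color="yellow"),
]
for cue in active_cues(cues, 3.0):
    # A player that ignored cue.color here would waste the author's markup.
    print(cue.color, cue.text)
```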

    Extended Captioning

  58. It might be helpful to give an example of how this could be used.
      For example, an ancillary window could display a scrolling list of
      the most recent hyperlinks to be provided in captions, so that the
      link doesn't disappear after just a few seconds when the next set
      of captions is displayed.
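
The ancillary-window example in comment 58 could work roughly as in the
Python sketch below. It is a sketch under assumptions: RecentLinks is a
hypothetical name, and links are assumed to appear as bare http(s) URLs
in the caption text, whereas a real caption format would more likely
carry them as markup.

```python
import re
from collections import deque

# Assumption: hyperlinks show up as bare URLs inside the caption text.
LINK_PATTERN = re.compile(r"https?://\S+")


class RecentLinks:
    """Keeps the most recent caption hyperlinks after their captions are gone."""

    def __init__(self, capacity=5):
        self._links = deque(maxlen=capacity)  # oldest entries drop off automatically

    def add_caption(self, caption_text):
        for url in LINK_PATTERN.findall(caption_text):
            if url in self._links:
                self._links.remove(url)  # a repeated link moves to the newest slot
            self._links.append(url)

    def newest_first(self):
        return list(reversed(self._links))


links = RecentLinks(capacity=2)
links.add_caption("See http://a.example for details")
links.add_caption("Also http://b.example and http://c.example")
print(links.newest_first())
```

The list outlives each caption, so a user who reads slowly or navigates
by keyboard can still reach a link after its caption has been replaced.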

    Sign Translation

  59. "mixed with the video and offered as an entirely alternate stream"
      should be in parentheses instead of commas.
  60. "(SL-3) Support the display of sign language video either as
      picture-in-picture or alpha-blended overlay..." in these clauses
      the use of "or" leaves it ambiguous whether the system needs to
      support both methods and allow the author to choose, or whether
      the system is allowed to support only one of the options.

    Transcripts


  61. I believe "Providing a full transcript is a good option in
      addition to, but not as a replacement for, timed captioning"
      conflicts with UAAG20 where we acknowledge situations where
      transcripts are more appropriate than synchronized captions. For
      example, transcripts are usually sufficient for pre-recorded
      audio-only media.
  62. "A transcript can either be presented simultaneously with the
      media material, which can assist slower readers or those who need
      more time to reference context, but it should also be made
      available independently of the media." has inconsistent grammar:
      probably want to delete "either".
  63. I would suggest avoiding the word "provisioning" because it's
      jargon and there are other terms that are more widely understood.
      Also, it's not used elsewhere in the document.

    System Requirements

  64. This section could use an intro paragraph. I assume it's a
      catch-all for requirements that don't fit into a single
      alternative content technology, all of which were in the previous
      section. However, the term "system requirements" parallels that
      used under "Captioning" where it meant requirements for players as
      distinct from data formats, and that's confusing, especially since
      other sections such as that on assistive technology are certainly
      system requirements. Any catch-all section should probably be at
      the end rather than in the middle.

    Keyboard Access to interactive controls / menus

  65. As noted above, it should be made clear that access through
      keyboards and keyboard emulators is not optional, despite the
      phrase "Systems supporting keyboard accessibility must..."
  66. The phrase "interactive controls / menus" in the title is
      misleading, since it is not limited to things that are
      "interactive" as in having input and output, and "controls/menus"
      implies things with visual representation on the screen. For
      example, if a player supports navigation using mouse gestures,
      those should also all have keyboard equivalents.

    Granularity Level Control for Structural Navigation

  67. "(CNS-3) This control must be input device agnostic." We don't
      talk about agnostic elsewhere, so might rephrase it. Since
      functionality needs to be available through the keyboard (already
      required by KA-1) this essentially says that all keyboard
      navigation commands need to also have equivalents for every other
      input device (e.g. pointing devices, and on some systems speech or
      gestures). Is that what you intended to require?
  68. Isn't this entire section redundant to the content navigation section?

    Time Scale Modification

  69. This is the first list of requirements that isn't scoped with
      "Systems supporting such and so must". Does it really rate being
      the only universal requirement?

    Production practice and resulting requirements

No comments.

    Discovery and activation/deactivation of available alternative
    content by the user

  70. Re "The user agent /can/ facilitate the discovery of alternative
      content by following the criteria", this is the first list of
      requirements to be described as optional, with the word "can"
      instead of "must".
  71. Most of these requirements are already covered in their
      appropriate sections of the document.
  72. "(DAC-3) The user can browse the alternatives, switch between
      them." Should be replaced by the newer UAAG wording.

    Requirements on making properties available to the accessibility
    interface

  73. Should refer the reader to the section on assistive technology API
      further down in the document, or better yet, come after it.
  74. "any media controls need to be connected to that API" should be
      "any media controls and text content need to be..."
  75. "On self-contained products that do not support assistive
      technology, any menus in the content need to provide information
      in alternative formats" I'm skeptical of seeming to limit this to
      menus when it really means menus and other controls.
  76. "make accessibility controls, such as the closed-caption toggle,
      as prominent as the volume or channel controls" As I commented on
      the 508 Refresh, while this is well intentioned, a quick review of
      remote controls for televisions and set-top boxes indicated that
      most if not all give special prominence for volume and channel
      controls, because they're probably the most commonly used
      controls. I don't think that it is necessary for dedicated caption
      and video description controls to be equal in prominence, and thus
      tied for the most prominent controls on the device. This is
      especially true because many people who use captions will turn
      them on and leave them on, rather than toggling them frequently.
  77. The sentences on remote controls and physical button layout don't
      really fit the section title ("making properties available to the
      accessibility interface").
  78. Technically, the closed-system requirements don't fit the title
      either, but at least they thematically go with AT compatibility so
      the title could be changed to better incorporate both.
  79. I really can't understand any of the requirements in this section
      as they're currently written. For example, while "(API-1) Support
      to expose the alternative content tracks for a media resource to
      the user, i.e. to the browser" is clear with regard to alt text
      /when displayed or hidden/, and captions /as they're displayed/,
      what does it mean with regard to secondary audio tracks?

    Requirements on the use of the viewport

  80. Re "(VP-1) If alternative content has a different height or width
      to the media content, then the user agent will reflow the
      viewport." This seems more relevant to the container than to media
      per se. Is it talking about when video is replaced by description
      or when a caption field is added below the video field?
  81. If we're talking about containers hosting media, then it brings in
      a few additional requirements not yet listed here, such as the
      ability to move the keyboard focus into and out of media objects.
  82. "(VP-5) Captions occupy traditionally the lower-third of the video
      - the use of this area for other controls or content needs to be
      avoided." This is one of the few "requirements" that is phrased as
      a recommendation.

    Requirements on the parallel use of alternate content on potentially
    multiple devices in parallel

  83. The requirements in this section are all about supporting
      assistive technology, which doesn't fit with the title or
      introduction to this section ("Requirements on the parallel use of
      alternate content on potentially multiple devices in parallel").
      The title and intro should be changed.
Received on Tuesday, 13 July 2010 19:45:47 UTC
