- From: Greg Lowney <gcl-0039@access-research.org>
- Date: Tue, 13 Jul 2010 12:45:15 -0800
- To: WAI-UA list <w3c-wai-ua@w3.org>
- Message-ID: <4C3CD05B.3040501@access-research.org>
As requested, here are some comments on the Media Accessibility Requirements document, for those working on it. Most are minor editorial issues but there are a few significant ones.

Comments on Media Accessibility Requirements
http://www.w3.org/WAI/PF/HTML/wiki/Media_Accessibility_Requirements
Last modified 2010-06-16

(I've used an asterisk to mark comments that seem wide in scope, rather than merely about one particular item.)

General

1. I noticed that in two places the document skips heading levels, and that the navigation links at the bottom are headings, which doesn't seem appropriate. (The Firefox extension HeadingsMap highlights these discrepancies.)

2. Link to definitions for screen reader, AT, etc.?

3. * Making things available to AT is explicitly required in a very few instances (e.g. Transcripts), which doesn't seem like an intentional choice. The two sections devoted to AT compatibility also call out a few requirements, making it all rather confusing. I suggest making it a more general requirement applying to almost everything (e.g. AT access to caption text and its formatting, hyperlinks, etc.).

4. * In general it doesn't distinguish expected steps (such as keyboard access and customizable color) from steps that would be going above and beyond core expectations (such as most of the steps listed for Autism). This could really mislead and turn off readers who interpret these as unrealistic expectations for most media. Are all listed requirements really deserving of core-level status?

5. * In general it could benefit from forward references to related sections further down in the document.

Accessible media requirements by type of disability

6. Re dyslexia: mention synchronized highlighting of phrases in text with audio. (A small sketch of the behavior I mean follows comment 11.)

7. Why capitalize "Communication, Social Interaction, and Repetitive Behaviors"?

8. Incomplete sentence: "Since individuals on the autism spectrum can be quite visual and learn effectively from social stories."

9. In "Dexterity / Mobility impairment" it should be noted that many users rely on AT such as on-screen keyboards or speech recognition.

10. In "Accessible media requirements by type of disability" I'd add a section explaining that many people have multiple disabilities, and that while deaf-blind is one category, there are others not specifically called out here. Users may, for example, have low vision and difficulty typing.

11. The term "Sensory disability users" isn't used very much, and might be considered less politically correct than "users with sensory disabilities".
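To make comment 6 concrete, here is a minimal sketch of synchronized phrase highlighting, assuming a hypothetical list of timed phrase cues and a page that wraps each transcript phrase in its own span; the cue data, ids, and class names are purely illustrative, not part of any existing format:

```typescript
// Hypothetical phrase cues: start/end times in seconds plus the id of a
// <span> wrapping the corresponding phrase in the page's transcript text.
interface PhraseCue { start: number; end: number; spanId: string; }

const cues: PhraseCue[] = [
  { start: 0.0, end: 2.4, spanId: "phrase-1" },
  { start: 2.4, end: 5.1, spanId: "phrase-2" },
];

const audio = document.querySelector("audio") as HTMLAudioElement;

// On each timeupdate, highlight the phrase whose time range contains the
// current playback position and clear the highlight from the others.
audio.addEventListener("timeupdate", () => {
  const t = audio.currentTime;
  for (const cue of cues) {
    const span = document.getElementById(cue.spanId);
    if (span) {
      span.classList.toggle("highlight", t >= cue.start && t < cue.end);
    }
  }
});
```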
Audio Description: Voiced, Texted, and Extended

12. * Consider grouping AD, TAD, and EAD in a single section on audio descriptions, because they have a lot of overlap, are closely related, and having three headings disproportionately emphasizes them over other technologies that get a single section. Same with captioning and extended captioning, and the two sections on AT compatibility.

13. "They are written to convey objective information (e.g., a yellow flower) rather than subjective judgments (e.g., a beautiful flower)" may be correct but seems odd to me. I'm sure the script called for a beautiful flower, rather than merely a yellow one, and the beauty is what the writer and director were trying to convey, so it seems strange to actively avoid conveying it.

14. These two bullet items seem redundant: "Closed descriptions can be recorded as a separate track containing descriptions only, timed to play at specific spots in the timeline and played in parallel with the program-audio track."; "Some audio descriptions can be given as a separate audio channel mixed in at the player." Are "track" and "channel" used here as technical terms for different things, or is it just a linguistic choice?

15. "Audio description is available...; however regulation in the U.S. and Europe is increasingly focusing on description..." I think you mean "and" rather than "however".

16. The term "audio/video descriptions" seems misleading, as (to me at least) it sounds like it's discussing both audio descriptions of visual content (e.g. a second audio track) and visual descriptions of visual content (e.g. displayed text).

17. Re list introductions like "Systems supporting audio/video descriptions that are not open must", is it stated anywhere that systems that provide audio are required to support audio/video descriptions?

18. "(AD-1) Provide an indication that descriptions are available, and are active/non-active." seems useful but not necessarily a core requirement. I believe that most television viewers who try closed captions are used to just turning them on and waiting to see whether any captions are actually displayed, which is actually more convenient than requesting a display telling them whether there are captions and only then turning on the caption display.

19. "The degree and speed of volume change should be under provider control": what is meant by "provider" in this case? The term hasn't been used in the discussion thus far.

20. "(AD-8) Allow the author to provide fade and pan controls to be accurately synchronized with the original soundtrack." is not really enough information for novices like me. You might want to elaborate on the goal. Is it to have the description sound like the narrator is standing in the same location as the object being described? Also, one difference between AD-8 and AD-13 is that the former is all about author control, whereas the latter gives control to both, but still fails to specify that the user preference should override the author preference.

21. Is there supposed to be another document or section that would go into more detail on these requirements? Quite a few of them seem too high-level to be useful; for example, "(TAD-1) Support presentation of texted audio description through a screen-reader or braille device with playback speed control and voice control and synchronization points with the video."

22. "(AD-10) Allow the user to select from among different languages of descriptions, if available, even if they are different from the language of the main soundtrack." I'd add "or from the general system language setting.", for example choosing audio descriptions in your native Farsi even if you're using English as your operating system's primary language and listening to a film with Japanese audio.

23. I suggest adding something early on letting readers know that additional, advanced features are discussed in separate sections below. For example, when I first read this I noted that it lacked allowing the audio description track (speech or text) to pause and resume the media with which it's synchronized. (For example, when all viewers of a video want the descriptions, the user might choose a descriptive track that pauses the normal content in order to insert more detailed descriptions than could fit in the main content's normal gaps.) I wrote a comment about it, only later to find it was included in a separate section. (A rough sketch of the pause-and-resume behavior follows this comment.)
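For comment 23, a rough sketch of what I mean, assuming a hypothetical pre-recorded description clip that is too long for the natural gap; the element ids and cue time are illustrative only, and this is not meant to suggest how the requirements document expects it to be implemented:

```typescript
// The primary media and a separate, pre-recorded extended description clip.
const program = document.getElementById("program") as HTMLVideoElement;
const description = document.getElementById("description") as HTMLAudioElement;

// Hypothetical cue: the time at which the extended description should start.
const descriptionStartTime = 42.0;
let played = false;

// When playback reaches the cue, pause the program, play the description,
// and resume the program only when the description has finished.
program.addEventListener("timeupdate", () => {
  if (!played && program.currentTime >= descriptionStartTime) {
    played = true;
    program.pause();
    void description.play();
  }
});

description.addEventListener("ended", () => {
  void program.play();
});
```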
24. Shouldn't the audio description requirements (or recommendations) include the user's ability to omit the video altogether, leaving only the normal audio and descriptions?

25. "Texted audio descriptions are provided as text files with a start time for a description cue." It would help to mention any standardized formats used for this purpose.

26. Compare and contrast "(TAD-3) Where possible, support to present a text or separate audio track privately to those that need it in a mixed-viewing situation, e.g. through headphones." vs. "(AD-11) Support the simultaneous playback of both the described and non-described audio tracks so that one may be directed at separate outputs (e.g., a speaker and headphones)." The key differences aren't conveyed clearly.

27. "(TAD-4) Where possible, support for different options for authors & users to deal with the overflow case: continue reading, stop reading, and pause the video. Pause the primary audio and video. The preferred solution from a user POV is to pause the video and finish reading out the TAD." Consider rephrasing as "pause the /primary audio and video/ until the TAD catches up."

28. In the discussion of texted audio description, you might want to clarify that every time you say "video" you of course mean both the primary video and audio content.

29. Reading top to bottom, I kept thinking that the document overlooked variations until I encountered them further down. I would recommend that the introduction to audio descriptions mention that subsequent sections will discuss basic audio descriptions, texted audio descriptions, and extended audio descriptions. Similarly, the discussions of AD and TAD might allude to the fact that sometimes the descriptions are too long for the pauses, and refer the reader to the section on extended audio descriptions below.

30. EAD-2 (automatically pausing) would be impractical without EAD-3 (automatically resuming), so you might just combine them.

31. TAD-4 and EAD-1 blur the boundary between TAD and EAD. If a system supports TAD-4 it supports EAD, so you might take out TAD-4 and refer the reader to the EAD section.

32. The EAD section might explicitly say it applies to both AD and TAD.

Clear Audio (CA)

33. "(CA-4) Potentially support pre-emphasis filers" I think you meant "filters".

Content Navigation by Content Structure (CN)

34. "Short music selections tend to have versus and repeating choruses" I think you meant "verses".

35. In the section on structured navigation, your discussion of h1 isn't what I would have expected. In HTML documents, h1 is normally the title of the current document, regardless of the scope of that document. For example, an online book would typically be divided into multiple pages, and the h1 for the main page would be the title of the book, while the h1 for a chapter would be the title of the chapter, and if you delve more deeply and reach a page for a section, its h1 would be the title of that section. Thus, where you say "In a news broadcast, the most global level (analogous to <h1>) might be 'News, Weather, and Sports.'" I would have expected the h1 equivalent to be more like "KIRO 7 Eyewitness News at 5PM". (A sketch of the hierarchy I picture follows this comment.)
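To show how I picture the hierarchy in comment 35 (and the navigable table of contents I ask about below), here is an illustrative sketch; the data structure and titles are mine, not anything the document defines:

```typescript
// Illustrative navigation points for the news-broadcast example, where the
// top level is the programme title and deeper levels are its segments.
interface NavPoint {
  title: string;
  start: number;           // seconds into the media resource
  children?: NavPoint[];
}

const toc: NavPoint = {
  title: "KIRO 7 Eyewitness News at 5PM",   // the <h1> equivalent
  start: 0,
  children: [
    { title: "News",    start: 0 },
    { title: "Weather", start: 900 },
    { title: "Sports",  start: 1200 },
  ],
};

// A player could offer "next/previous at this level" commands over such a
// tree, or render it as a navigable table of contents.
```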
"Audio productions of 'The Divine Comedy' may well include reproductions of famous frescoes or paintings interspersed throughout the text", did you mean video or multimedia productions? I don't expect many audio productions to reproduce the frescoes and paintings :-) 37. "Nowadays, these programs are based on the ANSI/NISO Z39.86 specifications." You might say "ANSI/NISO Z39.86 (DAISY) specifications" in order to reference its commonly-used friendly name. 38. In the introduction to structured navigation, the final two paragraphs (UAAG references) seem entirely out of place. 39. In some places the document interleaves requirements for authoring tools (e.g. CN-1) with requirements for content players (e.g. CN-2), which is a little confusing. 40. I think I could figure out what "transport bar" means, but then two paragraphs later "navigation track" comes along and I'm not sure what the difference would be. 41. Shouldn't structural navigation requirements include providing the user with a navigable table of contents? 42. "(CN-1) Generally, provide accessible keyboard controls for navigating a media resource in lieu of clicking on the transport bar need to be available, e.g. 5sec forward/back, 30sec forward/back, beginning, end" is in the h3 section titled "Content Navigation by Content Structure" but isn't about navigating by structure, nor does it fit in the larger h2 section "Alternative Content Technologies". 43. If you were going to include CN-1 saying that content navigation controls need to be keyboard accessible, that would imply that all sections discussing user input needs to have a similar requirement for keyboard access. Seems better just to refer readers to the section on keyboard access which requires it for /everything/, and perhaps provide a non-exhaustive list of instances you think they might overlook. 44. * Seems odd that there are a lot of things here that I don't believe are in UAAG. For example, CN-9 requires the user be able to skip or filter out ancillary content such as sidebars, but I don't believe UAAG20 requires that Web browsers allow the user to exclude such things from the keyboard navigation or voicing order. Captioning (CC) 45. "Captions are always written in the same language as the main audio track." And yet, I've not seen DVD or set-top boxes distinguish between same-language and different-language captions. Also, you should discuss here the use of foreign language captions, rather than only mentioning them in the lead-in sentence for the requirements. Also, CC-26 explicitly acknowledges that there can be be multiple tracks of captions in different languages. 46. "Closed captions are transmitted as data along with the video...", wouldn't the category of closed captions also include captions that are pulled down only on demand, possibly from another source entirely, rather than transmitted with the video? Or is there another term for that? 47. "...turn them on, usually by invoking an on-screen control or menu selection" or a dedicated physical button such as on a remote control. 48. "Open captions are always visible; they have been merged with the video track and cannot be turned off." Except by selecting a different video track. 49. Interesting to note that while users of closed captions may prefer verbatim text, operas are usually supertitled using shortened versions of the libretto, to make it easier for readers to follow along without spending too much time reading each line. This is true even of same-language supertitles. 50. 
Captioning (CC)

45. "Captions are always written in the same language as the main audio track." And yet, I've not seen DVD players or set-top boxes distinguish between same-language and different-language captions. Also, you should discuss the use of foreign-language captions here, rather than only mentioning them in the lead-in sentence for the requirements. Also, CC-26 explicitly acknowledges that there can be multiple tracks of captions in different languages.

46. "Closed captions are transmitted as data along with the video...": wouldn't the category of closed captions also include captions that are pulled down only on demand, possibly from another source entirely, rather than transmitted with the video? Or is there another term for that?

47. "...turn them on, usually by invoking an on-screen control or menu selection" or a dedicated physical button such as on a remote control.

48. "Open captions are always visible; they have been merged with the video track and cannot be turned off." Except by selecting a different video track.

49. It's interesting to note that while users of closed captions may prefer verbatim text, operas are usually supertitled using shortened versions of the libretto, to make it easier for readers to follow along without spending too much time reading each line. This is true even of same-language supertitles.

50. As noted above, it's confusing to first mention subtitles and foreign-language subtitles in the lead-in to the requirements, without introducing the concepts or clarifying that they'd use the same technologies as same-language captions.

51. * "(CC-10) Render a background in a range of colors, supporting a full range of opacities." With this and several similar requirements, do you want to clarify that the caption author should be able to specify a background color, or do you feel it would be acceptable for the player to choose what it considers a background appropriate for the text color and video background? Should the user be able to override caption attributes such as these?

52. There are several requirements for horizontal languages without corresponding requirements for vertical languages. For example, should CC-15 or a parallel equivalent require that captions can be positioned at least a minimum distance from the side of the screen?

53. "(CC-21) Permit the distinction between different speakers." An example of one that requires more detail. For example, any system would allow one to prefix strings with the name of the speaker, and you already require the author to be able to put strings in different locations. Do you mean markup so the rendering agent can apply automatic, distinct formatting styles, or so that assistive technology examining the captions can convey the distinctions to users through other means? (A sketch of what such markup might enable follows comment 57.)

54. The lists titled "Formats for captions, subtitles or foreign-language subtitles must" and "Further, systems that support captions must" should probably use parallel construction, as I assume they both relate to all types of captions, including same-language and foreign-language, and regardless of whether they're formatted as subtitles or otherwise.

55. A number of items in the list "Formats for captions, subtitles or foreign-language subtitles must" seem to be discussing the systems that display the captions rather than the formats for specifying them. It may just be a matter of rewording a number of the items, such as changing "(CC-1) Render text in a time-synchronized manner, using the audio track as the timebase master." to "(CC-1) Allow the author to specify the time and duration at which text is displayed, using the audio track as the timebase master." and "(CC-11) Render text in a range of colors." to "(CC-11) Allow the author to specify colors for ranges of text."

56. The list titled "Further, systems that support captions must" should probably include one or more requirements to support the wide range of author-specified markup that caption formats are required to support. For example, having a caption format that allows the author to specify text color is wasted when a player ignores those settings.

57. Why is captioning the only section to distinguish requirements for data formats from requirements for rendering systems? Wouldn't that distinction apply just as much (or little) to audio description, sign language, etc.?
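For comment 53, the kind of markup-driven behavior I am asking about might look roughly like this; the cue structure, class names, and "visually-hidden" styling hook are hypothetical, not an existing caption format:

```typescript
// Hypothetical caption cue carrying the speaker as structured data rather
// than as part of the display text.
interface CaptionCue { speaker: string; text: string; }

// Author-supplied per-speaker style classes (names are illustrative).
const speakerStyles: Record<string, string> = {
  Alice: "caption-speaker-alice",
  Bob: "caption-speaker-bob",
};

function renderCue(cue: CaptionCue, region: HTMLElement): void {
  const line = document.createElement("span");
  line.className = speakerStyles[cue.speaker] ?? "caption-speaker-default";

  // Speaker name included as visually hidden text so that assistive
  // technology can convey who is speaking even when sighted viewers rely
  // only on colour or position to tell speakers apart.
  const who = document.createElement("span");
  who.className = "visually-hidden";
  who.textContent = `${cue.speaker}: `;

  line.append(who, document.createTextNode(cue.text));
  region.replaceChildren(line);
}
```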
"(SL-3) Support the display of sign language video either as picture-in-picture or alpha-blended overlay..." in these clauses the use of "or" leaves it ambiguous whether the system needs to support both methods and allow the author to choose, or whether the system is allowed to support only one of the options. Transcripts 61. I believe "Providing a full transcript is a good option in addition to, but not as a replacement for, timed captioning" conflicts with UAAG20 where we acknowledge situations where transcripts are more appropriate than synchronized captions. For example, transcripts are usually sufficient for pre-recorded audio-only media. 62. "A transcript can either be presented simultaneously with the media material, which can assist slower readers or those who need more time to reference context, but it should also be made available independently of the media." has inconsistent grammar: probably want to delete "either". 63. I would suggest avoiding the word "provisioning" because it's jargon and there are other terms that are more widely understood. Also, it's not used elsewhere in the document. System Requirements 64. This section could use an intro paragraph. I assume it's a catch-all for requirements that don't fit into a single alternative content technology, all of which were in the previous section. However, the term "system requirements" parallels that used under "Captioning" where it meant requirements for players as distinct from data formats, and that's confusing, especially since other sections such as that on assistive technology are certainly system requirements. Any catch-all section should probably be at the end rather than in the middle. Keyboard Access to interactive controls / menus 65. As noted above, it should be made clear that access through keyboards and keyboard emulators is not optional, despite the phrase "Systems supporting keyboard accessibility must..." 66. The phrase "interactive controls / menus" in the title is misleading, since it is not limited to things that are "interactive" as in having input and output, and "controls/menus" implies things with visual representation on the screen. For example, if a player supports navigation using mouse gestures, those should also all have keyboard equivalents. Granularity Level Control for Structural Navigation 67. "(CNS-3) This control must be input device agnostic." We don't talk about agnostic elsewhere, so might rephrase it. Since functionality needs to be available through the keyboard (already required by KA-1) this essentially says that all keyboard navigation commands need to also have equivalents for every other input device (e.g. pointing devices, and on some systems speech or gestures). Is that what you intended to require? 68. Isn't this entire section redundant to the content navigation section? Time Scale Modification 69. This is the first list of requirements that isn't scoped with "Systems supporting such and so must". Does it really rate being the only universal requirement? Production practice and resulting requirements No comments. Discovery and activation/deactivation of available alternative content by the user 70. Re "The user agent /can/ facilitate the discovery of alternative content by following the criteria", this is the first list of requirements to be described as optional, with the word "can" instead of "must". 71. Most of these requirements are already covered in their appropriate sections of the document. 72. "(DAC-3) The user can browse the alternatives, switch between them." 
Requirements on making properties available to the accessibility interface

73. This should refer the reader to the section on the assistive technology API further down in the document, or better yet, come after it.

74. "any media controls need to be connected to that API" should be "any media controls and text content need to be...".

75. "On self-contained products that do not support assistive technology, any menus in the content need to provide information in alternative formats": I'm skeptical of seeming to limit this to menus when it really means menus and other controls.

76. "make accessibility controls, such as the closed-caption toggle, as prominent as the volume or channel controls" As I commented on the 508 Refresh, while this is well intentioned, a quick review of remote controls for televisions and set-top boxes indicated that most if not all give special prominence to volume and channel controls, because they're probably the most commonly used controls. I don't think it is necessary for dedicated caption and video description controls to be equal in prominence, and thus tied for the most prominent controls on the device. This is especially true because many people who use captions will turn them on and leave them on, rather than toggling them frequently.

77. The sentences on remote controls and physical button layout don't really fit the section title ("making properties available to the accessibility interface").

78. Technically, the closed-system requirements don't fit the title either, but at least they thematically go with AT compatibility, so the title could be changed to better incorporate both.

79. I really can't understand any of the requirements in this section as they're currently written. For example, while "(API-1) Support to expose the alternative content tracks for a media resource to the user, i.e. to the browser" is clear with regard to alt text /when displayed or hidden/, and captions /as they're displayed/, what does it mean with regard to secondary audio tracks?

Requirements on the use of the viewport

80. Re "(VP-1) If alternative content has a different height or width to the media content, then the user agent will reflow the viewport.": this seems more relevant to the container than to the media per se. Is it talking about when video is replaced by description, or when a caption field is added below the video field?

81. If we're talking about containers hosting media, then that brings in a few additional requirements not yet listed here, such as the ability to move the keyboard focus into and out of media objects.

82. "(VP-5) Captions occupy traditionally the lower-third of the video - the use of this area for other controls or content needs to be avoided." This is one of the few "requirements" that is phrased as a recommendation.

Requirements on the parallel use of alternate content on potentially multiple devices in parallel

83. The requirements in this section are all about supporting assistive technology, which doesn't fit the title or introduction of this section ("Requirements on the parallel use of alternate content on potentially multiple devices in parallel"). The title and intro should be changed.
Received on Tuesday, 13 July 2010 19:45:47 UTC