
Re: Categorization of media a11y requirements

From: Maciej Stachowiak <mjs@apple.com>
Date: Fri, 05 Nov 2010 10:27:28 +0100
Cc: Geoff Freed <geoff_freed@wgbh.org>, Frank Olivier <Frank.Olivier@microsoft.com>, "public-html@w3.org" <public-html@w3.org>, "public-html-a11y@w3.org" <public-html-a11y@w3.org>
Message-id: <99573A06-06A1-42B8-9A37-70F01DFE26E0@apple.com>
To: Silvia Pfeiffer <silviapfeiffer1@gmail.com>

We don't necessarily need super short all-caps tags. Here are some possible names in plain language:

HTML5 Spec Requirement
HTML5 Spec Requirement (already addressed)
Timed Text Format Requirement
UA Guidelines Requirement


On Nov 5, 2010, at 3:11 AM, Silvia Pfeiffer wrote:

> I actually used TTEXT (notice the 'E') and I first thought about TEXT,
> but it needed something with time in it. How about CUE?
> Silvia.
> On Fri, Nov 5, 2010 at 12:56 PM, Geoff Freed <geoff_freed@wgbh.org> wrote:
>> Hi, Frank and Silvia:
>> I actually suggest that you *not* call it TTXT because, ironically, TTXT is
>> already in use as the name of another timed-text format:
>>  http://gpac.sourceforge.net/doc_ttxt.php#ttxt .  Maybe call it TRACKDETAIL?
>> Geoff/NCAM
>> On 11/4/10 5:19 PM, "Silvia Pfeiffer" <silviapfeiffer1@gmail.com> wrote:
>> Maybe this classification is more useful in the Checklist [1] than the
>> one we have there currently for "types of technologies affected"?
>> I would, however, suggest calling the Text track format details TTEXT
>> and not TRACK, since track is now the keyword for the markup in HTML5.
>> [1] http://www.w3.org/WAI/PF/HTML/wiki/Media_Accessibility_Checklist
>> Cheers,
>> Silvia.
>> On Fri, Nov 5, 2010 at 3:55 AM, Frank Olivier
>> <Frank.Olivier@microsoft.com> wrote:
>>> Re http://www.w3.org/2010/11/04-html-wg-minutes.html
>>> Members of HTML WG and media a11y reviewed the
>>> http://www.w3.org/WAI/PF/HTML/wiki/Media_Accessibility_Requirements
>>> requirements
>>> We sorted the requirements into the following categories:
>>> * UX: User agent user experience requirement
>>> * SPECNEW: New requirements for the HTML5 specification
>>> * SPECCED: Already in the HTML5 specification
>>> * TRACK: Text track format detail - something that should be specced
>>>   in the track format, not the HTML spec
>>> * NO: Not an issue we will address in the W3C
>>> Of the 119 requirements in the document:
>>> SPECNEW   11 items (9% of total)
>>> SPECCED   21 (18%)
>>> TRACK     24 (20%)
>>> NO         4 (3%)
>>> UX        73 (61%)
>>> Detailed list:
>>> NO: (DV-9) Allow the author to use a codec which is optimized for voice
>>> only, rather than requiring the same codec as the original soundtrack. Does
>>> not seem like a UA issue.
>>> NO: (KA-3) The author would be able to choose any/all of the controls,
>>> skin them and position them. Needs discussion; kill.
>>> NO: (KA-5) The scripted and native controls must go through the same
>>> platform-level accessibility framework (where it exists), so that a user
>>> presented with the scripted version is not shut out from some expected
>>> behavior.
>>> NO: (PP-1) This necessitates a clear and unambiguous declared format, so
>>> that existing authoring tools can be configured to export finished files in
>>> the required format.
>>> SPECCED, UX: (CA-2) Support the synchronisation of multitrack audio either
>>> within the same file or from separate files - preferably both.
>>> SPECCED: (API-1) The existence of alternative-content tracks for a media
>>> resource must be exposed to the user agent.
>>> SPECCED: (API-2) Since authors will need access to the alternative content
>>> tracks, the structure needs to be exposed to authors as well, which requires
>>> a dynamic interface.
>>> SPECCED: (API-3) Accessibility APIs need to gain access to alternative
>>> content tracks no matter whether those content tracks come from within a
>>> resource or are combined through markup on the page.
>>> SPECCED: (CA-1) Support clear audio as a separate, alternative audio track
>>> from other audio-based alternative media resources.
>>> SPECCED: (CC-22) Support captions that are provided inside media resources
>>> as tracks, or in external files.
>>> SPECCED: (CC-26) Support multiple tracks of foreign-language subtitles in
>>> different languages.
>>> SPECCED: (CC-27) Support live-captioning functionality. Addressed via API
>>> SPECCED: (CN-4) Support third-party provided structural navigation markup.
>>> SPECCED: (CNS-4) Producers and authors may optionally provide additional
>>> access options to identified structures, such as direct access to any node
>>> in a table of contents. [May be done with cue range]
>>> SPECCED: (DAC-4) Synchronized alternatives for time-based media (e.g.,
>>> captions, descriptions, sign language) can be rendered at the same time as
>>> their associated audio tracks and visual tracks (UAAG 2.0 3.1.3).
>>> SPECCED: (DV-3) Support multiple description tracks (e.g., discrete tracks
>>> containing different levels of detail).
>>> SPECCED: (DV-4) Support recordings of real human speech as a track of the
>>> media resource, or as an external file.
>>> SPECCED: (KA-3) All functionality available to native controls must also
>>> be available to scripted controls.
>>> SPECCED: (PP-1) Support existing production practice for alternative
>>> content resources, in particular allow for the association of separate
>>> alternative content resources to media resources. Browsers cannot support
>>> all forms of time-stamp formats out there, just as they cannot support all
>>> forms of image formats (etc.).
>>> SPECCED: (PP-4) Typically, alternative content resources are created by
>>> different entities to the ones that create the media content. They may even
>>> be in different countries and not be allowed to re-publish the other one's
>>> content. It is important to be able to host these resources separately,
>>> associate them together through the Web page author, and eventually play
>>> them back synchronously to the user.
>>> SPECCED: (SL-4) Support multiple sign-language tracks in several sign
>>> languages.
>>> SPECCED: (T-1) Support the provisioning of a full text transcript for the
>>> media asset in a separate but linked resource, where the linkage is
>>> programmatically accessible to AT.
>>> SPECNEW, SPECCED: (SL-1) Support sign-language video either as a track as
>>> part of a media resource or as an external file.
>>> SPECNEW, SPECCED: (SL-2) Support the synchronized playback of the
>>> sign-language video with the media resource.
>>> SPECNEW, TRACK: (CC-5) Support positioning in all parts of the screen -
>>> either inside the media viewport but also possibly in a determined space
>>> next to the media viewport. This is particularly important when multiple
>>> captions are on screen at the same time and relate to different speakers, or
>>> when in-picture text is avoided.
>>> SPECNEW, TRACK: (CN-1) Provide a means to structure media resources so
>>> that users can navigate them by semantic content structure.
>>> SPECNEW, UX: (CC-25) Support edited and verbatim captions, if available.
>>> SPECNEW, UX: (DV-8) Allow the author to provide fade and pan controls to
>>> be accurately synchronized with the original soundtrack.
>>> SPECNEW: (CN-3) Support both global navigation by the larger structural
>>> elements of a media work, and also the most localized atomic structures of
>>> that work, even though authors may not have marked-up all levels of
>>> navigational granularity.
>>> SPECNEW: (CN-6) Support direct access to any structural element, possibly
>>> through URIs. [Media fragment-like issue]
>>> SPECNEW: (CNS-1) All identified structures, including ancillary content as
>>> defined in "Content Navigation" above, must be accessible with the use of
>>> "next" and "previous," as refined by the granularity control. [May be
>>> handled with cue ranges]
>>> SPECNEW: (DAC-2) The user has a global option to specify which types of
>>> alternative content to render by default and, in cases where the alternative content
>>> has different dimensions than the original content, how the layout/reflow of
>>> the document should be handled. (UAAG 2.0 3.1.2). [Probably minimal spec
>>> text required: Media queries would work nicely here; also UX issue (user
>>> sets media query to match)]
>>> SPECNEW: (DAC-5) Non-synchronized alternatives (e.g., short text
>>> alternatives, long descriptions) can be rendered as replacements for the
>>> original rendered content (UAAG 2.0 3.1.3).
>>> TRACK, UX: (CC-16) Use conventions that include inserting left-to-right
>>> and right-to-left segments within a vertical run (e.g. Tate-chu-yoko in
>>> Japanese), when rendered as text in a top-to-bottom oriented language.
>>> TRACK, UX: (CC-19) Present the full range of typographical glyphs, layout
>>> and punctuation marks normally associated with the natural language's
>>> print-writing system.
>>> TRACK, UX: (CC-21) Permit the distinction between different speakers.
>>> TRACK, UX: (ECC-2) Support hyperlinks and other activation mechanisms for
>>> supplementary data for (sections of) caption text.
>>> TRACK, UX: (ECC-3) Support text cues that may be longer than the time
>>> available until the next text cue and thus provide overlapping text cues.
>>> TRACK, UX: (ECC-4) It needs to be possible to define timed text cues that
>>> are allowed to overlap with each other in time and be present on screen at
>>> the same time
>>> TRACK: (CC-10) Render a background in a range of colors, supporting a full
>>> range of opacities.
>>> TRACK: (CC-11) Render text in a range of colors.
>>> TRACK: (CC-14) Allow the use of mixed display styles-- e.g., mixing
>>> paint-on captions with pop-on captions-- within a single caption cue or in
>>> the caption stream as a whole.
>>> TRACK: (CC-17.1) Represent content of different natural languages. In some
>>> cases the inclusion of a few foreign words form part of the original
>>> soundtrack, and thus need to be in the same caption resource.
>>> TRACK: (CC-18) Represent content of at least those specific natural
>>> languages that may be represented with [Unicode 3.2], including common
>>> typographical conventions of that language (e.g., through the use of
>>> furigana and other forms of ruby text).
>>> TRACK: (CC-2) Allow the author to specify erasures, i.e., times when no
>>> text is displayed on the screen (no text cues are active).
>>> TRACK: (CC-20) Permit in-line mark-up for foreign words or phrases.
>>> TRACK: (CC-3) Allow the author to assign timestamps so that one
>>> caption/subtitle follows another, with no perceivable gap in between.
>>> TRACK: (CC-4) Be available in a text encoding.
>>> TRACK: (CC-8) Allow the author to specify line breaks.
>>> TRACK: (CC-9) Permit a range of font faces and sizes.
>>> TRACK: (CN-2) The navigation track should provide for hierarchical
>>> structures with titles for the sections.
>>> TRACK: (DV-14) Support metadata, such as copyright information, usage
>>> rights, language, etc.
>>> TRACK: (ECC-1) Support metadata markup for (sections of) timed text cues.
>>> TRACK: (PP-2) Support the association of authoring and rights metadata
>>> with alternative content resources, including copyright and usage
>>> information. [Move to ATAG?]
>>> TRACK: (PP-3) Support the simple replacement of alternative content
>>> resources even after publishing.
>>> UX, SPECCED: (MD-5) If the user can modify the state or value of a piece
>>> of content through the user interface (e.g., by checking a box or editing a
>>> text area), the same degree of write access is available programmatically
>>> (UAAG 2.0 2.1.5).
>>> UX: (CA-3) Support separate volume control of the different audio tracks.
>>> UX: (CC-1) Render text in a time-synchronized manner, using the media
>>> resource as the timebase master.
>>> UX: (CC-12) Enable rendering of text with a thicker outline or a drop
>>> shadow to allow for better contrast with the background.
>>> UX: (CC-13) Where a background is used, it is preferable to keep the
>>> caption background visible even in times where no text is displayed, such
>>> that it minimises distraction. However, where captions are infrequent the
>>> background should be allowed to disappear to enable the user to see as much
>>> of the underlying video as possible.
>>> UX: (CC-15) Support positioning such that the lowest line of captions
>>> appears at least 1/12 of the total screen height above the bottom of the
>>> screen, when rendered as text in a right-to-left or left-to-right language
>>> UX: (CC-17.2) Also allow for separate caption files for different
>>> languages and on-the-fly switching between them. This is also a requirement
>>> for subtitles.
>>> UX: (CC-23) Ascertain that captions are displayed in sync with the media
>>> resource.
>>> UX: (CC-24) Support user activation/deactivation of caption tracks.
>>> UX: (CC-6) Support the display of multiple regions of text simultaneously.
>>> UX: (CC-7) Display multiple rows of text when rendered as text in a
>>> right-to-left or left-to-right language.
>>> UX: (CN-10) Support that in bilingual texts both the original and
>>> translated texts can appear on screen, with both the original and translated
>>> text highlighted, line by line, in sync with the audio narration.
>>> UX: (CN-5) Keep all content representations in sync, so that moving to any
>>> particular structural element in media content also moves to the
>>> corresponding point in all provided alternate media representations
>>> (captions, described video, transcripts, etc) associated with that work.
>>> UX: (CN-7) Support pausing primary content traversal to provide access to
>>> such ancillary content in line.
>>> UX: (CN-8) Support skipping of ancillary content in order to not interrupt
>>> content flow.
>>> UX: (CN-9) Support access to each ancillary content item, including with
>>> "next" and "previous" controls, apart from accessing the primary content of
>>> the title.
>>> UX: (CNS-2) Users must be able to discover, skip, play-in-line, or
>>> directly access ancillary content structures.
>>> UX: (CNS-3) Users need to be able to access the granularity control using
>>> any input mode, e.g. keyboard, speech, pointer, etc.
>>> UX: (DAC-1) The user has the ability to have indicators rendered along
>>> with rendered elements that have alternative content (e.g., visual icons
>>> rendered in proximity of content which has short text alternatives, long
>>> descriptions, or captions). In cases where the alternative content has
>>> different dimensions than the original content, the user has the option to
>>> specify how the layout/reflow of the document should be handled. (UAAG 2.0
>>> 3.1.1).
>>> UX: (DAC-3) The user can browse the alternatives and switch between them.
>>> UX: (DAC-6) Provide the user with the global option to configure a cascade
>>> of types of alternatives to render by default, in case a preferred
>>> alternative content type is unavailable.
>>> UX: (DAC-7) During time-based media playback, the user can determine which
>>> tracks are available and select or deselect tracks. These selections may
>>> override global default settings for captions, descriptions, etc. (UAAG 2.0
>>> 4.9.8)
>>> UX: (DAC-8) Provide the user with the option to load time-based media
>>> content such that the first frame is displayed (if video), but the content
>>> is not played until explicit user request.
>>> UX: (DV-1) Provide an indication that descriptions are available, and are
>>> active/non-active.
>>> UX: (DV-10) Allow the user to select from among different languages of
>>> descriptions, if available, even if they are different from the language of
>>> the main soundtrack.
>>> UX: (DV-11) Support the simultaneous playback of both the described and
>>> non-described audio tracks so that one may be directed at separate outputs
>>> (e.g., a speaker and headphones).
>>> UX: (DV-12) Provide a means to prevent descriptions from carrying over
>>> from one program or channel when the user switches to a different program or
>>> channel.
>>> UX: (DV-13) Allow the user to relocate the description track within the
>>> audio field, with the user setting overriding the author setting. The
>>> setting should be re-adjustable as the media plays.
>>> UX: (DV-2) Render descriptions in a time-synchronized manner, using the
>>> media resource as the timebase master.
>>> UX: (DV-6) Allow the user to independently adjust the volumes of the audio
>>> description and original soundtracks, with the user's settings overriding
>>> the author's.
>>> UX: (DV-7) Permit smooth changes in volume rather than stepped changes.
>>> The degree and speed of volume change should be under provider control.
>>> UX: (ECC-5) Allow users to define the reading speed and thus define how
>>> long each text cue requires, and whether media playback needs to pause
>>> sometimes to let them catch up on their reading.
>>> UX: (EVD-1) Support detailed user control as specified in (TVD-4) for
>>> extended video descriptions.
>>> UX: (EVD-2) Support automatically pausing the video and main audio tracks
>>> in order to play a lengthy description.
>>> UX: (EVD-3) Support resuming playback of video and main audio tracks when
>>> the description is finished.
>>> UX: (KA-1) Support operation of all functionality via the keyboard on
>>> systems where a keyboard is (or can be) present (Needs better text), and
>>> where a unique focus object is employed. This does not forbid and should not
>>> discourage providing mouse input or other input methods in addition to
>>> keyboard operation. (UAAG 2.0 4.1.1)
>>> UX: (KA-2) Support a rich set of native controls for media operation,
>>> including but not limited to play, pause, stop, jump to beginning, jump to
>>> end, scale player size
>>> UX: (KA-4) It must always be possible to enable native controls regardless
>>> of the author preference to guarantee that such functionality is available
>>> UX: (MD-2) Ensure accessibility of all user-interface components including
>>> the user interface, rendered content, and alternative content; make
>>> available the name, role, state, value, and description via a
>>> platform-accessibility architecture. (UAAG 2.0 2.1.2)
>>> UX: (MD-3) If a feature is not supported by the accessibility
>>> architecture(s), provide an equivalent feature that does support the
>>> accessibility architecture(s). Document the equivalent feature in the
>>> conformance claim. (UAAG 2.0 2.1.3)
>>> UX: (MD-4) If the user agent implements one or more DOMs, they must be
>>> made programmatically available to assistive technologies. (UAAG 2.0 2.1.4)
>>> This assumes the video element will write to the DOM.
>>> UX: (MD-6) If any of the following properties are supported by the
>>> accessibility-platform architecture, make the properties available to the
>>> accessibility-platform architecture
>>> UX: (MD-7) Ensure that programmatic exchanges between APIs proceed at a
>>> rate such that users do not perceive a delay. (UAAG 2.0 2.1.7).
>>> UX: (SL-3) Support the display of sign-language video either as
>>> picture-in-picture or alpha-blended overlay, as parallel video, or as the
>>> main video with the original video as picture-in-picture or alpha-blended
>>> overlay.
>>> UX: (SL-5) Support the interactive activation/deactivation of a
>>> sign-language track by the user.
>>> UX: (T-2) Support the provisioning of both scrolling and static display of
>>> a full text transcript with the media resource, e.g. in an area next to the
>>> video or underneath the video, which is also AT accessible.
>>> UX: (TSM-1) The user can adjust the playback rate of the time-based media
>>> tracks to between 50% and 250% of real time.
>>> UX: (TSM-2) Speech whose playback rate has been adjusted by the user
>>> maintains pitch in order to limit degradation of the speech quality.
>>> UX: (TSM-3) All provided alternative media tracks remain synchronized
>>> across this required range of playback rates.
>>> UX: (TSM-4) The user agent provides a function that resets the playback
>>> rate to normal (100%).
>>> UX: (TSM-5) The user can stop, pause, and resume rendered audio and
>>> animation content (including video and animated images) that last three or
>>> more seconds at their default playback rate.
>>> UX: (TVD-1) Support presentation of text video descriptions through a
>>> screen reader or braille device, with playback speed control and voice
>>> control and synchronisation points with the video.
>>> UX: (TVD-2) TVDs need to be provided in a format that contains the
>>> following information: (A) start time, text per description cue (the
>>> duration is determined dynamically, though an end time could provide a cut
>>> point)
>>> UX: (TVD-3) Where possible, provide a text or separate audio track
>>> privately to those that need it in a mixed-viewing situation, e.g., through
>>> headphones.
>>> UX: (TVD-4) Where possible, provide options for authors and users to deal
>>> with the overflow case: continue reading, stop reading, and pause the video.
>>> (One solution from a user's point of view may be to pause the video and
>>> finish reading the TVD, for example.) User preference should override
>>> authored option.
>>> UX: (TVD-5) Support the control over speech-synthesis playback speed,
>>> volume and voice, and provide synchronisation points with the video.
>>> UX: (VP-1) It must be possible to deal with three different cases for the
>>> relation between the viewport size, the position of media and of alternative
>>> content:
>>> UX: (VP-2) The user can change the following characteristics of visually
>>> rendered text content, overriding those specified by the author or
>>> user-agent defaults
>>> UX: (VP-3) Provide the user with the ability to adjust the size of the
>>> time-based media up to the full height or width of the containing viewport,
>>> with the ability to preserve aspect ratio and to adjust the size of the
>>> playback viewport to avoid cropping, within the scaling limitations imposed
>>> by the media itself.
>>> UX: (VP-4) Provide the user with the ability to control the contrast and
>>> brightness of the content within the playback viewport.
>>> UX: (VP-5) Captions and subtitles traditionally occupy the lower third of
>>> the video, where controls are also usually rendered.
>>> UX: [In that this is a user agent issue] (MD-1) Support a
>>> platform-accessibility architecture relevant to the operating environment.
>>> (UAAG 2.0 2.1.1)
>>> UX: (CA-4) Support pre-emphasis filters, pitch-shifting, and other
>>> audio-processing algorithms.
>>> UX: (DV-5) Allow the author to independently adjust the volumes of the
>>> audio description and original soundtracks. [Actually a requirement on the
>>> media format]
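
The percentages in the quoted tally add up to more than 100% because some requirements carry two category tags (for example "SPECNEW, TRACK"). A minimal sketch of that arithmetic follows, using only the counts reported in the quoted message. The names TOTAL_REQUIREMENTS, counts and percent_of_total are illustrative and do not come from the thread.

```python
# Counts as reported in the quoted message: tags per category,
# measured against the 119 requirements in the document.
TOTAL_REQUIREMENTS = 119
counts = {"SPECNEW": 11, "SPECCED": 21, "TRACK": 24, "NO": 4, "UX": 73}


def percent_of_total(count, total=TOTAL_REQUIREMENTS):
    """Share of requirements carrying a tag, rounded to a whole percent."""
    return round(100 * count / total)


# Per-category share, matching the rounded figures in the message.
shares = {tag: percent_of_total(n) for tag, n in counts.items()}

# Total number of tags assigned; this exceeds the number of requirements
# whenever a requirement is listed under more than one category.
tag_total = sum(counts.values())
extra_tags = tag_total - TOTAL_REQUIREMENTS

for tag, n in counts.items():
    print(f"{tag:<8} {n:>3} ({shares[tag]}%)")
print(f"tags assigned: {tag_total}, requirements: {TOTAL_REQUIREMENTS}, "
      f"extra tags: {extra_tags}")
```

The 14 extra tags appear to correspond to the 14 entries in the detailed list that are filed under two categories, so the per-category figures are best read as "share of requirements touching this category" rather than as a partition of the 119 items.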
Received on Friday, 5 November 2010 09:28:08 UTC
