
Re: [media] handling multitrack audio / video

From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
Date: Sat, 4 Dec 2010 07:31:47 +1100
Message-ID: <AANLkTimVCTuzPyxepC3FPkdw=q-FJwvS69hA2iaq4vpP@mail.gmail.com>
To: Maciej Stachowiak <mjs@apple.com>
Cc: Eric Carlson <eric.carlson@apple.com>, Geoff Freed <geoff_freed@wgbh.org>, HTML Accessibility Task Force <public-html-a11y@w3.org>, Frank Olivier <Frank.Olivier@microsoft.com>
On Sat, Dec 4, 2010 at 5:01 AM, Maciej Stachowiak <mjs@apple.com> wrote:
> I like the idea of expanding the "kind" attribute. I came up with the same idea yesterday while discussing the pros and cons of media queries with Eric. I think media queries are a poor fit for what we need to do to describe alternatives with built-in accessibility affordances. Media queries are designed to describe the rendering context so that content can adapt. In this case, we want to describe the media resource so that the UA can choose the best version according to user preferences. kind (particularly with a variant that can distinguish optional vs. burned-in affordances, and which can list multiple affordances) seems like a much more natural fit.
> I don't think the <switch> element really helps here, because the challenge is that we need best-fit, not first-fit behavior, and <switch> doesn't help with that at all relative to <source> selection.
> My suggestion for auxiliary media resources is to let the <track> element take an id instead of a src which points to an <audio> or <video>. That plus a kind label on the track can provide the needed linkage without complicated changes to content models.

This would also mean that where we link in a separate <video> element,
such as for sign language, that video element is positioned in the
usual CSS way, and we don't have to work out how to position something
like a <videotrack> subelement inside <video> or a massive
<switch>-like statement as before.

I think I like that idea.

Just to be a bit more concrete, let me try to make an example (using
language codes from

<video id="main" controls poster="video.png">
  <source src="v_main.ogg">
  <source src="v_main.webm">
  <source src="v_main.mp4">
  <track label="Chapters" kind="chapters" srclang="en" src="v_chapters_en.wsrt">
  <track label="Kapitel" kind="chapters" srclang="de" src="v_chapters_de.wsrt">
  <track label="Subtitles (en)" kind="subtitles" srclang="en"
  <track label="Untertitel (de)" kind="subtitles" srclang="de"
  <track label="Descriptions" kind="descriptions" srclang="en"
  <track label="Beschreibungen" kind="descriptions" srclang="de"
  <track label="Sign language" kind="signing" srclang="ase" id="#signing">
  <track label="Gebaerdensprache" kind="signing" srclang="gsg" id="#gebaerden">
</video>

<video id="signing">
  <source src="v_sign-ase.mp4"  type="video/mp4">
  <source src="v_sign-ase.webm" type="video/webm">
  <source src="v_sign-ase.ogv"  type="video/ogg">
</video>

<video id="gebaerden">
  <source src="v_sign-gsg.mp4"  type="video/mp4">
  <source src="v_sign-gsg.webm" type="video/webm">
  <source src="v_sign-gsg.ogv"  type="video/ogg">
</video>

<audio id="descriptions">
  <source src="a_desc-en.mp3" type="audio/mpeg">
  <source src="a_desc-en.ogg" type="audio/ogg">
</audio>

<audio id="beschreibungen">
  <source src="a_desc-de.mp3" type="audio/mpeg">
  <source src="a_desc-de.ogg" type="audio/ogg">
</audio>

We'd probably need to make the separate <video> elements display:none
unless user preferences or a user menu selection activates them (audio
without @controls is not displayed anyway). Maybe the sign language
videos all need to go into a single fixed-size div so that, when they
are displayed, they don't reposition other elements on the page. And
their playback positions need to stay in sync with the main video, of
course.
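
To make the show/hide and sync requirement a bit more tangible, here is
a rough sketch of what a script (or the UA itself) might do. It is only
an illustration: the function names syncAuxiliary and activateAuxiliary
and the 0.25 second drift tolerance are my own assumptions, not part of
any proposal. It relies only on the existing media element members
currentTime, paused, play() and pause().

```javascript
// Assumed tolerance before we re-seek; not taken from any spec.
const MAX_DRIFT_SECONDS = 0.25;

// Keep an auxiliary media element (e.g. the sign language <video>)
// aligned with the main one. Works on any object that exposes
// currentTime, paused, play() and pause().
function syncAuxiliary(main, aux, maxDrift = MAX_DRIFT_SECONDS) {
  // Re-seek only when the drift is noticeable, to avoid constant seeking.
  if (Math.abs(aux.currentTime - main.currentTime) > maxDrift) {
    aux.currentTime = main.currentTime;
  }
  // Mirror the play/pause state of the main resource.
  if (main.paused && !aux.paused) {
    aux.pause();
  } else if (!main.paused && aux.paused) {
    aux.play();
  }
}

// Show exactly one auxiliary element and hide the others via display:none.
function activateAuxiliary(auxElements, activeId) {
  for (const el of auxElements) {
    el.style.display = el.id === activeId ? "" : "none";
  }
}
```

In a page, syncAuxiliary could be called from a "timeupdate" listener on
the main video, and activateAuxiliary from whatever menu selects the
sign language track.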

Now, this covers the case where we have separate audio and video
resources that contain the auxiliary accessibility content. This is

It will also work for the case where all the auxiliary accessibility
content is inside the main audio/video, i.e. in-band:

<video id="main" controls poster="video.png">
  <source src="v_main.ogg">
  <source src="v_main.webm">
  <source src="v_main.mp4">
  <track label="Chapters" kind="chapters" srclang="en" src="v_chapters_en.wsrt">
  <track label="Kapitel" kind="chapters" srclang="de" src="v_chapters_de.wsrt">
</video>

v_main (in each format) would have:
* main video track
* main audio track
* caption text track in en
* caption text track in de
* subtitle text track in en
* subtitle text track in de
* description audio track in en
* description audio track in de
* signing video track in en
* signing video track in de

I do wonder how to do the rendering / save the screen space for the
sign language tracks in this example.

Anyway, it should be possible to expose both this example and the
example above in JavaScript in the same manner, since to the user
they are essentially identical.
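
As a rough illustration of "the same manner", a script could see one
flat list of tracks regardless of where each track comes from. This is
a sketch only: describeTracks, findTracks and the field names kind,
language, label and source are hypothetical and not an existing or
proposed API.

```javascript
// Merge in-band tracks and externally referenced tracks into one list
// with a common shape. "source" only records where the track came from.
function describeTracks(inBandTracks, externalTracks) {
  const normalise = (source) => (t) => ({
    kind: t.kind,
    language: t.language,
    label: t.label || "",
    source: source,
  });
  return [
    ...inBandTracks.map(normalise("in-band")),
    ...externalTracks.map(normalise("external")),
  ];
}

// Look up tracks by kind and language without caring about their origin.
function findTracks(tracks, kind, language) {
  return tracks.filter((t) => t.kind === kind && t.language === language);
}
```

A caller asking for findTracks(all, "signing", "gsg") would get the same
kind of answer whether the signing video is muxed into v_main or lives
in a separate <video> element.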

Now there's only one other major use case that we haven't considered
yet: the case where we have completely alternative resources to the
main audio/video that contain signing/descriptions/captions/subtitles.
So we have, for example:

* v_main.{ogv,mp4,webm}, which has main audio & main video
* v_main_sign.{ogv,mp4,webm}, which has main audio & main video & sign
language video & captions, targeted particularly at hard-of-hearing
(HoH) users
* v_main_desc.{ogv,mp4,webm}, which has main audio & main video & audio
description, targeted particularly at visually impaired (VI) users
* plus all of this in different languages

I think this is also a major use case that we will find, in particular
where the auxiliary accessibility content has been burnt-in.

I almost think that in this case we have to provide completely
alternative <video> elements where only one of them is allowed to be
active. This is where a manifest may come in handy.
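
To illustrate the best-fit (rather than first-fit) selection Maciej
mentions for this case, here is a small sketch that picks one of the
alternative resources from a list of user-preferred affordances. The
function name pickBestFit, the data shape and the scoring weights are
all assumptions made up for the example; a real UA would need a properly
specified algorithm.

```javascript
// Each alternative lists the affordances ("kinds") it carries.
// Score: 10 points per wanted kind present, minus 1 per unwanted kind,
// so a resource with extras the user did not ask for ranks lower.
function pickBestFit(alternatives, wantedKinds, language) {
  let best = null;
  let bestScore = -Infinity;
  for (const alt of alternatives) {
    if (alt.language !== language) continue; // wrong language: skip
    const matched = alt.kinds.filter((k) => wantedKinds.includes(k)).length;
    const unwanted = alt.kinds.length - matched;
    const score = matched * 10 - unwanted;
    if (score > bestScore) {
      best = alt;
      bestScore = score;
    }
  }
  return best; // null when no alternative exists in that language
}

const alternatives = [
  { src: "v_main", language: "en", kinds: [] },
  { src: "v_main_sign", language: "en", kinds: ["signing", "captions"] },
  { src: "v_main_desc", language: "en", kinds: ["descriptions"] },
];
```

With this data, a user who wants descriptions would get v_main_desc, a
user who wants captions would get v_main_sign, and a user with no
preferences would get plain v_main, independent of the order in which
the alternatives are listed.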

Received on Friday, 3 December 2010 20:32:41 UTC
