Re: Meaning of audio track kind 'descriptions' from David Singer on 2011-06-20 (public-html-a11y@w3.org from June 2011)

From: David Singer <singer@apple.com>
Date: Mon, 20 Jun 2011 14:47:12 +0200
To: Mark Watson <watsonm@netflix.com>
Cc: Silvia Pfeiffer <silviapfeiffer1@gmail.com>, Bob Lund <B.Lund@cablelabs.com>, HTML Accessibility Task Force <public-html-a11y@w3.org>
Message-id: <A8AC3460-4E5A-43A4-965E-501D3260754A@apple.com>

I'd like to return to a general question of the client software and the user. It seems we should enable the client software to expose the choices to the user (or match them against user preferences), and enable the client software to enable or disable the right tracks to get that effect -- ideally, without having special-purpose decisions based on the actual adaptation. So if, for example, we introduce a new accessibility adaptation in future, old clients that don't recognize the keyword can still ask the user "do you want/need X?" and still 'do the right thing' with the content (in terms of enable/disable actions).

So, here are two simple ways to achieve this. I am sure there are others. Both of these allow multiple tags in a 'kind' label.

A) In the set of tags for a given track's kind, have either "+" or "-" before each tag. "+X" means 'enable this track if the user wants X' and "-X" means 'disable this track if the user wants X'. Disables override enables; that is, if a label says "+X -Y" and you want both X and Y, you disable the track.
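To make rule (A) concrete, here is a rough sketch in Python. It is only an illustration, not a proposal for how a client must be written. The function name track_enabled_a is made up, "main" is assumed to be something every user wants by default, and a track none of whose tags match is assumed to stay disabled.

```python
def track_enabled_a(kind: str, wants: set[str]) -> bool:
    """Evaluate a scheme-A kind label such as "+main -captions".

    `wants` is the set of things the user wants. This sketch assumes the
    caller always includes "main" in it.
    """
    enable = False
    for tag in kind.split():
        sign, name = tag[0], tag[1:]
        if sign not in ("+", "-"):
            raise ValueError(f"tag {tag!r} needs a leading '+' or '-'")
        if name in wants:
            if sign == "-":
                # Disables override enables, so stop at the first match.
                return False
            enable = True
    # Assumption: a track with no matching tag stays disabled.
    return enable


wants = {"main", "captions"}
print(track_enabled_a("+main -captions", wants))  # False
print(track_enabled_a("+captions", wants))        # True
```

Note that the function never needs to know what "captions" means; it only compares tag names against the user's wants.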

In both cases I think the labelling can be complete enough that the initial state is irrelevant - the labels and algorithm give a clear outcome for every track - but perhaps it could be said that the initial state has "main" tracks enabled and everything else disabled.

[examples below]

B) In the set of tags for a given track, say that the tag set "alternative X" means 'disable the main content of the same media type if you want X, and enable this track' and "X" means just 'enable this track if you want X'.
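Rule (B) can be sketched the same way, again in Python and again only as an illustration. The Track class, its field names, and the function name enabled_tracks_b are invented for this sketch, and it assumes that tracks tagged "main" start out enabled and everything else starts out disabled.

```python
from dataclasses import dataclass

# Accept either spelling of the marker word (an assumption of this sketch).
ALT_WORDS = {"alternate", "alternative"}


@dataclass
class Track:
    name: str   # hypothetical identifier for the track
    media: str  # media type, e.g. "video", "audio", "text"
    kind: str   # scheme-B label, e.g. "main" or "alternative captions"


def enabled_tracks_b(tracks: list[Track], wants: set[str]) -> set[str]:
    """Return the names of the tracks to enable under scheme B."""
    # Assumed initial state: "main" tracks on, everything else off.
    enabled = {t.name for t in tracks if "main" in t.kind.split()}
    for t in tracks:
        tags = set(t.kind.split())
        adaptations = tags - ALT_WORDS - {"main"}
        if not (adaptations & wants):
            continue  # the user wants nothing this track offers
        if tags & ALT_WORDS:
            # Implicit matching over media type: switch off the main
            # content that this track replaces.
            for m in tracks:
                if m.media == t.media and "main" in m.kind.split():
                    enabled.discard(m.name)
        enabled.add(t.name)
    return enabled
```

Unlike (A), this version has to compare media types to find the track being replaced, which is the implicit step that a label in scheme (A) spells out explicitly.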

Examples:

1) text captions as an add-on text track
A) the text track has kind="+captions" (or whatever the word is)
B) the text track has kind="captions"

2) burned-in captions in an alternative video track
A) the main video has kind="+main -captions" and the alternative video with captions has kind="+captions"
B) the main video has kind="main" and the alternative video has kind="alternative captions"

3) audio description as an add-on to the main audio - just like example 1.

4) audio description as a replacement for the main audio - just like example 2.

5) clean audio as an alternative to the main audio - just like example 2.

6) clean audio, where the main audio is delivered in two tracks - the dialog and the background music separately - and the background music is disabled for the user needing clean audio:
A) the dialog track says kind="+main" and the background music track says kind="+main -cleanaudio"
B) ... I don't see how to express this.

7) Repetitive stimulus avoidance as an alternative - just like example 2.

8) Repetitive stimulus avoidance as an overlay (e.g. a black square in front of the flashing light) - just like example 1.

I really don't like the case where you have to recognize that a kind of "Q", when wanted, implicitly means disable the main content of the same media type, whereas "R" doesn't.

Nor am I crazy about this implicit matching over media type - there are people (especially in Asia) who use alpha-coded images (aka a video track) to deliver captions on occasion, for example.

Why is it hard to come up with a simple scheme to enable the client software to get out of the business of being *required* to understand the labels?

Multimedia and Software Standards, Apple Inc.

Received on Monday, 20 June 2011 12:48:37 UTC