Re: Meaning of audio track kind 'descriptions' from Mark Watson on 2011-06-22 (public-html-a11y@w3.org from June 2011)

From: Mark Watson <watsonm@netflix.com>
Date: Tue, 21 Jun 2011 23:55:19 -0700
To: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
CC: David Singer <singer@apple.com>, Bob Lund <B.Lund@cablelabs.com>, HTMLAccessibility Task Force <public-html-a11y@w3.org>
Message-ID: <DD7DCB20-8A35-478E-A53A-7C1848F1A4D7@netflix.com>

On Jun 22, 2011, at 8:31 AM, Silvia Pfeiffer wrote:

On 22/06/2011, at 3:45 PM, Mark Watson <watsonm@netflix.com> wrote:

On Jun 22, 2011, at 2:16 AM, Silvia Pfeiffer wrote:

On Wed, Jun 22, 2011 at 3:32 AM, Mark Watson <watsonm@netflix.com> wrote:

I think this is a matter of opinion and eventually gets resolved through
feedback from real users. I doubt *users* want to continuously adjust the
relative volume of the tracks, so we're talking about automatic client
capabilities, perhaps based on user preference settings, which do the
ducking. But still these have only one degree of freedom (relative volume of
the two tracks), whereas the content creator has many (relative volume of
the many source tracks).

So I think it's equally unlikely that client-side ducking is always better
than professional mixing.

Well, we are not talking about how a video's main audio track is
composed - of course a professional sound editor will create a better
mix of the many input channels that are necessary to be synchronized.

I am talking about that, though.

We are only talking about how a human or computer-created voice that
is spoken over the top of an existing mix stands out in front of that
mix.

I am also considering other kinds of commentary.

This is a simple matter of turning the main audio track
quieter/louder (i.e. ducking). If I as a user cannot discern the
description voice over the top of the main mix, then I turn the main
mix down. Surely that is always better than a fixed mix of the audio
description with the main audio where I am dependent on how well the
person that does the sound mix can hear the voice in front of the main
audio mix.

I agree that having some kind of user control is a good thing. But I'm still not sure it's "always" better, especially when considering other kinds of commentary. Actually, my point is even weaker than this - just that descriptions provided as an alternative to the main audio should not be considered "legacy" and should be provided for in the spec.

The object is to adjust the main audio so that the descriptions are easily audible. The main audio consists of many components of varying importance to the experience of the content. So someone who is re-mixing those original components can adjust their volume according to their importance, perhaps attenuating some aspects more than others, or removing some completely during the time that the description is spoken.

I think you are over-estimating what is done when an audio description or a commentary is produced. I've never seen such a production that mixes more than exactly two sound sources: the main audio and the spoken overlay. Anything else would also infringe on the copyright of the original production, so is very unlikely to happen.

You don't expect the authors of content to ever provide the descriptions themselves? Don't the BBC already do this for some of their content?

The kinds of goals that you mention for emphasizing individual aspects of the original mix are already addressed in the original mix.

But that may change for the case of mixing with descriptions.

But anyway, whilst this argument about mixing is interesting, it's not the main point.

There are clearly two ways in which descriptions could be delivered. We can argue about the relative merits of these two ways, and the HTML a11y WG could even make a recommendation about which is preferable.

But it seems way beyond our scope to say we are *so sure* about the superiority of one approach that the other kind of descriptions can't even be presented to the user in the same way as the "preferred" approach.

If a user asks "why don't the descriptions on this content get enabled according to my preferences, when it works for this other piece of content", it's not an acceptable answer to say "because the engineers in W3C decided not to mark the first kind of descriptions as descriptions".

There may also be many other reasons why people take the different approaches to delivering descriptions. I think if someone goes to the trouble of providing them at all then they should be presented to the user that wants them in a consistent way, that's all.

...Mark

Silvia.

Received on Wednesday, 22 June 2011 06:55:42 UTC