Re: Meaning of audio track kind 'descriptions' from Mark Watson on 2011-06-20 (public-html-a11y@w3.org from June 2011)

From: Mark Watson <watsonm@netflix.com>
Date: Mon, 20 Jun 2011 01:55:04 -0700
To: Bob Lund <B.Lund@cablelabs.com>, Silvia Pfeiffer <silviapfeiffer1@gmail.com>, HTML Accessibility Task Force <public-html-a11y@w3.org>
Message-ID: <17A913E8-BC3F-4661-A178-06DA174CA4CD@netflix.com>
Is it not the case that when authoring audio descriptions you might choose to attenuate part of the main audio track to make the descriptions more audible?

That would be a reason other than legacy technical restrictions for creating "alternative" audio descriptions tracks.

Silvia: I didn't understand your comment about not seeing a need to distinguish between alternative and additional tracks: surely I need to know this so that I know whether to enable this track in addition to the main track or instead of the main track.

It seems like we have use cases for both alternative and additional audio descriptions. I think the track should be explicitly marked as descriptions in both cases, so we can apply user preferences, include it in the right menu, etc.

So, how should we distinguish the two cases? We could have two kind values or use data-* as Silvia suggested (Silvia, could you explain how that works?).
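
A minimal sketch of the two-kind-values option, assuming the hypothetical values `descriptions-add` and `descriptions-alt` floated earlier in this thread (neither is defined in any spec). The `AudioTrackInfo` type and `tracksToEnable` function are illustrative names only, and preferring the additional track when both exist is an assumption, not something decided here.

```typescript
// Hypothetical kind values taken from this thread; not part of any spec.
type AudioKind = "main" | "descriptions-add" | "descriptions-alt";

// Illustrative stand-in for whatever track object the client exposes.
interface AudioTrackInfo {
  id: string;
  kind: AudioKind;
}

// Returns the ids of the tracks to enable for a given user preference.
function tracksToEnable(tracks: AudioTrackInfo[], wantDescriptions: boolean): string[] {
  const mainIds = tracks.filter(t => t.kind === "main").map(t => t.id);
  if (!wantDescriptions) {
    return mainIds;
  }

  // Additional descriptions play together with the main audio.
  // Preferring this over the pre-mixed track is an assumption.
  const additional = tracks.find(t => t.kind === "descriptions-add");
  if (additional) {
    return [...mainIds, additional.id];
  }

  // Alternative (pre-mixed) descriptions replace the main audio.
  const alternative = tracks.find(t => t.kind === "descriptions-alt");
  if (alternative) {
    return [alternative.id];
  }

  // No descriptions available: fall back to the main audio.
  return mainIds;
}
```

Either way the client can tell from the kind alone that the track carries descriptions, so it can still apply user preferences and list it in the right menu.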

...Mark



On Jun 17, 2011, at 5:04 PM, Bob Lund wrote:

> 
> 
> 
>> -----Original Message-----
>> From: Silvia Pfeiffer [mailto:silviapfeiffer1@gmail.com]
>> Sent: Thursday, June 16, 2011 5:18 PM
>> To: Bob Lund; Silvia Pfeiffer; Mark Watson; HTML Accessibility Task
>> Force
>> Subject: Re: Meaning of audio track kind 'descriptions'
>> 
>> I agree with the way that Janina describes the situation and the future.
>> 
>> I also understand Bob's current situation of having to deal with set-top
>> boxes and TVs.
> 
> It's a fair point that legacy set-top boxes and TVs might not be the primary target for future browser-based clients. The important underlying issue, from what I hear, is that content owners would prefer not to re-author content, already being delivered to legacy devices, for browser-based clients.
> 
>> 
>> Bob, I further wonder: when you get the described audio as mixed-in, do
>> you get it as a separate audio track or is it actually a completely
>> different video file? So do you deal with one video file (with original
>> video track + original audio track) and one audio file (with mixed audio
>> + descriptions) or do you deal with two video files (one with the
>> original audio track and one with the mixed one)?
> 
> The video in question is delivered as MPEG-2 multi-program transport streams, rather than file based. The MPEG-2 MPTS has multiple programs, where each might have 1 video stream, 1 main dialogue audio track and a secondary audio track consisting of the main dialogue + audio description. MPEG-2 TS can also carry MPEG-4 elementary media streams. This same multiplexing structure (1 program with multiple audio tracks) can also be replicated in the MPEG-4 base file format and used with adaptive delivery.
> 
> Bob
> 
>> 
>> Silvia.
>> 
>> On Fri, Jun 17, 2011 at 5:15 AM, Janina Sajka <janina@rednote.net>
>> wrote:
>>> Bob is correct, imho. But please note the reason human narrated audio
>>> description is pre-mixed with the audio of the primary resource and
>>> delivered in a single channel is the historical technology that first
>>> brought such content to market. You just couldn't do anything else in
>>> analog TV, where only the SAP channel was available for this content,
>>> because the home premises equipment wasn't designed to do audio
>> mixing.
>>> 
>>> This history also condemned the described video to mono playback.
>>> 
>>> While this model will predominate early on, I'm by no means convinced
>>> it describes the future. To start with, it would sure be nice to
>>> have stereo sound. Oh, and it would be nice to be able to
>>> independently adjust the volume of the video descriptions, and even to
>>> direct them at specific audio devices while audio from the primary
>>> resource is played through different audio devices.
>>> 
>>> 
>>> Given that many people with disabilities have multiple disabilities
>>> and hearing loss together with blindness is by no means uncommon,
>>> separating the description track from the primary audio has
>>> advantages. And, now we have a technology platform that can deliver
>>> it, unlike the 1950's SAP specifications.
>>> 
>>> Janina
>>> 
>>> Bob Lund writes:
>>>> The use case today in cable is descriptive video service where the
>> description and main dialogue are pre-mixed and delivered as a single
>> channel. The single channel is preferred because the majority of video
>> receivers (set-top-boxes and TVs) do not have the capability to mix
>> audio. There are emerging regulatory requirements to provide descriptive
>> video service (audio descriptions) so in the short term the single
>> channel approach might be more prevalent. Related to this is a desire by
>> content owners to not have to re-author content to deliver it on the
>> Web.
>>>> 
>>>> If pre-mixed description and dialogue is identified as @kind =
>> alternative then there will need to be a way to distinguish it from
>> other types of alternative audio. @label could be used but then it will
>> be desirable to have some definition of @label string semantics so
>> clients know how to interpret them.
>>>> 
>>>> Bob
>>>> 
>>>>> -----Original Message-----
>>>>> From: Silvia Pfeiffer [mailto:silviapfeiffer1@gmail.com]
>>>>> Sent: Wednesday, June 15, 2011 10:56 PM
>>>>> To: Mark Watson
>>>>> Cc: Bob Lund; HTML Accessibility Task Force
>>>>> Subject: Re: Meaning of audio track kind 'descriptions'
>>>>> 
>>>>> Note that I haven't yet seen a use case that absolutely requires us
>>>>> to know if a track is additional or alternative. If we do, we can
>>>>> always use a data-* attribute for this right now. If we see the
>>>>> data-* attribute being required to solve use cases, then we can ask
>>>>> for the introduction of an additional marker.
>>>>> 
>>>>> Bob: what was your use case?
>>>>> 
>>>>> Cheers,
>>>>> Silvia.
>>>>> 
>>>>> On Thu, Jun 16, 2011 at 2:53 PM, Silvia Pfeiffer
>>>>> <silviapfeiffer1@gmail.com> wrote:
>>>>>> On Thu, Jun 16, 2011 at 1:07 PM, Mark Watson
>>>>>> <watsonm@netflix.com>
>>>>> wrote:
>>>>>>> I had a different understanding.
>>>>>>> 
>>>>>>> We keep coming back to these cases where we can imagine both
>>>>> "alternative" and "additional" tracks as solutions to some problem.
>>>>>>> 
>>>>>>> I've argued at length before that it doesn't work to have a
>>>>>>> blanket
>>>>> mechanism whereby any track can be labeled as either "alternative"
>>>>> or "additional" - and indeed we have no such mechanism: it's
>>>>> implicit in the track kind - you need to understand the kind to
>>>>> know whether it is alternative or additional.
>>>>>>> 
>>>>>>> I actually thought that all our audio kinds were alternatives.
>>>>>>> I'm no
>>>>> expert, but I would guess that it's hard to create a descriptions
>>>>> track which can be freely mixed with the original audio.
>>>>>> 
>>>>>> 
>>>>>> I've done so before. It's not hard at all. You listen to the
>>>>>> original track and you speak into the microphone. It is easier to
>>>>>> record it in this way because the quality of the original audio
>>>>>> doesn't degrade. It is also the way in which for example the
>> jwplayer works:
>>>>>> http://www.longtailvideo.com/support/addons/audio-description/15136/audio-description-reference-guide
>>>>>> .
>>>>>> 
>>>>>> It would be bad if you have to mix in the original audio because
>>>>>> that degrades the quality of that track, increases the
>>>>>> required bandwidth (because compressed silence is smaller than
>>>>>> compressed sound), requires re-recording the original content
>>>>>> (which might end up in copyright trouble), and requires switching
>>>>>> between tracks rather than just adding and removing a track.
>>>>>> Switching between tracks will be a lot more perceptible than
>> adding/removing a second track.
>>>>>> 
>>>>>> So, I can only see advantages to having an audio description
>>>>>> provided as a separate track.
>>>>>> 
>>>>>> 
>>>>>>> If both kinds exist (alternative descriptions and additive
>>>>> descriptions), then we need two kind values. Given that it's an
>>>>> accessibility requirement it would be nice for it to be explicit,
>>>>> so I would expect to have two "descriptions" kinds e.g.
>>>>> descriptions-add and descriptions-alt.
>>>>>> 
>>>>>> I've only ever seen audio descriptions that come as separate
>> tracks.
>>>>>> In the TV case you would have had to mix it for transmission
>>>>>> because there was only one channel available for transmission,
>>>>>> but I believe that is the artificial case. The more natural case
>>>>>> is to have them separate.
>>>>>> 
>>>>>> 
>>>>>> Cheers,
>>>>>> Silvia.
>>>>>> 
>>> 
>>> --
>>> 
>>> Janina Sajka,   Phone:  +1.443.300.2200
>>>                sip:janina@asterisk.rednote.net
>>> 
>>> Chair, Open Accessibility       janina@a11y.org Linux
>> Foundation
>>> http://a11y.org
>>> 
>>> Chair, Protocols & Formats
>>> Web Accessibility Initiative    http://www.w3.org/wai/pf World Wide
>>> Web Consortium (W3C)
>>> 
>>> 
>
Received on Monday, 20 June 2011 08:55:32 UTC