RE: Meaning of audio track kind 'descriptions'

> -----Original Message-----
> From: Silvia Pfeiffer [mailto:silviapfeiffer1@gmail.com]
> Sent: Thursday, June 16, 2011 5:18 PM
> To: Bob Lund; Silvia Pfeiffer; Mark Watson; HTML Accessibility Task
> Force
> Subject: Re: Meaning of audio track kind 'descriptions'
> 
> I agree with the way that Janina describes the situation and the future.
> 
> I also understand Bob's current situation of having to deal with set-top
> boxes and TVs.

It's a fair point that legacy set-top boxes and TVs might not be the primary target for future browser-based clients. The important underlying issue, from what I hear, is that content owners would prefer not to re-author content that is already being delivered to legacy devices just to serve browser-based clients.

> 
> Bob, I further wonder: when you get the described audio as mixed-in, do
> you get it as a separate audio track or is it actually a completely
> different video file? So do you deal with one video file (with original
> video track + original audio track) and one audio file (with mixed audio
> + descriptions) or do you deal with two video files (one with the
> original audio track and one with the mixed one)?

The video in question is delivered as MPEG-2 multi-program transport streams rather than as files. An MPEG-2 MPTS carries multiple programs, where each might have one video stream, one main-dialogue audio track, and a secondary audio track consisting of the main dialogue plus audio description. MPEG-2 TS can also carry MPEG-4 elementary media streams, and the same multiplexing structure (one program with multiple audio tracks) can be replicated in the MPEG-4 base file format and used with adaptive delivery.
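For a browser-based client, the natural way to see such a multiplex would be through the media element's audioTracks list. Below is a minimal, purely illustrative sketch: the file name, and the assumption that the user agent maps the secondary (description) track to the audio track kind "descriptions" and the dialogue track to "main", are mine, not something established in this thread.

  <video id="v" src="program.mp4" controls></video>
  <script>
    var video = document.getElementById("v");

    // In-band tracks are known once the container metadata has been parsed.
    video.addEventListener("loadedmetadata", function () {
      for (var i = 0; i < video.audioTracks.length; i++) {
        var track = video.audioTracks[i];
        // kind, label and language are reflected from the container metadata.
        console.log(track.kind, track.label, track.language);

        // If the user has asked for described video, enable the described
        // track and disable the main dialogue track (assumed kind values).
        if (track.kind === "descriptions") {
          track.enabled = true;
        } else if (track.kind === "main") {
          track.enabled = false;
        }
      }
    });
  </script>

If the described audio is instead delivered pre-mixed as a complete second track, the enumeration above is unchanged; the open question in this thread is which @kind value that pre-mixed track should carry so clients can tell it apart from other alternative audio.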

Bob

> 
> Silvia.
> 
> On Fri, Jun 17, 2011 at 5:15 AM, Janina Sajka <janina@rednote.net>
> wrote:
> > Bob is correct, imho. But please note that the reason human-narrated audio
> > description is pre-mixed with the audio of the primary resource and
> > delivered in a single channel is the historical technology that first
> > brought such content to market. You just couldn't do anything else in
> > analog TV, where only the SAP channel was available for this content,
> > because the home premises equipment wasn't designed to do audio
> > mixing.
> >
> > This history also condemned the described video to mono playback.
> >
> > While this model will predominate early on, I'm by no means convinced
> > it describes the future. To start with, it would sure be nice to
> > have stereo sound. Oh, and it would be nice to be able to
> > independently adjust the volume of the video descriptions, and even to
> > direct them at specific audio devices while audio from the primary
> > resource is played through different audio devices.
> >
> >
> > Given that many people with disabilities have multiple disabilities
> > and hearing loss together with blindness is by no means uncommon,
> > separating the description track from the primary audio has
> > advantages. And, now we have a technology platform that can deliver
> > it, unlike the 1950's SAP specifications.
> >
> > Janina
> >
> > Bob Lund writes:
> >> The use case today in cable is descriptive video service, where the
> >> description and main dialogue are pre-mixed and delivered as a single
> >> channel. The single channel is preferred because the majority of video
> >> receivers (set-top boxes and TVs) do not have the capability to mix
> >> audio. There are emerging regulatory requirements to provide descriptive
> >> video service (audio descriptions), so in the short term the single
> >> channel approach might be more prevalent. Related to this is a desire by
> >> content owners to not have to re-author content to deliver it on the
> >> Web.
> >>
> >> If pre-mixed description and dialogue is identified as @kind =
> >> alternative, then there will need to be a way to distinguish it from
> >> other types of alternative audio. @label could be used, but then it will
> >> be desirable to have some definition of @label string semantics so that
> >> clients know how to interpret them.
> >>
> >> Bob
> >>
> >> > -----Original Message-----
> >> > From: Silvia Pfeiffer [mailto:silviapfeiffer1@gmail.com]
> >> > Sent: Wednesday, June 15, 2011 10:56 PM
> >> > To: Mark Watson
> >> > Cc: Bob Lund; HTML Accessibility Task Force
> >> > Subject: Re: Meaning of audio track kind 'descriptions'
> >> >
> >> > Note that I haven't yet seen a use case that absolutely requires us
> >> > to know if a track is additional or alternative. If we do, we can
> >> > always use a data-* attribute for this right now. If we see the
> >> > data-* attribute being required to solve use cases, then we can ask
> >> > for the introduction of an additional marker.
> >> >
> >> > Bob: what was your use case?
> >> >
> >> > Cheers,
> >> > Silvia.
> >> >
> >> > On Thu, Jun 16, 2011 at 2:53 PM, Silvia Pfeiffer
> >> > <silviapfeiffer1@gmail.com> wrote:
> >> > > On Thu, Jun 16, 2011 at 1:07 PM, Mark Watson
> >> > > <watsonm@netflix.com> wrote:
> >> > >> I had a different understanding.
> >> > >>
> >> > >> We keep coming back to these cases where we can imagine both
> >> > "alternative" and "additional" tracks as solutions to some problem.
> >> > >>
> >> > >> I've argued at length before that it doesn't work to have a blanket
> >> > >> mechanism whereby any track can be labeled as either "alternative"
> >> > >> or "additional" - and indeed we have no such mechanism: it's
> >> > >> implicit in the track kind - you need to understand the kind to
> >> > >> know whether it is alternative or additional.
> >> > >>
> >> > >> I actually thought that all our audio kinds were alternatives. I'm no
> >> > >> expert, but I would guess that it's hard to create a descriptions
> >> > >> track which can be freely mixed with the original audio.
> >> > >
> >> > >
> >> > > I've done so before. It's not hard at all. You listen to the
> >> > > original track and you speak into the microphone. It is easier to
> >> > > record it in this way because the quality of the original audio
> >> > > doesn't degrade. It is also the way in which, for example, the
> >> > > jwplayer works:
> >> > > http://www.longtailvideo.com/support/addons/audio-description/15136/audio-description-reference-guide
> >> > > .
> >> > >
> >> > > It would be bad if you had to mix in the original audio because
> >> > > that degrades the quality of that track, increases the
> >> > > required bandwidth (because compressed silence is smaller than
> >> > > compressed sound), requires re-recording the original content
> >> > > (which might end up in copyright trouble), and requires switching
> >> > > between tracks rather than just adding and removing a track.
> >> > > Switching between tracks will be a lot more perceptible than
> >> > > adding/removing a second track.
> >> > >
> >> > > So, I can only see advantages to having an audio description
> >> > > provided as a separate track.
> >> > >
> >> > >
> >> > >> If both kinds exist (alternative descriptions and additive
> >> > >> descriptions), then we need two kind values. Given that it's an
> >> > >> accessibility requirement, it would be nice for it to be explicit,
> >> > >> so I would expect to have two "descriptions" kinds, e.g.
> >> > >> descriptions-add and descriptions-alt.
> >> > >
> >> > > I've only ever seen audio descriptions that come as separate
> >> > > tracks. In the TV case you would have had to mix them in for
> >> > > transmission because there was only one channel available,
> >> > > but I believe that is the artificial case. The more natural case
> >> > > is to have them separate.
> >> > >
> >> > >
> >> > > Cheers,
> >> > > Silvia.
> >> > >
> >
> > --
> >
> > Janina Sajka,   Phone:  +1.443.300.2200
> >                sip:janina@asterisk.rednote.net
> >
> > Chair, Open Accessibility       janina@a11y.org
> > Linux Foundation                http://a11y.org
> >
> > Chair, Protocols & Formats
> > Web Accessibility Initiative    http://www.w3.org/wai/pf
> > World Wide Web Consortium (W3C)
> >
> >

Received on Friday, 17 June 2011 15:05:10 UTC