- From: Bob Lund <B.Lund@CableLabs.com>
- Date: Fri, 17 Jun 2011 09:04:40 -0600
- To: Silvia Pfeiffer <silviapfeiffer1@gmail.com>, Mark Watson <watsonm@netflix.com>, HTML Accessibility Task Force <public-html-a11y@w3.org>
> -----Original Message-----
> From: Silvia Pfeiffer [mailto:silviapfeiffer1@gmail.com]
> Sent: Thursday, June 16, 2011 5:18 PM
> To: Bob Lund; Silvia Pfeiffer; Mark Watson; HTML Accessibility Task
> Force
> Subject: Re: Meaning of audio track kind 'descriptions'
>
> I agree with the way that Janina describes the situation and the future.
>
> I also understand Bob's current situation of having to deal with set-top
> boxes and TVs.

It's a fair point that legacy set-top boxes and TVs might not be the
primary target for future browser-based clients. The important
underlying issue, from what I hear, is that content owners would prefer
not to re-author content that is already being delivered to legacy
devices just to serve browser-based clients.

> Bob, I further wonder: when you get the described audio as mixed-in, do
> you get it as a separate audio track or is it actually a completely
> different video file? So do you deal with one video file (with original
> video track + original audio track) and one audio file (with mixed audio
> + descriptions) or do you deal with two video files (one with the
> original audio track and one with the mixed one)?

The video in question is delivered as MPEG-2 multi-program transport
streams (MPTS) rather than as files. An MPEG-2 MPTS carries multiple
programs, where each might have 1 video stream, 1 main dialogue audio
track and a secondary audio track consisting of the main dialogue +
audio description. MPEG-2 TS can also carry MPEG-4 elementary media
streams. This same multiplexing structure (1 program with multiple audio
tracks) can also be replicated in the MPEG-4 base file format and used
with adaptive delivery.

Bob

>
> Silvia.
>
> On Fri, Jun 17, 2011 at 5:15 AM, Janina Sajka <janina@rednote.net>
> wrote:
> > Bob is correct, imho.
> > But please note the reason human narrated audio
> > description is pre-mixed with the audio of the primary resource and
> > delivered in a single channel is the historical technology that first
> > brought such content to market. You just couldn't do anything else in
> > analog TV, where only the SAP channel was available for this content,
> > because the home premises equipment wasn't designed to do audio
> > mixing.
> >
> > This history also condemned the described video to mono playback.
> >
> > While this model will predominate early on, I'm by no means convinced
> > it describes the future. To start with, it would sure be nice to
> > have stereo sound. Oh, and it would be nice to be able to
> > independently adjust the volume of the video descriptions, and even to
> > direct them at specific audio devices while audio from the primary
> > resource is played through different audio devices.
> >
> >
> > Given that many people with disabilities have multiple disabilities
> > and hearing loss together with blindness is by no means uncommon,
> > separating the description track from the primary audio has
> > advantages. And, now we have a technology platform that can deliver
> > it, unlike the 1950's SAP specifications.
> >
> > Janina
> >
> > Bob Lund writes:
> >> The use case today in cable is descriptive video service where the
> >> description and main dialogue are pre-mixed and delivered as a single
> >> channel. The single channel is preferred because the majority of video
> >> receivers (set-top-boxes and TVs) do not have the capability to mix
> >> audio. There are emerging regulatory requirements to provide
> >> descriptive video service (audio descriptions) so in the short term
> >> the single channel approach might be more prevalent. Related to this
> >> is a desire by content owners to not have to re-author content to
> >> deliver it on the Web.
> >>
> >> If pre-mixed description and dialogue is identified as @kind =
> >> alternative then there will need to be a way to distinguish it from
> >> other types of alternative audio. @label could be used but then it
> >> will be desirable to have some definition of @label string semantics
> >> so clients know how to interpret them.
> >>
> >> Bob
> >>
> >> > -----Original Message-----
> >> > From: Silvia Pfeiffer [mailto:silviapfeiffer1@gmail.com]
> >> > Sent: Wednesday, June 15, 2011 10:56 PM
> >> > To: Mark Watson
> >> > Cc: Bob Lund; HTML Accessibility Task Force
> >> > Subject: Re: Meaning of audio track kind 'descriptions'
> >> >
> >> > Note that I haven't yet seen a use case that absolutely requires us
> >> > to know if a track is additional or alternative. If we do, we can
> >> > always use a data-* attribute for this right now. If we see the
> >> > data-* attribute being required to solve use cases, then we can ask
> >> > for the introduction of an additional marker.
> >> >
> >> > Bob: what was your use case?
> >> >
> >> > Cheers,
> >> > Silvia.
> >> >
> >> > On Thu, Jun 16, 2011 at 2:53 PM, Silvia Pfeiffer
> >> > <silviapfeiffer1@gmail.com> wrote:
> >> > > On Thu, Jun 16, 2011 at 1:07 PM, Mark Watson
> >> > > <watsonm@netflix.com> wrote:
> >> > >> I had a different understanding.
> >> > >>
> >> > >> We keep coming back to these cases where we can imagine both
> >> > >> "alternative" and "additional" tracks as solutions to some
> >> > >> problem.
> >> > >>
> >> > >> I've argued at length before that it doesn't work to have a
> >> > >> blanket mechanism whereby any track can be labeled as either
> >> > >> "alternative" or "additional" - and indeed we have no such
> >> > >> mechanism: it's implicit in the track kind - you need to
> >> > >> understand the kind to know whether it is alternative or
> >> > >> additional.
> >> > >>
> >> > >> I actually thought that all our audio kinds were alternatives.
> >> > >> I'm no expert, but I would guess that it's hard to create a
> >> > >> descriptions track which can be freely mixed with the original
> >> > >> audio.
> >> > >
> >> > >
> >> > > I've done so before. It's not hard at all. You listen to the
> >> > > original track and you speak into the microphone. It is easier to
> >> > > record it in this way because the quality of the original audio
> >> > > doesn't degrade. It is also the way in which for example the
> >> > > jwplayer works:
> >> > > http://www.longtailvideo.com/support/addons/audio-description/15136/audio-description-reference-guide
> >> > > .
> >> > >
> >> > > It would be bad if you have to mix in the original audio because
> >> > > that both degrades the quality of that track, increases the
> >> > > required bandwidth (because compressed silence is smaller than
> >> > > compressed sound), requires re-recording the original content
> >> > > (which might end up in copyright trouble), and requires switching
> >> > > between tracks rather than just adding and removing a track.
> >> > > Switching between tracks will be a lot more perceptible than
> >> > > adding/removing a second track.
> >> > >
> >> > > So, I can only see advantages to having an audio description
> >> > > provided as a separate track.
> >> > >
> >> > >
> >> > >> If both kinds exist (alternative descriptions and additive
> >> > >> descriptions), then we need two kind values. Given that it's an
> >> > >> accessibility requirement it would be nice for it to be explicit,
> >> > >> so I would expect to have two "descriptions" kinds e.g.
> >> > >> descriptions-add and descriptions-alt.
> >> > >
> >> > > I've only ever seen audio descriptions that come as separate
> >> > > tracks. In the TV case you would have had to mix it for
> >> > > transmission because there was only one channel available for
> >> > > transmission, but I believe that is the artificial case. The more
> >> > > natural case is to have them separate.
> >> > >
> >> > >
> >> > > Cheers,
> >> > > Silvia.
> >> > >
> >
> > --
> >
> > Janina Sajka, Phone: +1.443.300.2200
> > sip:janina@asterisk.rednote.net
> >
> > Chair, Open Accessibility janina@a11y.org
> > Linux Foundation http://a11y.org
> >
> > Chair, Protocols & Formats
> > Web Accessibility Initiative http://www.w3.org/wai/pf
> > World Wide Web Consortium (W3C)
> >
Received on Friday, 17 June 2011 15:05:10 UTC