- From: David Singer <singer@apple.com>
- Date: Mon, 20 Jun 2011 12:34:17 +0200
- To: Mark Watson <watsonm@netflix.com>
- Cc: Bob Lund <B.Lund@cablelabs.com>, Silvia Pfeiffer <silviapfeiffer1@gmail.com>, HTML Accessibility Task Force <public-html-a11y@w3.org>
On Jun 20, 2011, at 10:55 , Mark Watson wrote:

> Is it not the case that when authoring audio descriptions you might want to make decisions to attenuate part of the main audio track to make the descriptions more audible?

Yes. I think both cases arise:
a) the main audio has enough gaps in it that I can overlay the descriptions audio track on it without changing it;
b) the main audio has to be 'doctored' (level adjustments, maybe shifted around a bit) to leave room for the descriptions.

(a) is covered by an additional audio track. (b) by a replacement.

>
> That would be a reason other than legacy technical restrictions for creating "alternative" audio descriptions tracks.
>
> Silvia: I didn't understand your comment of not seeing a need to distinguish between alternative and additional tracks: surely I need to know this so that I know whether to enable this track in addition to the main track or instead of the main track.
>
> It seems like we have use-cases for both alternative and additional audio descriptions. I think the track should be explicitly marked as descriptions in both cases, so we can apply user preferences, include it in the right menu etc.
>
> So, how should we distinguish the two cases? We could have two kind values or use data-* as Silvia suggested (Silvia, could you explain how that works?).
>
> ...Mark
>
> On Jun 17, 2011, at 5:04 PM, Bob Lund wrote:
>
>>> -----Original Message-----
>>> From: Silvia Pfeiffer [mailto:silviapfeiffer1@gmail.com]
>>> Sent: Thursday, June 16, 2011 5:18 PM
>>> To: Bob Lund; Silvia Pfeiffer; Mark Watson; HTML Accessibility Task Force
>>> Subject: Re: Meaning of audio track kind 'descriptions'
>>>
>>> I agree with the way that Janina describes the situation and the future.
>>>
>>> I also understand Bob's current situation of having to deal with set-top boxes and TVs.
>>
>> It's a fair point that legacy set-top boxes and TVs might not be the primary target for future browser-based clients.
>> The important underlying issue, from what I hear, is that content owners would prefer not to re-author content, already being delivered to legacy devices, for browser-based clients.
>>
>>> Bob, I further wonder: when you get the described audio as mixed-in, do you get it as a separate audio track or is it actually a completely different video file? So do you deal with one video file (with original video track + original audio track) and one audio file (with mixed audio + descriptions) or do you deal with two video files (one with the original audio track and one with the mixed one)?
>>
>> The video in question is delivered as MPEG-2 multi-program transport streams, rather than file based. The MPEG-2 MPTS has multiple programs, where each might have 1 video stream, 1 main dialogue audio track and a secondary audio track consisting of the main dialogue + audio description. MPEG-2 TS can also carry MPEG-4 elementary media streams. This same multiplexing structure (1 program with multiple audio tracks) can also be replicated in the MPEG-4 base file format and used with adaptive delivery.
>>
>> Bob
>>
>>> Silvia.
>>>
>>> On Fri, Jun 17, 2011 at 5:15 AM, Janina Sajka <janina@rednote.net> wrote:
>>>> Bob is correct, imho. But please note the reason human narrated audio description is pre-mixed with the audio of the primary resource and delivered in a single channel is the historical technology that first brought such content to market. You just couldn't do anything else in analog TV, where only the SAP channel was available for this content, because the home premises equipment wasn't designed to do audio mixing.
>>>>
>>>> This history also condemned the described video to mono playback.
>>>>
>>>> While this model will predominate early on, I'm by no means convinced it describes the future. To start with, it would sure be nice to have stereo sound.
>>>> Oh, and it would be nice to be able to independently adjust the volume of the video descriptions, and even to direct them at specific audio devices while audio from the primary resource is played through different audio devices.
>>>>
>>>> Given that many people with disabilities have multiple disabilities and hearing loss together with blindness is by no means uncommon, separating the description track from the primary audio has advantages. And, now we have a technology platform that can deliver it, unlike the 1950's SAP specifications.
>>>>
>>>> Janina
>>>>
>>>> Bob Lund writes:
>>>>> The use case today in cable is descriptive video service where the description and main dialogue are pre-mixed and delivered as a single channel. The single channel is preferred because the majority of video receivers (set-top-boxes and TVs) do not have the capability to mix audio. There are emerging regulatory requirements to provide descriptive video service (audio descriptions) so in the short term the single channel approach might be more prevalent. Related to this is a desire by content owners to not have to re-author content to deliver it on the Web.
>>>>>
>>>>> If pre-mixed description and dialogue is identified as @kind = alternative then there will need to be a way to distinguish it from other types of alternative audio. @label could be used but then it will be desirable to have some definition of @label string semantics so clients know how to interpret them.
>>>>>
>>>>> Bob
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Silvia Pfeiffer [mailto:silviapfeiffer1@gmail.com]
>>>>>> Sent: Wednesday, June 15, 2011 10:56 PM
>>>>>> To: Mark Watson
>>>>>> Cc: Bob Lund; HTML Accessibility Task Force
>>>>>> Subject: Re: Meaning of audio track kind 'descriptions'
>>>>>>
>>>>>> Note that I haven't yet seen a use case that absolutely requires us to know if a track is additional or alternative. If we do, we can always use a data-* attribute for this right now. If we see the data-* attribute being required to solve use cases, then we can ask for the introduction of an additional marker.
>>>>>>
>>>>>> Bob: what was your use case?
>>>>>>
>>>>>> Cheers,
>>>>>> Silvia.
>>>>>>
>>>>>> On Thu, Jun 16, 2011 at 2:53 PM, Silvia Pfeiffer <silviapfeiffer1@gmail.com> wrote:
>>>>>>> On Thu, Jun 16, 2011 at 1:07 PM, Mark Watson <watsonm@netflix.com> wrote:
>>>>>>>> I had a different understanding.
>>>>>>>>
>>>>>>>> We keep coming back to these cases where we can imagine both "alternative" and "additional" tracks as solutions to some problem.
>>>>>>>>
>>>>>>>> I've argued at length before that it doesn't work to have a blanket mechanism whereby any track can be labeled as either "alternative" or "additional" - and indeed we have no such mechanism: it's implicit in the track kind - you need to understand the kind to know whether it is alternative or additional.
>>>>>>>>
>>>>>>>> I actually thought that all our audio kinds were alternatives. I'm no expert, but I would guess that it's hard to create a descriptions track which can be freely mixed with the original audio.
>>>>>>>
>>>>>>> I've done so before. It's not hard at all. You listen to the original track and you speak into the microphone. It is easier to record it in this way because the quality of the original audio doesn't degrade.
>>>>>>> It is also the way in which for example the jwplayer works: http://www.longtailvideo.com/support/addons/audio-description/15136/audio-description-reference-guide .
>>>>>>>
>>>>>>> It would be bad if you have to mix in the original audio because that degrades the quality of that track, increases the required bandwidth (because compressed silence is smaller than compressed sound), requires re-recording the original content (which might end up in copyright trouble), and requires switching between tracks rather than just adding and removing a track. Switching between tracks will be a lot more perceptible than adding/removing a second track.
>>>>>>>
>>>>>>> So, I can only see advantages to having an audio description provided as a separate track.
>>>>>>>
>>>>>>>> If both kinds exist (alternative descriptions and additive descriptions), then we need two kind values. Given that it's an accessibility requirement it would be nice for it to be explicit, so I would expect to have two "descriptions" kinds e.g. descriptions-add and descriptions-alt.
>>>>>>>
>>>>>>> I've only ever seen audio descriptions that come as separate tracks. In the TV case you would have had to mix it for transmission because there was only one channel available for transmission, but I believe that is the artificial case. The more natural case is to have them separate.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Silvia.
>>>>>>>
>>>>
>>>> --
>>>>
>>>> Janina Sajka, Phone: +1.443.300.2200
>>>> sip:janina@asterisk.rednote.net
>>>>
>>>> Chair, Open Accessibility janina@a11y.org
>>>> Linux Foundation http://a11y.org
>>>>
>>>> Chair, Protocols & Formats
>>>> Web Accessibility Initiative http://www.w3.org/wai/pf
>>>> World Wide Web Consortium (W3C)

David Singer
Multimedia and Software Standards, Apple Inc.
Received on Monday, 20 June 2011 10:34:47 UTC