- From: Nigel Megitt <nigel.megitt@bbc.co.uk>
- Date: Thu, 26 Sep 2024 18:35:14 +0000
- To: "public-tt@w3.org" <public-tt@w3.org>, "public-audio-description@w3.org" <public-audio-description@w3.org>
- Message-ID: <EF9A9927-C634-4D7E-B455-283B52008672@bbc.co.uk>
Thanks all for attending today’s joint meeting of Audio Description Community Group and Timed Text Working Group. Minutes can be found in HTML format at https://www.w3.org/2024/09/26-tt-minutes.html

In plain text:

[1]W3C

[1] https://www.w3.org/

Audio Description Community Group, Timed Text Working Group Joint Meeting

26 September 2024

[2]Agenda. [3]IRC log.

[2] https://www.w3.org/wiki/TimedText/tpac2024#Thursday_26th_September
[3] https://www.w3.org/2024/09/26-tt-irc

Attendees

Present
Adam_Page, atsushi, bernd, Eric_Carlson, Färber, Hiroshi_Ohta, jcraig, Jer_Noble, Nigel_Megitt, Nikolaus Färber, ray-schwartz9, rwarren2, Wolfgang Schildbach (observing)

Regrets
-

Chair
Nigel

Scribe
nigel, jcraig

Contents

1. [4]Introductions
2. [5]Agenda
3. [6]DAPT
4. [7]Meeting Close

Meeting minutes

Introductions

Nigel: Nigel Megitt, BBC, chair ADCG, co-chair TTWG, one of the Editors of DAPT
ray-schwartz9: Ray Schwartz: he/him NFCU, member of ARIA
gabriel: eng on MS Edge, part of Web Audio
atsushi: W3C contact TTWG
<nigel> s/??/atsushi
niko: Nikolaus Färber, Media and Entertainment Interest Group
bernd: member of Media and Entertainment Interest Group, and WICG
<jcraig> s/??/niko/
jcraig: James Craig, Apple Accessibility, member of TTWG, interested in audio descriptions, most active in ARIA
Adam_Page: Hilton Accessibility ARIA WG
cyril: Netflix, TTWG Co-editor
<atsushi> Hiroshi_Ohta: from LINE Yahoo Corp.
reinaldoferraz: Reinaldo Ferraz, W3C chapter Sao Paulo, observer.
sprang: Google Meet, Observer
<reinaldoferraz> NIC.br
wschi: Weiwei Xu, Huawei, Media Standard Department
nigel: intros ahead of schedule
… any other agenda topics?

Agenda

DAPT profiles in TTML
Authoring and production workflows for Loc and AD

no other agenda items

cyril: have others deployed AD in recent deployments?
… MPEG-H, coding audio descriptions, etc
wschi: used in broadcast, and to mix them in encode/decode; mixing not supported in browser yet
cyril: used in VOD
Robert Warren: niko agriculture, interest in the humanities
jcraig: To Cyril's question, Apple has a number of different audio description features
… as a streaming service provider with Apple TV+, I have no part of it but very proud of the work
… we do with AD and captions. Most of the Apple Original content has 9 AD languages and 40 caption tracks.
… On the product side, there are a number of features related to AD and captions that most people
… are not aware of. For example if you are blind and have a screen reader you can choose to have
… captions Brailled or spoken live (for translation).
… Experimented with something similar for audio descriptions.
… Eric Carlson and I demoed that 2 years ago in TPAC.
… Take AD track type of the web <video> element, parsing it on the fly and either
… speaking it or Brailling it. Silent descriptions sent live to the Braille display.
… Someone who is deafblind could enjoy them widely.
… Not deployed widely, just a tech demo
… Love to get more interest in it, add the Braille support.
… That demo was a custom implementation of WebVTT in the video player
Adam_Page: Hilton deploying more AD to the website
Adam_Page: Another data point. At Hilton, not a big video platform,
… baking the audio track in
jcraig: Second track is standard way to deploy AD now, with a tag saying it's AD, to support
… auto selection
… Technically just a standard audio track
Adam_Page: user chooses the preferred track
… most require extended.
jcraig: One of the things was descriptions longer than the natural gap in the audio, e.g.
… extended descriptions, we demoed auto pause of video in the player when that happened.
… Have not seen a lot of deployments of that.
… I think WGBH has some demos of extended description
jcraig: demo in Vancouver was extended lecture paused to describe a chart
nigel: BBC deploys choice between pre-mixed audio track with AD versus w/o
… also deploying a dry AD track (not mixed with main audio) plus mix data
… DVB-T is widely deployed and supported in the UK
… transport stream is specified in the UK's "d-book"
cyril: UK-specific technology?
nigel: yes, since 2006 or so
<rwarren2> "Widely available for a large amount of money" ;)
nigel: That's the broadcast standard
… online we deploy separate video files like Hilton
… lately starting to deploy Live descriptions... timing is an artform.. the describers research ahead of time
cyril: is the describer local or remote?
nigel: either... third-party service
Adam_Page: English only?
nigel: yes
wschi: thinking of replicating the live AD use case into browsers as well?
nigel: 3-4 yrs ago, demoed TTML2 with live mix instructions in the browser
… could be used for live broadcast, too
… tech demo can mix two audio tracks using mix data (well) and/or less-well mixed with text-to-speech
… (generated speech synthesis)
… could deploy as sADM, or we may deploy as a custom implementation in the BBC player
niko: NGA can include AD, and spatial position to separate Object-based audio
bernd: demo and discussed in the Media and Entertainment Interest Group this past Monday....
eric_carlson: WebKit eng at Apple, inc TT
jernoble: Jer Noble... WebKit Engineer at Apple, and TT
cyril: how widely deployed is AD around the world?
… Are there countries with no AD? distribution?
nigel: BBC audio describes over 20% of our programmes, regulatory requirement is 10%.
… other countries do some percentage
jcraig: smaller percentages
cyril: is AD deployed widely in Japan?
Hiroshi_Ohta: audio subchannels are popular in Japan... for example, background data about baseball players during games....
Not as widely deployed for AD for the Blind
nigel: most recent Olympic Games in Japan included additional data (NHK?) on the subchannel generating AD about the scores
??: not sure of which subchannels are auto-generated or not?
nigel: one development in AD that has been gaining in use is synthesized voices
… there is an advocacy group in UK that has been running user test experiments
… Royal National Institute for the Blind (RNIB)
Nigel: new attendee?
dana: Hi! I'm Dana. I work on WebKit.
jcraig: Deployments in Japan: more common with streaming services.
… Japanese is one of the languages that Apple Original content localises to
… HBO is starting to ramp up as well, and starting to lead the way with signed / PiP / ASL movies,
… being deployed as separate video files because there isn't a way to compose the dry components
… and keep them in sync
nigel: on the tangent of sign interpretation, there is a new regulatory requirement in Spain
… 3cat (Catalonian broadcaster) recently demoed an HbbTV receiver?
… Got the signing stream over IP and recomposited, implementation in WASM
… I think the resolution of the signed video was lower than the main broadcast video
jcraig: resolution does not need to be as high, but high framerate is critical with sign language... easy to lose context with dropped frames

DAPT

nigel: this spec has origins going back a few years
… TTML2 could trigger audio playback, pitch, etc, audio mixing etc
… but in general TTML is a TT format... I tried to do an AD variant, but it did not have as broad uptake
… so DAPT is AD plus mainstream dubbing as use cases
… and other uses
… thinking of production workflows, video will be commissioned and produced... Loc and AD comes later as a second step
… usually need a transcript....
for SDH subtitles or localized translation subtitles
… cyril said these processes are sometimes too removed, and the dubbing plus translation can be mismatched
… trying to convince content producers to move the transcription process earlier in the chain
… a lot of the service providers use proprietary tools
<Zakim> jcraig, you wanted to mention FCC DAC report
jcraig: I can share afterwards - I'm Apple's rep to the FCC disability advisory committee
… and worked on a report for the commission with other people, which is public, I'll share the link later,
… which is effectively guidelines and recommendations for broadcasters and streamers for how
… to do exactly this, and which specific resources should be deployed widely with the original video.
… A lot of the time the contracts for production do not include the accessible alternatives, for subtitles,
… descriptions, translations etc.
… So then when the content goes to cable providers etc., the recommendations talk about this
… particular item, the audio description transcript and ideally timing, as well as the dialogue,
… should as much as possible be considered and distributed by the original distributor, to avoid this mismatch.
… Which avoids rework and mismatch when there's already prior work that's been done.
… Redub with different transcripts etc cause those problems.
nigel: Chris intro?
cpn: Chris Needham: BBC, Chair Media WG
nigel: broadly speaking, DAPT useful as a production tool
… for Timed Text, audio, etc
<ray-schwartz9> Need to head to another meeting. Thanks for letting me sit in!
nigel: mostly upstream of something that would go to the client devices, but DAPT could go directly to the player,
… for braille or TTS, local audio mix, etc.
… including pans, levels, etc
nigel: intro?
youenn: Youenn Fablet, Apple
nigel: doc includes examples to help people understand the use cases
… tracking for translation, the current language and original lang ("pivot languages?") ex: Norwegian to Hebrew....
probably passed through English as a pivot language
… so by tracking through this, you may have a better idea of how to avoid or correct mistranslations
… metadata describes characters (the type portrayed by actors) and other info
… metadata could differentiate visual description vs transliteration of text rendered visually on screen... (time or location chyrons, as an example)
… [scanning through the document]... showing timed text example of AD... along with mixing data
… also can include prerecorded audio
… [showing Gain attribute data]
… result is that it ducks the main program audio while AD mix is played, and re-raises the gain after the "ducking"
<Zakim> jcraig, you wanted to ask about ducking prefs
jcraig: Screen readers often have a setting for ducking audio, not used when there's pre-mixed audio
… Is there more data here than just the gain, like a context, like "this is a ducking transition",
… because that would potentially allow the user preference for ducking.
… Is there semantic information about why the transitions are happening?
cyril: I don't think we thought about that use case, semantic signalling,
… but I see that it could easily be added - TTML is easily extended, either in the spec or with
… proprietary information.
jcraig: Talking about sub-channel audio for a baseball game earlier, some people might want
… to hear that in the same room as others who do not.
… That mixed data could be deployed to a different channel or speaker.
… That type of semantic metadata could also apply.
rwarren2: A friend who has gone blind late in his life: enjoys baseball,
… but now it's not on the radio, there's a change in the announcement style
… It's frustrating because they no longer know what the action is, because the assumption
… is that you can see what's happening.
jcraig: Anecdotally, I have a lot of blind friends into baseball, who would like that.
My assumption is because the positions don't move, and you can build a mental picture based on action that is described well, like it used to be on radio.
nigel: Irish commission researched appropriate ducking levels based on how loud the program audio is, how much to duck, and how loud the AD should be.
[8]Investigating a Standardised Approach to Setting Audio Description Dip Values

[8] https://www.cnam.ie/wp-content/uploads/2023/06/20230619_Investigating-a-Standardised-Approach-to-Setting-Audio-Description-Dip-Values_vFINAL-1.pdf

nigel: so that the background programme sound does not drown out the description
… so "one size" does not "fit all" when it comes to audio ducking
<Zakim> nigel, you wanted to react to jcraig to answer that
nigel: anecdotal data point, visited VRT in Belgium, who would hand-tweak gain to allow un-ducking relevant noise ("door opening") during AD dialog, to improve understandability
wschi: how do you stream the XML?
nigel: could be one big file...
… Or MPEG-DASH, HLS, etc.
<wschi> ST2110-43#
cyril: RTP payload ST2110-43
wschi: could be very high bitrate?
cyril: might be similar to a lower bitrate for voice-only (not full mix)
nigel: which options would we need?
jcraig: saw one anti-pattern with a streamer who deployed Dolby Atmos, but the AD track was flattened, mixed to Stereo
Nigel: was on AD examples... There are also Dubbing examples
cyril: focus on AD to ask for feedback?
nigel: structure includes data model separate from the TTML
… recording or synthesized, with optional mix data
… within the spec, each class or object type is described... no need to have a full understanding of TTML to understand it.
cyril: request feedback on AD... are there use cases not included? identifying gaps, etc?
wschi: very expressive about audio features... are there interactive (user pref) controls about how that mix would work?
jcraig: Games are often very customisable, different sliders for different game sound effects.
… Even tweaks for things that might be considered triggers or scare warnings,
… that level of distinction.
… All custom, but deployed because users are asking for those features.
nigel: implementations... authoring... conversion tools, etc
… expecting more activity in order to meet the goals of the community need
wschi: re: deployment, is NGA not there yet?
nigel: not dependent on the format...
nigel: perhaps URI or fragment id for this?
cyril: I don't think there is a standard in ISOBMFF? to spec a subtrack of a subtrack?
jernoble: for HLS, there are variants, not really tracks...
Nigel: tech discussion should continue into the hallway
cyril: please review and provide feedback
nigel: also discussing related topics tomorrow
… hope to get to CR soon
jcraig: The FCC Disability Advisory Committee (DAC) report on "Audio Description File Transmittal for Internet Protocol Delivered Video Programming" [9]https://www.fcc.gov/ecfs/document/10208388924441/1
… Word/PDF linked from under the Recommendations heading: [10]https://www.fcc.gov/audio-description
… most relevant, the section at the end on "Potential Opportunities in the Audio Description Ecosystem for Participants and the
… Commission" covers recommendations like:
… - Encourage vendors to provide and content creators to request AD scripts with timestamps in addition to the AD audio files.
… - Encourage vendors to deliver these unmixed [AD] audio files to stakeholders.

[9] https://www.fcc.gov/ecfs/document/10208388924441/1
[10] https://www.fcc.gov/audio-description

Meeting Close

nigel: Thank you everyone, very interesting discussion points, we're out of time
[adjourns meeting]

Minutes manually created (not a transcript), formatted by [11]scribe.perl version 229 (Thu Jul 25 08:38:54 2024 UTC).

[11] https://w3c.github.io/scribe2/scribedoc.html
Received on Thursday, 26 September 2024 18:35:30 UTC