{Minutes} Audio Description Community Group, Timed Text Working Group Joint Meeting 2024-09-26

Thanks all for attending today’s joint meeting of Audio Description Community Group and Timed Text Working Group. Minutes can be found in HTML format at https://www.w3.org/2024/09/26-tt-minutes.html


In plain text:

   [1]W3C

      [1] https://www.w3.org/


   Audio Description Community Group, Timed Text Working Group Joint
                                Meeting

26 September 2024

   [2]Agenda. [3]IRC log.

      [2] https://www.w3.org/wiki/TimedText/tpac2024#Thursday_26th_September

      [3] https://www.w3.org/2024/09/26-tt-irc


Attendees

   Present
          Adam_Page, atsushi, bernd, Eric_Carlson, Hiroshi_Ohta,
          jcraig, Jer_Noble, Nigel_Megitt, Nikolaus Färber,
          ray-schwartz9, rwarren2, Wolfgang Schildbach (observing)

   Regrets
          -

   Chair
          Nigel

   Scribe
          nigel, jcraig

Contents

    1. [4]Introductions
    2. [5]Agenda
    3. [6]DAPT
    4. [7]Meeting Close

Meeting minutes

  Introductions

   Nigel: Nigel Megitt, BBC, chair ADCG, co-chair TTWG, one of the
   Editors of DAPT

   ray-schwartz9: Ray Schwartz, he/him, NFCU, member of ARIA

   gabriel: eng on MS Edge, part of Web Audio

   atsushi: w3c contact TTWG

   niko: Nikolaus Färber, Media and Entertainment Interest Group

   bernd: member of Media and Entertainment Interest Group, and
   WICG

   jcraig: James Craig, Apple Accessibility, member of TTWG,
   interested in audio descriptions, most active in ARIA

   Adam_Page: Hilton Accessibility ARIA WG

   cyril: Netflix, TTWG Co-editor

   <atsushi> Hiroshi_Ohta: from LINE Yahoo Corp.

   reinaldoferraz: Reinaldo Ferraz, W3C chapter Sao Paulo,
   observer

   sprang: Google Meet, Observer

   <reinaldoferraz> NIC.br

   wschi: Weiwei Xu, Huawei, Media Standard Department

   nigel: intros ahead of schedule
   … any other agenda topics?

  Agenda

   DAPT profiles in TTML

   Authoring and production workflows for Loc and AD

   no other agenda items

   cyril: have others deployed AD in recent deployments
   … MPEG-H, coding audio descriptions, etc

   wschi: used in broadcast and to mix them in encode/decode

   mixing not supported in browser yet

   cyril: used in VOD

   Robert Warren: niko agriculture, interest in the humanities

   jcraig: To Cyril's question, Apple has a number of different
   audio description features
   … as a streaming service provider with Apple TV+, I have no
   part of it but very proud of the work
   … we do with AD and captions. Most of the Apple Original
   content has 9 AD languages and 40 caption tracks.
   … On the product side, there are a number of features related
   to AD and captions that most people
   … are not aware of. For example if you are blind and have a
   screen reader you can choose to have
   … captions Brailled or spoken live (for translation).
   … Experimented with something similar for audio descriptions.
   … Eric Carlson and I demoed that 2 years ago in TPAC.
   … Take AD track type of the web <video> element, parsing it on
   the fly and either
   … speaking it or Brailling it. Silent descriptions sent live to
   the Braille display.
   … Someone who is deafblind could enjoy them widely.
   … Not deployed widely, just a tech demo
   … Love to get more interest in it, add the Braille support.
   … That demo was a custom implementation of WebVTT in the video
   player

   Adam_Page: Hilton deploying more AD to the website

   Adam_Page: Another data point. At Hilton, not a big video
   platform,
   … baking the audio track in

   jcraig: Second track is standard way to deploy AD now, with a
   tag saying it's AD, to support
   … auto selection
   … Technically just a standard audio track

   Adam_Page: user chooses the preferred track
   … most require extended.

   jcraig: One of the things was descriptions longer than the
   natural gap in the audio, e.g.
   … extended descriptions, we demoed auto pause of video in the
   player when that happened.
   … Have not seen a lot of deployments of that.
   … I think WGBH has some demos of extended description

   jcraig: demo in Vancouver was extended lecture paused to
   describe a chart
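
   The auto-pause behaviour described above for extended descriptions
   can be sketched roughly as follows. This is a minimal illustration
   only, not the demoed implementation: the names Description and
   pause_needed, and the timing fields, are hypothetical, and it
   assumes the gap boundaries and the spoken duration are already
   known.

```python
from dataclasses import dataclass


@dataclass
class Description:
    """One audio description cue (all fields are hypothetical names)."""
    gap_start: float        # seconds: start of the natural gap in programme audio
    gap_end: float          # seconds: end of that gap
    speech_duration: float  # seconds: how long the description takes to speak


def pause_needed(desc: Description) -> float:
    """Return how long the video would need to pause, in seconds.

    Returns 0.0 when the description fits inside the gap, otherwise the
    amount by which the spoken description overruns the gap.
    """
    gap = desc.gap_end - desc.gap_start
    return max(0.0, desc.speech_duration - gap)
```

   A player could call pause_needed for each description and pause
   the video for the returned time once the gap ends, resuming when
   the description has finished.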

   nigel: BBC deploys choice between pre-mixed audio track with AD
   versus w/o
   … also deploying a dry AD track (not mixed with main audio)
   plus mix data
   … DVB-T is widely deployed and supported in the UK
   … transport stream is specified in the UK's "d-book"

   cyril: UK-specific technology?

   nigel: yes, since 2006 or so

   <rwarren2> "Widely available for a large amount of money" ;)

   nigel: That's the broadcast standard
   … online we deploy separate video files like Hilton
   … lately starting to deploy Live descriptions... timing is an
   artform.. the describers research ahead of time

   cyril: is the describer local or remote?

   nigel: either... third-party service

   Adam_Page: English only?

   nigel: yes

   wschi: thinking of replicating the live AD use case into
   browsers as well?

   nigel: 3-4 yrs ago, demoed TTML2 with live mix instructions in
   the browser
   … could be used for live broadcast, too
   … tech demo can mix two audio tracks using mix data (well)
   and/or less-well mixed with text-to-speech .. (generated speech
   synthesis)
   … could deploy as sADM, or we may deploy as a custom
   implementation in the BBC player

   niko: NGA can include AD, and spatial position to separate
   Object-based audio

   bernd: demo and discussed in the Media and Entertainment
   Interest Group this past Monday....

   eric_carlson: WebKit eng at Apple, inc TT

   jernoble: Jer Noble... WebKit Engineer at Apple, and TT

   cyril: how widely deployed is AD around the world?
   … Are there countries with no AD? distribution?

   nigel: BBC audio describes over 20% of our programmes,
   regulatory requirement is 10%.
   … other countries do some percentage

   jcraig: smaller percentages

   cyril: is AD deployed widely in Japan?

   Hiroshi_Ohta: audio subchannels are popular in Japan... for
   example, background data about baseball players during
   games.... Not as widely deployed for AD for the Blind

   nigel: most recent Olympic games in Japan included additional
   data (NHK?) on the subchannel generating AD about the scores

   ??: not sure of which subchannels are auto-generated or not?

   nigel: one development in AD that has been gaining in use is
   synthesized voices
   … there is an advocacy group in UK that has been running user
   test experiments
   … Royal National Institute for the Blind (RNIB)

   Nigel: new attendee?

   dana: Hi!. I'm Dana. I work on WebKit.

   jcraig: Deployments in Japan: more common with streaming
   services.
   … Japanese is one of the languages that Apple Original content
   localises to
   … HBO is starting to ramp up as well, and starting to lead the
   way with signed / PiP / ASL movies,
   … being deployed as separate video files because there isn't a
   way to compose the dry components
   … and keep them in sync

   nigel: on the tangent of sign interpretation, there is a new
   regulatory requirement in Spain
   … 3cat (Catalonian broadcaster) recently demoed an HbbTV
   receiver?
   … Got the signing stream over IP and recomposited,
   implementation in WASM
   … I think the resolution of the signed video was lower than the
   main broadcast video

   jcraig: resolution does not need to be as high, but high
   framerate is critical with sign language... easy to lose
   context with dropped frames

  DAPT

   nigel: this spec's origins go back a few years
   … TTML2 could trigger audio playback, pitch, etc, audio mixing
   etc
   … but in general TTML is a TT format... I tried to do an AD
   variant, but it did not get broad uptake
   … so DAPT is AD plus mainstream dubbing as use cases
   … and other uses
   … thinking of production workflows, video will be commissioned
   and produced... Loc and AD comes later as a second step
   … usually need a transcript.... for SDH subtitles or localized
   translation subtitles
   … cyril said these processes are sometimes too removed, and the
   dubbing plus translation can be mismatched
   … trying to convince content producers to move the
   transcription process earlier in the chain
   … a lot of the service providers use proprietary tools

   <Zakim> jcraig, you wanted to mention FCC DAC report

   jcraig: I can share afterwards - I'm Apple's rep to the FCC
   disability advisory committee
   … and worked on a report for the commission with other people,
   which is public, I'll share the link later,
   … which is effectively guidelines and recommendations for
   broadcasters and streamers for how
   … to do exactly this, and which specific resources should be
   deployed widely with the original video.
   … A lot of time the contracts for production do not include the
   accessible alternatives, for subtitles,
   … descriptions, translations etc.
   … So then when the content goes to cable providers etc., the
   recommendations talk about this
   … particular item, the audio description transcript and ideally
   timing, as well as the dialogue,
   … should as much as possible be considered and distributed by
   the original distributor, to avoid this mismatch.
   … Which avoids rework and mismatch when there's already prior
   work that's been done.
   … Redub with different transcripts etc cause those problems.

   nigel: Chris intro?

   cpn: Chris Needham: BBC, Chair Media WG

   nigel: broadly speaking, DAPT useful as a production tool
   … for Timed Text, audio, etc

   <ray-schwartz9> Need to head to another meeting. Thanks for
   letting me sit in!

   nigel: mostly upstream of something that would go to the client
   devices, but DAPT could go directly to the player, .. for
   braille or TTS, local audio mix, etc.
   … including pans, levels, etc

   nigel: intro?

   youenn: Youenn Fablet, Apple

   nigel: doc includes examples to help people understand the use
   cases
   … tracking for translation, the current language and original
   lang ("pivot languages?")
   … e.g. Norwegian to Hebrew... probably passed through English as
   a pivot language
   … so by tracking through this, you may have a better idea of
   how to avoid or correct mistranslations
   … metadata describes characters (the type portrayed by actors)
   and other info
   … metadata could differentiate visual description vs
   transliteration of text rendered visually on screen... (time or
   location chyrons, as an example)
   … [scanning through the document]... showing timed text example
   of AD... along with mixing data
   … also can include prerecorded audio
   … [showing Gain attribute data]
   … result is that it ducks the main program audio while AD mix
   is played, and re-raises the gain after the "ducking"
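
   The ducking behaviour described above can be illustrated with a
   small gain envelope. This is a minimal sketch under stated
   assumptions, not the DAPT or TTML2 syntax shown in the meeting:
   the function name ducked_gain, the linear ramps, and the default
   ducked level and ramp length are all hypothetical choices made for
   illustration.

```python
def ducked_gain(t: float, ad_begin: float, ad_end: float,
                ducked: float = 0.5, ramp: float = 0.5) -> float:
    """Gain multiplier for the main programme audio at time t (seconds).

    Assumed behaviour (illustrative only): gain is 1.0 away from the
    description, ramps down linearly over `ramp` seconds before the
    description begins, holds at `ducked` while it plays, and ramps
    back up linearly over `ramp` seconds after it ends.
    """
    if t <= ad_begin - ramp or t >= ad_end + ramp:
        return 1.0                       # outside the ducked region
    if t < ad_begin:                     # ramping down before the description
        frac = (ad_begin - t) / ramp
        return ducked + (1.0 - ducked) * frac
    if t <= ad_end:                      # description is playing
        return ducked
    frac = (t - ad_end) / ramp           # ramping back up afterwards
    return ducked + (1.0 - ducked) * frac
```

   Sampling this function at the audio frame rate gives the
   multiplier to apply to the programme audio; a real deployment
   would take the levels and timings from the mix data in the
   document instead of fixed defaults.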

   <Zakim> jcraig, you wanted to ask about ducking prefs

   jcraig: Screen readers often have a setting for ducking audio,
   not used when there's pre-mixed audio
   … Is there more data here than just the gain, like a context,
   like "this is a ducking transition",
   … because that would potentially allow the user preference for
   ducking.
   … Is there semantic information about why the transitions are
   happening?

   cyril: I don't think we thought about that use case, semantic
   signalling,
   … but I see that it could easily be added - TTML is easily
   extended, either in the spec or with
   … proprietary information.

   jcraig: Talking about sub-channel audio for a baseball game
   earlier, some people might want
   … to hear that in the same room as others who do not.
   … That mixed data could be deployed to a different channel or
   speaker.
   … That type of semantic metadata could also apply.

   rwarren2: A friend who has gone blind late in his life: enjoys
   baseball,
   … but now it's not on the radio, there's a change in the
   announcement style
   … It's frustrating because they no longer know what the action
   is, because the assumption
   … is that you can see what's happening.

   jcraig: Anecdotally, I have a lot of blind friends into
   baseball, who would like that. My assumption is because the
   positions don't move, and you can build a mental picture based
   on action that is described well, like it used to be on radio.

   nigel: Irish commission researched about appropriate ducking
   levels based on how loud the program audio is, how much to
   duck, and how loud the AD should be.

   [8]Investigating a Standardised Approach to Setting Audio
   Description Dip Values

      [8] https://www.cnam.ie/wp-content/uploads/2023/06/20230619_Investigating-a-Standardised-Approach-to-Setting-Audio-Description-Dip-Values_vFINAL-1.pdf


   nigel: so that the background programme sound does not drown
   out the description
   … so "one size" does not "fit all" when it comes to audio
   ducking

   <Zakim> nigel, you wanted to react to jcraig to answer that

   nigel: anecdotal data point, visited VRT in Belgium, who would
   hand-tweak gain to allow un-ducking relevant noise ("door
   opening") during AD dialog, to improve understandability

   wschi: how do you stream the XML?

   nigel: could be one big file...
   … Or MPEG-DASH, HLS, etc.

   <wschi> ST2110-43#

   cyril: RTP payload ST2110-43

   wschi: could be very high bitrate?

   cyril: might be similar to a lower bitrate for voice-only (not
   full mix)

   nigel: which options would we need?

   jcraig: saw one anti-pattern with a streamer who deployed Dolby
   Atmos, but the AD track was flattened, mixed to stereo

   Nigel: was on AD examples... There are also Dubbing examples

   cyril: focus on AD to ask for feedback?

   nigel: structure includes data model separate from the TTML
   … recording or synthesized, with optional mix data
   … within the spec, each class or object type is described... no
   need to have a full understanding of TTML to understand it.

   cyril: request feedback on AD... are there use cases not
   included? identifying gaps, etc?

   wschi: very expressive about audio features... are there
   interactive (user pref) controls about how that mix would work?

   jcraig: Games are often very customisable, different sliders
   for different game sound effects.
   … Even tweaks for things that might be considered triggers or
   scare warnings,
   … that level of distinction.
   … All custom, but deployed because users are asking for those
   features.

   nigel: implementations... authoring... conversion tools, etc
   … expecting more activity in order to meet the goals of the
   community need

   wschi: re: deployment, is NGA not there yet?

   nigel: not dependent on the format...

   nigel: perhaps URI or fragment id for this?

   cyril: I don't think there is a standard in ISOBMFF? to spec a
   subtrack of a subtrack?

   jernoble: for HLS, there are variants , not really tracks...

   Nigel: tech discussion should continue into the hallway

   cyril: please review and provide feedback

   nigel: also discussing related topics tomorrow
   … hope to get to CR soon

   jcraig: The FCC Disability Advisory Committee (DAC) report on
   "Audio Description File Transmittal for Internet Protocol
   Delivered Video Programming"
   [9]https://www.fcc.gov/ecfs/document/10208388924441/1
   … Word/PDF linked from under the Recommendations heading:
   [10]https://www.fcc.gov/audio-description

   … most relevant, the section at the end on "Potential
   Opportunities in the Audio Description Ecosystem for
   Participants and the
   … Commission" covers recommendations like:
   … - Encourage vendors to provide and content creators to
   request AD scripts with timestamps in addition to the AD audio
   files.
   … - Encourage vendors to deliver these unmixed [AD] audio files
   to stakeholders.

      [9] https://www.fcc.gov/ecfs/document/10208388924441/1

     [10] https://www.fcc.gov/audio-description


  Meeting Close

   nigel: Thank you everyone, very interesting discussion points,
   we're out of time [adjourns meeting]


    Minutes manually created (not a transcript), formatted by
    [11]scribe.perl version 229 (Thu Jul 25 08:38:54 2024 UTC).

     [11] https://w3c.github.io/scribe2/scribedoc.html

Received on Thursday, 26 September 2024 18:35:30 UTC