{Minutes} Audio Description Community Group, Timed Text Working Group Joint Meeting 2024-09-26 from Nigel Megitt on 2024-09-26 (public-tt@w3.org from September 2024)

From: Nigel Megitt <nigel.megitt@bbc.co.uk>
Date: Thu, 26 Sep 2024 18:35:14 +0000
To: "public-tt@w3.org" <public-tt@w3.org>, "public-audio-description@w3.org" <public-audio-description@w3.org>
Message-ID: <EF9A9927-C634-4D7E-B455-283B52008672@bbc.co.uk>

Thanks all for attending today’s joint meeting of Audio Description Community Group and Timed Text Working Group. Minutes can be found in HTML format at https://www.w3.org/2024/09/26-tt-minutes.html

In plain text:

[1]W3C

[1] https://www.w3.org/

Audio Description Community Group, Timed Text Working Group Joint
Meeting

26 September 2024

[2]Agenda. [3]IRC log.

[2] https://www.w3.org/wiki/TimedText/tpac2024#Thursday_26th_September

[3] https://www.w3.org/2024/09/26-tt-irc

Attendees

Present
Adam_Page, atsushi, bernd, Eric_Carlson, Hiroshi_Ohta, jcraig,
Jer_Noble, Nigel_Megitt, Nikolaus Färber, ray-schwartz9,
rwarren2, Wolfgang Schildbach (observing)

Regrets
-

Chair
Nigel

Scribe
nigel, jcraig

Contents

1. Introductions
2. Agenda
3. DAPT
4. Meeting Close

Meeting minutes

Introductions

Nigel: Nigel Megitt, BBC, chair ADCG, co-chair TTWG, one of the
Editors of DAPT

ray-schwartz9: Ray Schwartz: he/him, NFCU, member of ARIA

gabriel: eng on MS Edge, part of Web Audio

atsushi: w3c contact TTWG

niko: Nikolaus Färber, Media and Entertainment Interest Group

bernd: member of Media and Entertainment Interest Group, and
WICG

jcraig: James Craig, Apple Accessibility, member of TTWG,
interested in audio descriptions, most active in ARIA

Adam_Page: Hilton Accessibility ARIA WG

cyril: Netflix, TTWG Co-editor

<atsushi> Hiroshi_Ohta: from LINE Yahoo Corp.

reinaldoferraz: Reinaldo Ferraz, W3C chapter Sao Paulo,
observer

sprang: Google Meet, Observer

<reinaldoferraz> NIC.br

wschi: Weiwei Xu, Huawei, Media Standard Department

nigel: intros ahead of schedule
… any other agenda topics?

Agenda

DAPT profiles in TTML

Authoring and production workflows for Loc and AD

no other agenda items

cyril: have others deployed AD recently?
… MPEG-H, coding audio descriptions, etc

wschi: used in broadcast and to mix them in encode/decode

mixing not supported in browser yet

cyril: used in VOD

Robert Warren: niko agriculture, interest in the humanities

jcraig: To Cyril's question, Apple has a number of different
audio description features
… as a streaming service provider with Apple TV+, I have no
part in it but am very proud of the work
… we do with AD and captions. Most of the Apple Original
content has 9 AD languages and 40 caption tracks.
… On the product side, there are a number of features related
to AD and captions that most people
… are not aware of. For example if you are blind and have a
screen reader you can choose to have
… captions Brailled or spoken live (for translation).
… Experimented with something similar for audio descriptions.
… Eric Carlson and I demoed that 2 years ago at TPAC.
… Take AD track type of the web <video> element, parsing it on
the fly and either
… speaking it or Brailling it. Silent descriptions sent live to
the Braille display.
… Someone who is deafblind could enjoy them widely.
… Not deployed widely, just a tech demo
… Love to get more interest in it, add the Braille support.
… That demo was a custom implementation of WebVTT in the video
player

Adam_Page: Hilton deploying more AD to the website

Adam_Page: Another data point. At Hilton, not a big video
platform,
… baking the audio track in

jcraig: Second track is standard way to deploy AD now, with a
tag saying it's AD, to support
… auto selection
… Technically just a standard audio track

Adam_Page: user chooses the preferred track
… most require extended.

jcraig: One of the things was descriptions longer than the
natural gap in the audio, e.g.
… extended descriptions, we demoed auto pause of video in the
player when that happened.
… Have not seen a lot of deployments of that.
… I think WGBH has some demos of extended description

jcraig: demo in Vancouver was extended lecture paused to
describe a chart
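
As a rough illustration of the extended-description behaviour
mentioned above (pausing the video when a description is longer
than the natural gap in the audio), the pause length can be
sketched as below. The function name and parameters are
hypothetical and were not discussed in the meeting; a real player
would also need to handle seeking, playback rate and overlapping
descriptions.

```python
def extended_pause(gap_start: float, gap_end: float,
                   description_duration: float) -> float:
    """Seconds the video would pause so the description can finish.

    gap_start / gap_end: media times (seconds) bounding the natural
    gap in the programme audio (hypothetical inputs).
    description_duration: length of the spoken description in seconds.
    """
    gap = gap_end - gap_start
    # No pause is needed when the description fits inside the gap.
    return max(0.0, description_duration - gap)


if __name__ == "__main__":
    # A 5 s description in a 3 s gap needs the video held for 2 s.
    print(extended_pause(10.0, 13.0, 5.0))
```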

nigel: BBC deploys choice between pre-mixed audio track with AD
versus w/o
… also deploying a dry AD track (not mixed with main audio)
plus mix data
… DVB-T is widely deployed and supported in the UK
… transport stream is specified in the UK's "d-book"

cyril: UK-specific technology?

nigel: yes, since 2006 or so

<rwarren2> "Widely available for a large amount of money" ;)

nigel: That's the broadcast standard
… online we deploy separate video files like Hilton
… lately starting to deploy Live descriptions... timing is an
artform.. the describers research ahead of time

cyril: is the describer local or remote?

nigel: either... third-party service

Adam_Page: English only?

nigel: yes

wschi: thinking of replicating the live AD use case into
browsers as well?

nigel: 3-4 yrs ago, demoed TTML2 with live mix instructions in
the browser
… could be used for live broadcast, too
… tech demo can mix two audio tracks using mix data (well)
and/or less-well mixed with text-to-speech... (generated speech
synthesis)
… could deploy as sADM, or we may deploy as a custom
implementation in the BBC player

niko: NGA can include AD, and spatial position to separate
Object-based audio

bernd: demoed and discussed in the Media and Entertainment
Interest Group this past Monday....

eric_carlson: WebKit eng at Apple, inc TT

jernoble: Jer Noble... WebKit Engineer at Apple, and TT

cyril: how widely deployed is AD around the world?
… Are there countries with no AD? distribution?

nigel: BBC audio describes over 20% of our programmes,
regulatory requirement is 10%.
… other countries do some percentage

jcraig: smaller percentages

cyril: is AD deployed widely in Japan?

Hiroshi_Ohta: audio subchannels are popular in Japan... for
example, background data about baseball players during
games.... Not as widely deployed for AD for the Blind

nigel: most recent Olympic Games in Japan included additional
data (NHK?) on the subchannel generating AD about the scores

??: not sure of which subchannels are auto-generated or not?

nigel: one development in AD that has been gaining in use is
synthesized voices
… there is an advocacy group in UK that has been running user
test experiments
… Royal National Institute for the Blind (RNIB)

Nigel: new attendee?

dana: Hi!. I'm Dana. I work on WebKit.

jcraig: Deployments in Japan: more common with streaming
services.
… Japanese is one of the languages that Apple Original content
localises to
… HBO is starting to ramp up as well, and starting to lead the
way with signed / PiP / ASL movies,
… being deployed as separate video files because there isn't a
way to compose the dry components
… and keep them in sync

nigel: on the tangent of sign interpretation, there is a new
regulatory requirement in Spain
… 3cat (Catalonian broadcaster) recently demoed an HbbTV
receiver?
… Got the signing stream over IP and recomposited,
implementation in WASM
… I think the resolution of the signed video was lower than the
main broadcast video

jcraig: resolution does not need to be as high, but high
framerate is critical with sign language... easy to lose
context with dropped frames

DAPT

nigel: this spec has origins going back a few years
… TTML2 could trigger audio playback, pitch, etc, audio mixing
etc
… but in general TTML is a TT format... I tried to do an AD
variant, but it did not have as broad uptake
… so DAPT is AD plus mainstream dubbing as use cases
… and other uses
… thinking of production workflows, video will be commissioned
and produced... Loc and AD come later as a second step
… usually need a transcript.... for SDH subtitles or localized
translation subtitles
… cyril said these processes are sometimes too removed, and the
dubbing plus translation can be mismatched
… trying to convince content producers to move the
transcription process earlier in the chain
… a lot of the service providers use proprietary tools

<Zakim> jcraig, you wanted to mention FCC DAC report

jcraig: I can share afterwards - I'm Apple's rep to the FCC
disability advisory committee
… and worked on a report for the commission with other people,
which is public, I'll share the link later,
… which is effectively guidelines and recommendations for
broadcasters and streamers for how
… to do exactly this, and which specific resources should be
deployed widely with the original video.
… A lot of time the contracts for production do not include the
accessible alternatives, for subtitles,
… descriptions, translations etc.
… So then when the content goes to cable providers etc., the
recommendations talk about this
… particular item, the audio description transcript and ideally
timing, as well as the dialogue,
… should as much as possible be considered and distributed by
the original distributor, to avoid this mismatch.
… Which avoids rework and mismatch when there's already prior
work that's been done.
… Redubbing with different transcripts etc. causes those
problems.

nigel: Chris intro?

cpn: Chris Needham: BBC, Chair Media WG

nigel: broadly speaking, DAPT useful as a production tool
… for Timed Text, audio, etc

<ray-schwartz9> Need to head to another meeting. Thanks for
letting me sit in!

nigel: mostly upstream of something that would go to the client
devices, but DAPT could go directly to the player, .. for
braille or TTS, local audio mix, etc.
… including pans, levels, etc

nigel: intro?

youenn: Youenn Fablet, Apple

nigel: doc includes examples to help people understand the use
cases
… tracking for translation, the current language and original
lang ("pivot languages?")
… e.g. Norwegian to Hebrew.... probably passed through English
as a pivot language
… so by tracking through this, you may have a better idea of
how to avoid or correct mistranslations
… metadata describes characters (the type portrayed by actors)
and other info
… metadata could differentiate visual description vs
transliteration of text rendered visually on screen... (time or
location chyrons, as an example)
… [scanning through the document]... showing timed text example
of AD... along with mixing data
… also can include prerecorded audio
… [showing Gain attribute data]
… result is that it ducks the main program audio while AD mix
is played, and re-raises the gain after the "ducking"

<Zakim> jcraig, you wanted to ask about ducking prefs

jcraig: Screen readers often have a setting for ducking audio,
not used when there's pre-mixed audio
… Is there more data here than just the gain, like a context,
like "this is a ducking transition",
… because that would potentially allow the user preference for
ducking.
… Is there semantic information about why the transitions are
happening?

cyril: I don't think we thought about that use case, semantic
signalling,
… but I see that it could easily be added - TTML is easily
extended, either in the spec or with
… proprietary information.

jcraig: Talking about sub-channel audio for a baseball game
earlier, some people might want
… to hear that in the same room as others who do not.
… That mixed data could be deployed to a different channel or
speaker.
… That type of semantic metadata could also apply.

rwarren2: A friend who has gone blind late in his life: enjoys
baseball,
… but now it's not on the radio, there's a change in the
announcement style
… It's frustrating because they no longer know what the action
is, because the assumption
… is that you can see what's happening.

jcraig: Anecdotally, I have a lot of blind friends into
baseball, who would like that. My assumption is because the
positions don't move, and you can build a mental picture based
on action that is described well, like it used to be on radio.

nigel: Irish commission researched appropriate ducking levels
based on how loud the program audio is, how much to duck, and
how loud the AD should be.

[8]Investigating a Standardised Approach to Setting Audio
Description Dip Values

[8] https://www.cnam.ie/wp-content/uploads/2023/06/20230619_Investigating-a-Standardised-Approach-to-Setting-Audio-Description-Dip-Values_vFINAL-1.pdf

nigel: so that the background programme sound does not drown
out the description
… so "one size" does not "fit all" when it comes to audio
ducking

<Zakim> nigel, you wanted to react to jcraig to answer that

nigel: anecdotal data point: when I visited VRT in Belgium, they
would hand-tweak the gain to allow un-ducking relevant noise
("door opening") during AD dialog, to improve understandability

wschi: how do you stream the XML?

nigel: could be one big file...
… Or MPEG-DASH, HLS, etc.

<wschi> ST2110-43

cyril: RTP payload ST2110-43

wschi: could be very high bitrate?

cyril: might be similar to a lower bitrate for voice-only (not
full mix)

nigel: which options would we need?

jcraig: saw one anti-pattern with a streamer who deployed Dolby
Atmos, but the AD track was flattened and mixed to stereo

Nigel: was on AD examples... There are also Dubbing examples

cyril: focus on AD to ask for feedback?

nigel: structure includes data model separate from the TTML
… recording or synthesized, with optional mix data
… within the spec, each class or object type is described... no
need to have a full understanding of TTML to understand it.

cyril: request feedback on AD... are there use cases not
included? identifying gaps, etc?

wschi: very expressive about audio features... are there
interactive (user pref) controls about how that mix would work?

jcraig: Games are often very customisable, different sliders
for different game sound effects.
… Even tweaks for things that might be considered triggers or
scare warnings,
… that level of distinction.
… All custom, but deployed because users are asking for those
features.

nigel: implementations... authoring... conversion tools, etc
… expecting more activity in order to meet the goals of the
community need

wschi: re: deployment, is NGA not there yet?

nigel: not dependent on the format...

nigel: perhaps URI or fragment id for this?

cyril: I don't think there is a standard in ISOBMFF? to spec a
subtrack of a subtrack?

jernoble: for HLS, there are variants, not really tracks...

Nigel: tech discussion should continue into the hallway

cyril: please review and provide feedback

nigel: also discussing related topics tomorrow
… hope to get to CR soon

jcraig: The FCC Disability Advisory Committee (DAC) report on
"Audio Description File Transmittal for Internet Protocol
Delivered Video Programming":
[9]https://www.fcc.gov/ecfs/document/10208388924441/1
… Word/PDF linked from under the Recommendations heading:
[10]https://www.fcc.gov/audio-description
… most relevant, the section at the end on "Potential
Opportunities in the Audio Description Ecosystem for
Participants and the Commission" covers recommendations like:
… - Encourage vendors to provide and content creators to
request AD scripts with timestamps in addition to the AD audio
files.
… - Encourage vendors to deliver these unmixed [AD] audio files
to stakeholders.

[9] https://www.fcc.gov/ecfs/document/10208388924441/1

[10] https://www.fcc.gov/audio-description

Meeting Close

nigel: Thank you everyone, very interesting discussion points,
we're out of time [adjourns meeting]

Minutes manually created (not a transcript), formatted by
[11]scribe.perl version 229 (Thu Jul 25 08:38:54 2024 UTC).

[11] https://w3c.github.io/scribe2/scribedoc.html

Received on Thursday, 26 September 2024 18:35:30 UTC