- From: Nigel Megitt <nigel.megitt@bbc.co.uk>
- Date: Thu, 26 Sep 2024 18:35:14 +0000
- To: "public-tt@w3.org" <public-tt@w3.org>, "public-audio-description@w3.org" <public-audio-description@w3.org>
- Message-ID: <EF9A9927-C634-4D7E-B455-283B52008672@bbc.co.uk>
Thanks all for attending today’s joint meeting of Audio Description Community Group and Timed Text Working Group. Minutes can be found in HTML format at https://www.w3.org/2024/09/26-tt-minutes.html
In plain text:
[1]W3C
[1] https://www.w3.org/
Audio Description Community Group, Timed Text Working Group Joint
Meeting
26 September 2024
[2]Agenda. [3]IRC log.
[2] https://www.w3.org/wiki/TimedText/tpac2024#Thursday_26th_September
[3] https://www.w3.org/2024/09/26-tt-irc
Attendees
Present
Adam_Page, atsushi, bernd, Eric_Carlson, Hiroshi_Ohta, jcraig,
Jer_Noble, Nigel_Megitt, Nikolaus Färber, ray-schwartz9,
rwarren2, Wolfgang Schildbach
(observing)
Regrets
-
Chair
Nigel
Scribe
nigel, jcraig
Contents
1. [4]Introductions
2. [5]Agenda
3. [6]DAPT
4. [7]Meeting Close
Meeting minutes
Introductions
Nigel: Nigel Megitt, BBC, chair ADCG, co-chair TTWG, one of the
Editors of DAPT
ray-schwartz9: Ray Schwartz: he/him NFCU, member of ARIA
gabriel: eng on MS Edge, part of Web Audio
atsushi: w3c contact TTWG
niko: Nikolaus Färber, Media and Entertainment Interest Group
bernd: member of Media and Entertainment Interest Group, and
WICG
jcraig: James Craig, Apple Accessibility, member of TTWG,
interested in audio descriptions, most active in ARIA
Adam_Page: Hilton Accessibility ARIA WG
cyril: Netflix, TTWG Co-editor
<atsushi> Hiroshi_Ohta: from LINE Yahoo Corp.
reinaldoferraz: Reinaldo Ferraz, W3C chapter Sao Paulo,
observer
sprang: Google Meet, Observer
<reinaldoferraz> NIC.br
wschi: Weiwei Xu, Huawei, Media Standard Department
nigel: intros ahead of schedule
… any other agenda topics?
Agenda
DAPT profiles in TTML
Authoring and production workflows for Loc and AD
no other agenda items
cyril: have others deployed AD in recent deployments
… MPEG-H, coding audio descriptions, etc
wschi: used in broadcast and to mix them in encode/decode
… mixing not supported in browser yet
cyril: used in VOD
Robert Warren: niko agriculture, interest in the humanities
jcraig: To Cyril's question, Apple has a number of different
audio description features
… as a streaming service provider with Apple TV+, I have no
part of it but very proud of the work
… we do with AD and captions. Most of the Apple Original
content has 9 AD languages and 40 caption tracks.
… On the product side, there are a number of features related
to AD and captions that most people
… are not aware of. For example if you are blind and have a
screen reader you can choose to have
… captions Brailled or spoken live (for translation).
… Experimented with something similar for audio descriptions.
… Eric Carlson and I demoed that 2 years ago in TPAC.
… Take AD track type of the web <video> element, parsing it on
the fly and either
… speaking it or Brailling it. Silent descriptions sent live to
the Braille display.
… Someone who is deafblind could enjoy them widely.
… Not deployed widely, just a tech demo
… Love to get more interest in it, add the Braille support.
… That demo was a custom implementation of WebVTT in the video
player
Adam_Page: Hilton deploying more AD to the website
Adam_Page: Another data point. At Hilton, not a big video
platform,
… baking the audio track in
jcraig: Second track is standard way to deploy AD now, with a
tag saying it's AD, to support
… auto selection
… Technically just a standard audio track
Adam_Page: user chooses the preferred track
… most require extended.
jcraig: One of the things was descriptions longer than the
natural gap in the audio, e.g.
… extended descriptions, we demoed auto pause of video in the
player when that happened.
… Have not seen a lot of deployments of that.
… I think WGBH has some demos of extended description
jcraig: demo in Vancouver was extended lecture paused to
describe a chart
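To illustrate the auto-pause idea for extended descriptions mentioned above, here is a minimal sketch in Python. It assumes the player knows where the natural gap in the programme audio starts and ends, and how long the description runs. The function name and parameters are hypothetical and are not taken from any player or specification discussed in the meeting.

```python
def pause_needed(gap_start: float, gap_end: float,
                 description_duration: float) -> float:
    """Return how long (in seconds) the video would need to pause.

    Hypothetical helper: if the description fits in the natural gap,
    no pause is needed; otherwise the video pauses for the overrun.
    """
    gap = gap_end - gap_start
    return max(0.0, description_duration - gap)


# A 5 s description in a 3 s gap overruns by 2 s.
print(pause_needed(10.0, 13.0, 5.0))
# A 2 s description in a 3 s gap needs no pause.
print(pause_needed(10.0, 13.0, 2.0))
```

A real player would also need to decide when to resume and how to keep other tracks in sync, which this sketch does not cover.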
nigel: BBC deploys choice between pre-mixed audio track with AD
versus w/o
… also deploying a dry AD track (not mixed with main audio)
plus mix data
… DVB-T is widely deployed and supported in the UK
… transport stream is specified in the UK's "d-book"
cyril: UK-specific technology?
nigel: yes, since 2006 or so
<rwarren2> "Widely available for a large amount of money" ;)
nigel: That's the broadcast standard
… online we deploy separate video files like Hilton
… lately starting to deploy Live descriptions... timing is an
artform.. the describers research ahead of time
cyril: is the describer local or remote?
nigel: either... third-party service
Adam_Page: English only?
nigel: yes
wschi: thinking of replicating the live AD use case into
browsers as well?
nigel: 3-4 yrs ago, demoed TTML2 with live mix instructions in
the browser
… could be used for live broadcast, too
… tech demo can mix two audio tracks using mix data (well)
and/or less-well mixed with text-to-speech .. (generated speech
synthesis)
… could deploy as sADM, or we may deploy as a custom
implementation in the BBC player
niko: NGA can include AD, and spatial position to separate
Object-based audio
bernd: demo and discussed in the Media and Entertainment
Interest Group this past Monday....
eric_carlson: WebKit eng at Apple, inc TT
jernoble: Jer Noble... WebKit Engineer at Apple, and TT
cyril: how widely deployed is AD around the world?
… Are there countries with no AD? distribution?
nigel: BBC audio describes over 20% of our programmes,
regulatory requirement is 10%.
… other countries do some percentage
jcraig: smaller percentages
cyril: is AD deployed widely in Japan?
Hiroshi_Ohta: audio subchannels are popular in Japan... for
example, background data about baseball players during
games.... Not as widely deployed for AD for the Blind
nigel: most recent Olympic Games in Japan included additional
data (NHK?) on the subchannel generating AD about the scores
??: not sure of which subchannels are auto-generated or not?
nigel: one development in AD that has been gaining in use is
synthesized voices
… there is an advocacy group in UK that has been running user
test experiments
… Royal National Institute for the Blind (RNIB)
Nigel: new attendee?
dana: Hi! I'm Dana. I work on WebKit.
jcraig: Deployments in Japan: more common with streaming
services.
… Japanese is one of the languages that Apple Original content
localises to
… HBO is starting to ramp up as well, and starting to lead the
way with signed / PiP / ASL movies,
… being deployed as separate video files because there isn't a
way to compose the dry components
… and keep them in sync
nigel: on the tangent of sign interpretation, there is a new
regulatory requirement in Spain
… 3cat (Catalonian broadcaster) recently demoed an HbbTV
receiver?
… Got the signing stream over IP and recomposited,
implementation in WASM
… I think the resolution of the signed video was lower than the
main broadcast video
jcraig: resolution does not need to be as high, but high
framerate is critical with sign language... easy to lose
context with dropped frames
DAPT
nigel: this spec has origins going back a few years
… TTML2 could trigger audio playback, pitch, etc, audio mixing
etc
… but in general TTML is a TT format... I tried to do an AD
variant, but it did not get as broad an uptake
… so DAPT is AD plus mainstream dubbing as use cases
… and other uses
… thinking of production workflows, video will be commissioned
and produced... Loc and AD comes later as a second step
… usually need a transcript.... for SDH subtitles or localized
translation subtitles
… cyril said these processes are sometimes too removed, and the
dubbing plus translation can be mismatched
… trying to convince content producers to move the
transcription process earlier in the chain
… a lot of the service providers use proprietary tools
<Zakim> jcraig, you wanted to mention FCC DAC report
jcraig: I can share afterwards - I'm Apple's rep to the FCC
disability advisory committee
… and worked on a report for the commission with other people,
which is public, I'll share the link later,
… which is effectively guidelines and recommendations for
broadcasters and streamers for how
… to do exactly this, and which specific resources should be
deployed widely with the original video.
… A lot of time the contracts for production do not include the
accessible alternatives, for subtitles,
… descriptions, translations etc.
… So then when the content goes to cable providers etc., the
recommendations talk about this
… particular item, the audio description transcript and ideally
timing, as well as the dialogue,
… should as much as possible be considered and distributed by
the original distributor, to avoid this mismatch.
… Which avoids rework and mismatch when there's already prior
work that's been done.
… Redub with different transcripts etc cause those problems.
nigel: Chris intro?
cpn: Chris Needham: BBC, Chair Media WG
nigel: broadly speaking, DAPT useful as a production tool
… for Timed Text, audio, etc
<ray-schwartz9> Need to head to another meeting. Thanks for
letting me sit in!
nigel: mostly upstream of something that would go to the client
devices, but DAPT could go directly to the player, .. for
braille or TTS, local audio mix, etc.
… including pans, levels, etc
nigel: intro?
youenn: Youenn Fablet, Apple
nigel: doc includes examples to help people understand the use
cases
… tracking for translation, the current language and original
lang ("pivot languages?")
… e.g. Norwegian to Hebrew.... probably passed through English as
a pivot language
… so by tracking through this, you may have a better idea of
how to avoid or correct mistranslations
… metadata describes characters (the type portrayed by actors)
and other info
… metadata could differentiate visual description vs
transliteration of text rendered visually on screen... (time or
location chyrons, as an example)
… [scanning through the document]... showing timed text example
of AD... along with mixing data
… also can include prerecorded audio
… [showing Gain attribute data]
… result is that it ducks the main program audio while AD mix
is played, and re-raises the gain after the "ducking"
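To illustrate the ducking behaviour described above, here is a minimal sketch in Python. It is not DAPT or TTML syntax, only the arithmetic of lowering and re-raising the programme gain around a description. The class name, function name, default gain and linear ramp are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class DuckingCue:
    """Hypothetical cue: when AD audio plays and how far to duck."""
    begin: float               # seconds, AD audio starts
    end: float                 # seconds, AD audio ends
    ducked_gain: float = 0.4   # assumed linear gain during the AD
    ramp: float = 0.3          # assumed fade time in seconds


def programme_gain(t: float, cues: list[DuckingCue]) -> float:
    """Linear gain for the main programme audio at time t."""
    gain = 1.0
    for cue in cues:
        if cue.begin - cue.ramp <= t < cue.begin:
            # Fading down just before the description starts.
            frac = (t - (cue.begin - cue.ramp)) / cue.ramp
            g = 1.0 + frac * (cue.ducked_gain - 1.0)
        elif cue.begin <= t <= cue.end:
            # Fully ducked while the description plays.
            g = cue.ducked_gain
        elif cue.end < t <= cue.end + cue.ramp:
            # Re-raising the gain after the description ends.
            frac = (t - cue.end) / cue.ramp
            g = cue.ducked_gain + frac * (1.0 - cue.ducked_gain)
        else:
            g = 1.0
        # If cues overlap, the deepest duck applies.
        gain = min(gain, g)
    return gain


cues = [DuckingCue(begin=10.0, end=15.0, ducked_gain=0.4, ramp=0.5)]
for t in (0.0, 9.75, 12.0, 15.25, 20.0):
    print(t, round(programme_gain(t, cues), 3))
```

A fixed ducked gain is used here only for simplicity. In practice the appropriate level depends on the programme audio.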
<Zakim> jcraig, you wanted to ask about ducking prefs
jcraig: Screen readers often have a setting for ducking audio,
not used when there's pre-mixed audio
… Is there more data here than just the gain, like a context,
like "this is a ducking transition",
… because that would potentially allow the user preference for
ducking.
… Is there semantic information about why the transitions are
happening?
cyril: I don't think we thought about that use case, semantic
signalling,
… but I see that it could easily be added - TTML is easily
extended, either in the spec or with
… proprietary information.
jcraig: Talking about sub-channel audio for a baseball game
earlier, some people might want
… to hear that in the same room as others who do not.
… That mixed data could be deployed to a different channel or
speaker.
… That type of semantic metadata could also apply.
rwarren2: A friend who has gone blind late in his life: enjoys
baseball,
… but now it's not on the radio, there's a change in the
announcement style
… It's frustrating because they no longer know what the action
is, because the assumption
… is that you can see what's happening.
jcraig: Anecdotally, I have a lot of blind friends into
baseball, who would like that. My assumption is because the
positions don't move, and you can build a mental picture based
on action that is described well, like it used to be on radio.
nigel: Irish commission researched appropriate ducking
levels based on how loud the program audio is, how much to
duck, and how loud the AD should be.
[8]Investigating a Standardised Approach to Setting Audio
Description Dip Values
[8] https://www.cnam.ie/wp-content/uploads/2023/06/20230619_Investigating-a-Standardised-Approach-to-Setting-Audio-Description-Dip-Values_vFINAL-1.pdf
nigel: so that the background programme sound does not drown
out the description
… so "one size" does not "fit all" when it comes to audio
ducking
<Zakim> nigel, you wanted to react to jcraig to answer that
nigel: anecdotal data point: when I visited VRT in Belgium, they
would hand-tweak the gain to un-duck relevant noise ("door
opening") during AD dialog, to improve understandability
wschi: how do you stream the XML?
nigel: could be one big file...
… Or MPEG-DASH, HLS, etc.
<wschi> ST2110-43#
cyril: RTP payload ST2110-43
wschi: could be very high bitrate?
cyril: might be similar to a lower bitrate for voice-only (not
full mix)
nigel: which options would we need?
jcraig: saw one anti-pattern with a streamer who deployed Dolby
Atmos, but the AD track was flattened and mixed to stereo
Nigel: was on AD examples... There are also dubbing examples
cyril: focus on AD to ask for feedback?
nigel: structure includes data model separate from the TTML
… recording or synthesized, with optional mix data
… within the spec, each class or object type is described... no
need to have a full understanding of TTML to understand it.
cyril: request feedback on AD... are there use cases not
included? identifying gaps, etc?
wschi: very expressive about audio features... are there
interactive (user pref) controls about how that mix would work?
jcraig: Games are often very customisable, different sliders
for different game sound effects.
… Even tweaks for things that might be considered triggers or
scare warnings,
… that level of distinction.
… All custom, but deployed because users are asking for those
features.
nigel: implementations... authoring... conversion tools, etc
… expecting more activity in order to meet the goals of the
community need
wschi: re: deployment, is NGA not there yet?
nigel: not dependent on the format...
nigel: perhaps URI or fragment id for this?
cyril: I don't think there is a standard in ISOBMFF? to spec a
subtrack of a subtrack?
jernoble: for HLS, there are variants, not really tracks...
Nigel: tech discussion should continue into the hallway
cyril: please review and provide feedback
nigel: also discussing related topics tomorrow
… hope to get to CR soon
jcraig: The FCC Disability Advisory Committee (DAC) report on
"Audio Description File Transmittal for Internet Protocol
Delivered Video Programming"
[9]https://www.fcc.gov/ecfs/document/10208388924441/1
… Word/PDF linked from under the Recommendations heading:
[10]https://www.fcc.gov/audio-description
… most relevant, the section at the end on "Potential
Opportunities in the Audio Description Ecosystem for
Participants and the
… Commission" covers recommendations like:
… - Encourage vendors to provide and content creators to
request AD scripts with timestamps in addition to the AD audio
files.
… - Encourage vendors to deliver these unmixed [AD] audio files
to stakeholders.
[9] https://www.fcc.gov/ecfs/document/10208388924441/1
[10] https://www.fcc.gov/audio-description
Meeting Close
nigel: Thank you everyone, very interesting discussion points,
we're out of time [adjourns meeting]
Minutes manually created (not a transcript), formatted by
[11]scribe.perl version 229 (Thu Jul 25 08:38:54 2024 UTC).
[11] https://w3c.github.io/scribe2/scribedoc.html
Received on Thursday, 26 September 2024 18:35:30 UTC