Minutes from today's AD CG meeting from Nigel Megitt on 2018-10-25 (public-audio-description@w3.org from October 2018)

From: Nigel Megitt <nigel.megitt@bbc.co.uk>
Date: Thu, 25 Oct 2018 12:20:14 +0000
To: "public-audio-description@w3.org" <public-audio-description@w3.org>
Message-ID: <D7F7818A.6BA60%nigel.megitt@bbc.co.uk>

Thanks to all who were able to attend today's face to face (and webex) meeting of the Audio Description Community Group.

Minutes can be found in HTML format at https://www.w3.org/2018/10/25-ad-minutes.html

In text format:

Audio Description Community Group

25 Oct 2018

See also: [2]IRC log

[2] https://www.w3.org/2018/10/25-ad-irc

Attendees

Present
Nigel_Megitt, Marisa_Demeglio, Eric_Carlson,
Andreas_Tai, Masayoshi_Onishi, Matt_Simpson,
Mark_Watson, Francois_Beaufort

Regrets
John_Birch

Chair
Nigel

Scribe
nigel

Contents

* Topics
1. Introductions
2. Current and future status
3. Requirements
4. TTML2 in more detail
5. Proposed Solution
6. Implementation Experience
7. Roles, Tools, Timelines, Next Steps
8. Discussion and close
* Summary of Action Items
* Summary of Resolutions
__________________________________________________________

Introductions

Nigel: Welcome everyone to the first face to face meeting of
the AD CG.
... Run through of agenda

[14]Slides

[14] https://www.w3.org/community/audio-description/files/2018/10/AD-CG-F2F-2018-10-25.pdf

Nigel: In the room we have:
... Nigel Megitt (BBC)

marisa: Marisa Demeglio (DAISY consortium), in the Publishing
WG and interested in accessibility

ericc: Eric Carlson (Apple), on the WebKit team, mostly working
on media on the web, and
... of course very interested in accessibility solutions.

Andreas: Andreas Tai (IRT), mainly working on subtitles and
captions but also looking at other
... accessibility topics. Unfortunately no resources yet to
dedicate time to this, but interested
... in the status.

onishi: Onishi (NHK). NHK runs 4K and 8K broadcast services,
which use TTML. I'd like
... to research use cases for TTML.

Matt: Matt Simpson (Red Bee), Head of Portfolio for Access
Services, probably one of the
... biggest producers of audio description by volume for a
number of clients around the world.

Nigel: Thank you all

Current and future status

Nigel: AD CG set up earlier in the year, we have a repo, an
Editor, and participants.
... Goal: Get to good enough for Rec Track, add to TTWG Charter
1st half 2019

marisa: Timeline for TTML2?

Nigel: TTML2 is in Proposed Rec status, the TTWG is targeting
Rec publication on 13th November.
... The AC poll is open until 1st November. Please vote if you
haven't already!

Requirements

Nigel: Goal: To create an open standard exchange format to
support audio description all the way from scripting to mixing.

ericc: You should look at what 3PlayMedia has.

Nigel: Thanks I will
... Are they delivering accessible text versions of AD?

ericc: Yes, both AD and extended, both pre-recorded and
synthetic text, and they have
... a JavaScript-based plug-in that works in modern browsers.

Nigel: That sounds great, I didn't know about that, thank you.

ericc: I haven't played with it much but it seems to work quite
well.

marisa: When you talk about an accessible text what makes it
accessible?

Nigel: It's delivered as text and the player can present it in
an ARIA live region so that
... accessibility tools can pick it up.

marisa: And TTML makes that happen?

Nigel: It needs the player to make it happen.
... Existing Requirements - I published a wiki page of
requirements a while back.

[15]AD requirements

[15] https://github.com/w3c/ttml2/wiki/Audio-Description-Requirements

Nigel: Those requirements got some feedback which led to
changes.
... In particular to relate them to the W3C MAUR requirements,
which they align with.

<marisa>
[16]https://github.com/w3c/ttml2/wiki/Audio-Description-Requirements

[16] https://github.com/w3c/ttml2/wiki/Audio-Description-Requirements

Nigel: Those requirements describe the process that the
document needs to support
... but not the specifics of what the document itself needs to
support.
... I've done a first pass review, the main body of the spec
work would be to validate that
... those TTML2 feature designators are the correct set.

<ericc>
[17]https://www.w3.org/community/audio-description/files/2018/10/AD-CG-F2F-2018-10-25.pdf

[17] https://www.w3.org/community/audio-description/files/2018/10/AD-CG-F2F-2018-10-25.pdf

Nigel: In looking at those requirements I thought there were
some constraints to consider.
... Two questions from me:
... 1. Do we ever need to be able to have more than one
“description” active at the same time?

Matt: I can't see a reason for needing this - it would have to
be a variation of the primary language.
... Multiple localised versions might be needed.
... I imagine that would be a single track per file.
... Yes, interesting thought.

marisa: A variation on a use case, if you have a deaf-blind
user who is following the
... captions they also need the information from the
description and the captions.

markw: They would have both description and captions available
at the same time.

Nigel: Assumptions on my part:
... Separate AD and captions files
... No AD over dialogue so not a significant issue of overlap

marisa: If viewer needs to pause AD to read it on a braille
display...

Nigel: My assumption: that would also pause media.

ericc: [nods]

marisa: That's the trickiest use case I can think of

Nigel: Me too

atai: I'm not sure if immersive environments are in scope.
... A European project that IRT is involved with is exploring
requirements for AD in 360° videos.
... I'm not sure if they implemented it, but one idea is to
have some parts of the AD only
... activated if the user looks in a certain direction, so if
this is happening in one document
... then there would be certain AD parts with the same timing
but maybe not active at
... the same time.

marisa: Great use case!
... A deaf-blind user in a 360° video is now the trickiest use
case in the world I can think of!

ericc: That means in addition to a time range, in the case of a
360° video you may also
... want to have an additional selector for the viewport in
which it is active.

markw: Or the location of the object it is associated with.

atai: This is very similar to the subtitle use case we showed
before where you stick
... subtitles to a location. You need the same location
information for AD.

markw: The user could have selections about the width of the
viewport they want.
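
A minimal sketch of the selection discussed above, in which a
description carries a viewing range as well as a time range. The
Description class, its azimuth field and the helper names are
hypothetical and are not taken from TTML2 or from any project
mentioned in the meeting.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Description:
    text: str
    begin: float  # seconds on the media timeline
    end: float
    # Hypothetical field: horizontal viewing range in degrees (start, end).
    # None means the description is active whatever the viewing direction.
    azimuth: Optional[Tuple[float, float]] = None


def in_azimuth(view_deg: float, rng: Tuple[float, float]) -> bool:
    """True if view_deg lies inside rng, allowing ranges that cross 0 degrees."""
    start, end = (a % 360 for a in rng)
    view = view_deg % 360
    if start <= end:
        return start <= view <= end
    return view >= start or view <= end


def active(items: List[Description], t: float, view_deg: float) -> List[Description]:
    """Descriptions whose time range contains t and whose viewing range matches."""
    return [
        d for d in items
        if d.begin <= t < d.end
        and (d.azimuth is None or in_azimuth(view_deg, d.azimuth))
    ]


descriptions = [
    Description("A car pulls up outside.", 10.0, 14.0),
    Description("A dog sleeps by the wall.", 10.0, 14.0, azimuth=(240, 300)),
    Description("A door closes.", 10.0, 14.0, azimuth=(330, 30)),
]
```

All three descriptions share the same timing, but at most two are
selected for any one viewing direction, which matches the case
atai raised.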

Nigel: That's a great use case - can I suggest it's a v2 thing
based on the solution for
... subtitles, which we also don't know yet?

atai: I agree the solution for subtitles should apply here.
That makes sense, but it would be
... good to discuss it and understand the dependencies.
... I will check with the people working on this. I don't know
any technical group working
... on audio description so it would be a good forum for
working on requirements.
... If they want to contribute something they can post it on
the CG reflector.

Nigel: Good plan.
... Summarising, I don't think I've heard any requirement for
multiple descriptions to be
... active at the same time, within a single language.
... My next constraint question is:
... Do we need to set media time ranges (clipBegin and clipEnd)
on embedded audio?
... TTML2 allows audio to be embedded, but in our
implementation work we hit a snag.
... Applying media fragment URIs to a data URL is tricky.
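
For an external audio file, a clip range can be expressed with the
temporal dimension of a media fragment URI (#t=begin,end, in
seconds). The sketch below assumes that mapping from clipBegin and
clipEnd. The function names are hypothetical, and the sketch
declines to handle data URLs rather than guess at a behaviour.

```python
def _npt(seconds: float) -> str:
    """Format seconds with up to millisecond precision, without trailing zeros."""
    return f"{seconds:.3f}".rstrip("0").rstrip(".")


def with_clip(url: str, clip_begin: float, clip_end: float) -> str:
    """Append a temporal media fragment (#t=begin,end) to an audio URL."""
    if not 0 <= clip_begin < clip_end:
        raise ValueError("clipBegin must be >= 0 and earlier than clipEnd")
    if url.startswith("data:"):
        # The tricky case noted in the minutes: this sketch does not attempt it.
        raise ValueError("data URLs are not handled by this sketch")
    base = url.split("#", 1)[0]  # drop any existing fragment
    return f"{base}#t={_npt(clip_begin)},{_npt(clip_end)}"
```

For example, with_clip("https://example.com/ad.mp3", 12.5, 18)
gives "https://example.com/ad.mp3#t=12.5,18".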

ericc: Embedding audio as text is a terrible idea.

markw: Any reason other than the amount of data?

ericc: You have to keep the text and the decoded audio in
memory at the same time,
... which is additional overhead.
... Technically it should be straightforward to seek to a
point.
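
A rough illustration of the overhead ericc describes, assuming the
embedded audio is carried as base64 text (an assumption; the
minutes only say "as text"). The function name is hypothetical.

```python
import base64


def embedded_cost(audio: bytes) -> dict:
    """Sizes in bytes when base64 text and decoded audio are both held in memory."""
    text = base64.b64encode(audio)
    return {
        "decoded": len(audio),
        "text": len(text),
        "both": len(audio) + len(text),
    }
```

Base64 produces four characters for every three bytes, so 3000
bytes of audio become 4000 characters of text, and a player
holding both keeps 7000 bytes for that one clip.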

marisa: I don't want to implement it!

ericc: It's terrible.

atai: Is it then open to debate whether to leave out this
feature of embedded audio?

Nigel: I think so, yes, the result would be that distribution
of recorded audio would have
... to be additional files alongside the TTML2 file. That has
an asset management impact,
... but it also seems like good practice.

ericc: High level question: I talked with Ken Harrenstien, who
does YouTube captions, last week,
... and he told me about 3PlayMedia. He said that from their
research and from talking to
... users of audio descriptions and from talking to 3PlayMedia,
it was his understanding that
... many users of audio descriptions prefer speech synthesis to
pre-recorded because
... partly it allows them to set the speed like they're used to
doing with screen readers
... and it made extended audio descriptions less disruptive
because it reduces the likelihood
... of interrupting playback of the main resource. I wonder if
you have heard that too. If
... it is true, it seems that there should be information in a
spec helping people who make
... these to make the right kind.

Nigel: TTML2 supports text to speech, and also players can
switch off the audio
... and expose the text to screen readers instead to allow the
user's screen reader to take
... over.

marisa: I've heard that most screen readers speed up the
speech.

markw: I've heard that speeding up works better with
synthesised speech.

marisa: Of course if there's no language support for text to
speech then you may still
... need pre-recorded audio.

atai: You may need to know how long the text to speech will
take in order to author the rate correctly.

Nigel: There's a whole other world of pain in terms of
distributability of web voices for text to speech.

ericc: I think the requirement is that the player pauses to
allow for completion of the
... audio description, so it doesn't matter how long it takes.

marisa: What if you're switching language of AD and some are
more verbose than others?

ericc: Yes, as long as the description accurately identifies
the section of the media file
... that it describes then it is easy enough for the player to
take care of, or at least it is the
... player's responsibility.

markw: The player could do other things like tweaking the
playback speed to fit.

ericc: The Web Speech API doesn't allow access to predicting
the duration of the speech.
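
The two player behaviours mentioned above, pausing the media and
adjusting speed to fit, can be combined in a small calculation.
This is a sketch under assumptions: "speed" is read here as the
speech rate, the 1.5 cap is arbitrary, and the spoken duration has
to be supplied by the caller (measured or estimated), given
ericc's point that it cannot be predicted through the Web Speech
API. The function name is hypothetical.

```python
from typing import Tuple


def plan_description(gap_s: float, speech_s: float, max_rate: float = 1.5) -> Tuple[float, float]:
    """Return (speech_rate, media_pause_s) for one description.

    gap_s    -- length of the media time range the description applies to
    speech_s -- time the description takes to speak at normal rate
    max_rate -- fastest speech rate the user accepts (assumed preference)
    """
    if gap_s <= 0 or speech_s <= 0 or max_rate < 1.0:
        raise ValueError("durations must be positive and max_rate >= 1.0")
    # Speed up only as much as needed, and never beyond the user's limit.
    rate = min(max(speech_s / gap_s, 1.0), max_rate)
    spoken_s = speech_s / rate
    # Whatever still does not fit is absorbed by pausing the media.
    pause_s = max(spoken_s - gap_s, 0.0)
    return rate, pause_s
```

A description that fits is left alone, one that slightly overruns
is sped up, and one that overruns badly is sped up to the limit
and the media is paused for the remainder.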

atai: Is player behaviour in scope for this document?

ericc: Absolutely.
... It seems to me that it is because if you don't describe the
behaviour of the player you
... are going to get different incompatible or
non-interoperable implementations and that
... is an anti-goal.

markw: You want to describe the space of possible player
behaviours, we just need to
... provide the information.

ericc: Yes, give guidelines to help implementers do the right
thing, and people who create the descriptions.

Nigel: I agree, this is somewhat informative relative to the
document format, but for example
... our UX people suggested that users would want to direct AD
text to a screen reader
... and switch off audio presentation sometimes, or at least be
able to select that.

marisa: Maybe have both audio and braille display to check
spellings or do some other text-related processing.

Nigel: Yes
... In terms of user preference for synthesised or pre-recorded
speech, one data point
... I learned recently is that the intelligibility of
synthesised speech degrades more quickly
... in the presence of ambient sounds than human speech. The
reasons are not clear.

markw: Suggests that some users would want to receive the AD in
a separate earpiece
... from other audience members watching the same programme.

Matt: I think this is like dubbing vs subtitling, there may be
cultural reasons for preferences.
... Our experience is it is harder to automate variable reading
rate descriptions, and we find
... that invaluable to squeeze a description into a short
period or let it "breathe".
... It's probably down to historical experience.

fbeaufort: I work at Google on the developer relations team.

Nigel: Any other constraints or requirements?

group: [silence]

TTML2 in more detail

Nigel: [slide on Audio Model]
... I just added this to try to explain because I've found it
can be tricky to get across to developers
... that there is an analogy with HTML/CSS and the audio model
in TTML.

markw: Players may or may not do this based on user preference,
if for example someone
... is listening on a headset and there's main programme audio
in the room the mixing
... preferences might change.
... [slide on the Web Audio Graph]
... This allows the audio mixing to happen with all the options
that are needed in general
... in TTML2 - it may be that we only exercise a part of that
solution space.
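
As a simplified picture of the mixing described above, the sketch
below attenuates the main programme audio while a description is
playing and sums the two signals. It works on plain lists of
samples rather than on a Web Audio graph, and the gain values and
function name are illustrative assumptions.

```python
from typing import List, Sequence


def mix(main: Sequence[float], description: Sequence[float],
        main_gain: float = 0.3, description_gain: float = 1.0) -> List[float]:
    """Mix two sample sequences at the same sample rate (floats in [-1, 1]).

    The main programme is attenuated only for samples where the
    description is present; the result is clamped to [-1, 1].
    """
    out = []
    for i, m in enumerate(main):
        if i < len(description):
            s = m * main_gain + description[i] * description_gain
        else:
            s = m
        out.append(max(-1.0, min(1.0, s)))
    return out
```

In a browser the same effect would come from gain nodes in the
audio graph, with the gain values chosen by the author or by user
preference.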

Proposed Solution

Nigel: The solution that I'm proposing is a profile of TTML2
... [slide for Profile of TTML2]

ericc: Also add that a UI should be provided for controlling
the speed of audio descriptions

Nigel: Yes
... The other things on this slide we already discussed.
... Is anyone thinking this is a great problem to solve but it
should look completely different?

ericc: Is it a goal to define a guide for how this should work
in a web browser?

Nigel: The TTML2 features are done in terms of Web Audio, Web
Speech etc. so yes.
... The mixing might happen server side but the client side
mixing options allow for a better
... range of accessible experiences.

ericc: It seems to me that a really detailed guide to
implementation would be the most useful thing.
... An explicit goal should be to help producers to create
content in the right way, but also
... to help people that want to deliver that to know how to
make it available to the people that need it.
... Not distribution, the playback experience.
... Nicely constructed audio descriptions are not useful unless
the people that need them are
... able to consume them.

Nigel: [nods]

atai: It might be interesting to identify what is missing to
get a good implementation in a browser
... environment.
... It might be interesting to hear how much browser
communities are interested in that
... case. A possible way to do this would be to implement a
JavaScript polyfill or something.
... I'm not sure how much interest there is in native support.

ericc: Both are extremely useful. I don't know anything about
3PlayMedia but they have
... a JavaScript-based player that uses a text to speech API so
we know that it is possible.
... Theirs is a commercial solution. We should have a
description of ...
... and as a data point I was at a conference last week about
media in the web and this was
... one of the breakouts, audio descriptions and extended audio
descriptions.
... It was well attended and people in the room were very
interested in coming up with a
... solution that browsers could implement natively.

Nigel: I'd love to be in touch with those people.

Implementation Experience

Nigel: BBC implemented a prototype to support TTML2 Rec track
work

[18]BBC implementation

[18] https://bbc.github.io/Adhere/

Nigel: The point here is that it is possible to do this with
current browser technologies,
... even if there are some minor issues that I should raise as
issues, like on Web Speech.
... Question: Any other implementation work, or people who
would like to do that at this time?

marisa: I would say no, we don't have the bandwidth but I'm
keeping my eye on this for
... the long term. The use cases come up all the time from the
APA group. I think it is
... on the horizon, but I can't commit to anything on the same
timeline as this spec.

atai: Does BBC plan to publish this software as a reference
implementation?

Nigel: I would say first we should publish as open source, and
then allow for some
... scrutiny, and if people agree it's at that level then
great. I don't think it is now.
... It would need more work.

atai: The question is if the BBC could be motivated to provide
it as a reference
... implementation. It would help if you have a complete
reference implementation.

Nigel: I would like to, but I don't think the code is good
enough yet.
... I'm interested in other implementations too, for example it
is possible that some
... participants in AD CG might make authoring tools.

ericc: You should talk to 3Play also.

Nigel: Yes, I will. It'd be great if they would join us here.

Roles, Tools, Timelines, Next Steps

Nigel: In terms of tools, we have a GitHub repo w3c/adpt
... We have the reflector, and EBU has kindly offered to
facilitate web meetings with their WebEx.
... [Next steps slide]

atai: Regarding the next steps, to move over to WG and Rec
track, does it necessarily have
... to end up in the TTWG? Could it be another group?
... Could it be somewhere else?
... To make sure the right set of people are involved.

Nigel: I'm not dogmatic about this - it seems like the home of
TTML is a good place for
... profiles of TTML, but if there's a better chance of getting
to Rec doing it somewhere else
... then I don't mind where it happens.

atai: One other idea: when the TTML2 feature set is there it
may be useful to have a
... gap listing relative to IMSC 1.1 so that if people want to
reuse implementations and
... start from IMSC 1.1 rather than TTML2 then they can see
what they already have.

ericc: Or which features they prefer not to use.

Nigel: Because they had implementation difficulty?

ericc: Yes, for example someone targeting IMSC 1.1 support, if
you list the features that
... are only supported in one and not the other, it could
inform.
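
The gap listing suggested above amounts to a set comparison
between two lists of feature designators. In this sketch the
designator names are illustrative placeholders only; they are not
the actual contents of the AD profile or of IMSC 1.1, and the
function name is hypothetical.

```python
from typing import Dict, List, Set


def gap_listing(profile: Set[str], baseline: Set[str]) -> Dict[str, List[str]]:
    """Compare a profile's feature set with a baseline implementation's."""
    return {
        "already_supported": sorted(profile & baseline),
        "to_implement": sorted(profile - baseline),
        "not_needed": sorted(baseline - profile),
    }


# Placeholder feature sets, for illustration only.
ad_profile = {"#timing", "#audio", "#gain", "#speak"}
existing_implementation = {"#timing", "#layout", "#styling"}
listing = gap_listing(ad_profile, existing_implementation)
```

With these placeholder sets only "#timing" is shared, which is in
line with the expectation that the common core is mostly timing.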

Nigel: Of course the significant features in IMSC are about
visual presentation and here
... we are interested in audio features, so the common core of
timing is all that's really left.

Discussion and close

Nigel: We've had good discussion all the way through, so thank
you everyone.

ericc: Defining this using those TTML2 features is interesting
and it's good.
... It sets a fairly high bar to implement.

Nigel: It took a couple of weeks to implement.

ericc: It makes me wonder if it would be possible to have
something that is more like a
... minor variation in a caption format.

Nigel: I think that's what this is.

ericc: Except for the ability to embed audio.

Nigel: That maybe took about half a day to implement. We could
remove it from scope.

atai: It would be good to know what problems there are bringing
this to a browser environment.

ericc: That's true. At the most basic it seems that what we
have is some text and a range
... of time that it applies to in another file.
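
The basic case ericc describes, some text plus the time range it
applies to, can be read from a TTML document with a few lines of
code. This sketch makes several assumptions: the sample document
is invented, only offset times in seconds such as "12.5s" are
handled, and nested timing, styling and audio features are
ignored. The helper names are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

TT = "{http://www.w3.org/ns/ttml}"  # TTML namespace

SAMPLE = """<tt xmlns="http://www.w3.org/ns/ttml" xml:lang="en">
  <body>
    <div>
      <p begin="5s" end="9s">A car pulls up outside the house.</p>
      <p begin="12.5s" end="15s">She waves from the window.</p>
    </div>
  </body>
</tt>"""


def seconds(value: str) -> float:
    """Parse an offset time in seconds, e.g. "12.5s" (the only form handled)."""
    if not value.endswith("s") or value.endswith("ms"):
        raise ValueError(f"unsupported time expression: {value!r}")
    return float(value[:-1])


def read_descriptions(ttml: str) -> List[Tuple[float, float, str]]:
    """Return (begin, end, text) for every p element, in document order."""
    root = ET.fromstring(ttml)
    return [
        (seconds(p.get("begin")), seconds(p.get("end")), "".join(p.itertext()).strip())
        for p in root.iter(TT + "p")
    ]
```

Calling read_descriptions(SAMPLE) yields two entries, each with a
begin time, an end time and the description text, which is enough
for a player that speaks the text or passes it to a screen reader.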

Nigel: I'm thinking of high production values where detailed
audio mixing is needed.

ericc: Is that something we need for the web?

Nigel: I am aiming for a single open standard file format that
content producers can use
... all the way through from content creation to broadcast and
web use.

Matt: I would agree.

markw: Thinking about our chain, we create premixed versions
and they seem quite high
... quality, so this might be worth considering.

atai: Thinking about the history of TTML, it started out as an
authoring format and then
... began to be used for distribution and playback, which led
to IMSC. I understand the
... purpose for one file for the whole chain, that's perfect,
it's ideal, we should just avoid the
... pitfalls.

ericc: If the goal is to have native implementation in a
browser it may be worth looking
... at the complexity with that goal in mind.
... If it is not a goal then that's fine, but if it is then
keep that goal in mind.

Nigel: I am not sure. It can be done with a polyfill but would
browser makers like to support
... the primitives to allow that or to implement it natively?

atai: The playback experience would be better natively.

fbeaufort: If the playback was the same would you still want
native implementation?

Nigel: It would be great to avoid sending polyfill js to every
page in that case, and it would
... make adoption easier if the page author just had to include
a track in the video element
... and then it would play.

ericc: Your polyfill is about 50KB of unminified uncompressed
js so it's not very big.

Nigel: Thank you everyone! [adjourns meeting]

Summary of Action Items

Summary of Resolutions

[End of minutes]
__________________________________________________________

Minutes manually created (not a transcript), formatted by
David Booth's [19]scribe.perl version 1.154 ([20]CVS log)
$Date: 2018/10/25 12:16:44 $

[19] http://dev.w3.org/cvsweb/~checkout~/2002/scribe/scribedoc.htm

[20] http://dev.w3.org/cvsweb/2002/scribe/

Received on Thursday, 25 October 2018 12:20:46 UTC