W3C home > Mailing lists > Public > public-audio-description@w3.org > October 2018

Minutes from today's AD CG meeting

From: Nigel Megitt <nigel.megitt@bbc.co.uk>
Date: Thu, 25 Oct 2018 12:20:14 +0000
To: "public-audio-description@w3.org" <public-audio-description@w3.org>
Message-ID: <D7F7818A.6BA60%nigel.megitt@bbc.co.uk>
Thanks to all who were able to attend today's face-to-face (and WebEx) meeting of the Audio Description Community Group.

Minutes can be found in HTML format at https://www.w3.org/2018/10/25-ad-minutes.html

In text format:



                   Audio Description Community Group

25 Oct 2018

   See also: [2]IRC log

      [2] https://www.w3.org/2018/10/25-ad-irc


   Present
          Nigel_Megitt, Marisa_Demeglio, Eric_Carlson,
          Andreas_Tai, Masayoshi_Onishi, Matt_Simpson,
          Mark_Watson, Francois_Beaufort





     * [3]Topics
         1. [4]Introductions
         2. [5]Current and future status
         3. [6]Requirements
         4. [7]TTML2 in more detail
         5. [8]Proposed Solution
         6. [9]Implementation Experience
         7. [10]Roles, Tools, Timelines, Next Steps
         8. [11]Discussion and close
     * [12]Summary of Action Items
     * [13]Summary of Resolutions

   <scribe> scribe: nigel

Introductions


   Nigel: Welcome everyone to the first face to face meeting of
   the AD CG.
   ... Run through of agenda


     [14] https://www.w3.org/community/audio-description/files/2018/10/AD-CG-F2F-2018-10-25.pdf

   Nigel: In the room we have:
   ... Nigel Megitt (BBC)

   marisa: Marisa Demeglio (DAISY consortium), in the Publishing
   WG and interested in accessibility

   ericc: Eric Carlson (Apple), on the Webkit team, mostly working
   on media in the web, and
   ... of course very interested in accessibility solutions.

   Andreas: Andreas Tai (IRT), mainly work on subtitles and
   captions and also look at other
   ... accessibility areas. Unfortunately we don't yet have
   resources for dedicating time to this, but interested
   ... in the status.

   onishi: Onishi (NHK). NHK runs 4K and 8K broadcast services
   and these use TTML. I'd like
   ... to research use cases for TTML.

   Matt: Matt Simpson (Red Bee), Head of Portfolio for Access
   Services, probably one of the
   ... biggest producers of audio description by volume for a
   number of clients around the world.

   Nigel: Thank you all

Current and future status

   Nigel: AD CG set up earlier in the year, we have a repo, an
   Editor, and participants.
   ... Goal: Get to good enough for Rec Track, add to TTWG Charter
   1st half 2019

   marisa: Timeline for TTML2?

   Nigel: TTML2 is in Proposed Rec status, the TTWG is targeting
   Rec publication on 13th November.
   ... The AC poll is open until 1st November. Please vote if you
   haven't already!

Requirements


   Nigel: Goal: To create an open standard exchange format to
   support audio description all the way from scripting to mixing.

   ericc: You should look at what 3PlayMedia has.

   Nigel: Thanks I will
   ... Are they delivering accessible text versions of AD?

   ericc: Yes, both AD and extended, both pre-recorded and
   synthetic text, and they have
   ... a javascript based plug-in that works in modern browsers.

   Nigel: That sounds great, I didn't know about that, thank you.

   ericc: I haven't played with it much but it seems to work quite
   well.

   marisa: When you talk about an accessible text what makes it
   accessible?

   Nigel: It's delivered as text and the player can present it in
   an aria live region so that
   ... accessibility tools can pick it up.
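   The delivery model described here can be sketched in player terms:
   select the description cues active at the current media time and push
   their text into an aria-live region so that assistive technology
   announces it. A minimal illustration only; the cue shape and function
   name below are assumptions, not part of any spec.

```javascript
// Return the description cues active at a given media time.
// A "cue" is assumed here to be { begin, end, text } in seconds.
function activeCues(cues, mediaTime) {
  return cues.filter(c => mediaTime >= c.begin && mediaTime < c.end);
}

// In a browser player one might then do (sketch, browser only):
//   const region = document.createElement("div");
//   region.setAttribute("aria-live", "polite");
//   video.addEventListener("timeupdate", () => {
//     const texts = activeCues(cues, video.currentTime).map(c => c.text);
//     region.textContent = texts.join(" ");
//   });
```

   Using "polite" rather than "assertive" lets the screen reader finish
   what it is saying before announcing the description.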

   marisa: And TTML makes that happen?

   Nigel: It needs the player to make it happen.
   ... Existing Requirements - I published a wiki page of
   requirements a while back.

   [15]AD requirements

     [15] https://github.com/w3c/ttml2/wiki/Audio-Description-Requirements

   Nigel: Those requirements got some feedback which led to changes,
   ... in particular to relate them to the W3C MAUR requirements,
   which they align with.




   Nigel: Those requirements describe the process that the
   document needs to support
   ... but not the specifics of what the document itself needs to
   contain.
   ... I've done a first pass review; the main body of the spec
   work would be to validate that
   ... those TTML2 feature designators are the correct set.




   Nigel: In looking at those requirements I thought there were
   some constraints to consider.
   ... Two questions from me:
   ... 1. Do we ever need to be able to have more than one
   “description” active at the same time?

   Matt: I can't see a reason for needing this - it would have to
   be a variation of the primary language.
   ... Multiple localised versions might be needed.
   ... I imagine that would be a single track per file.
   ... Yes, interesting thought.

   marisa: A variation on a use case, if you have a deaf-blind
   user who is following the
   ... captions they also need the information from the
   description and the captions.

   markw: They would have both description and captions available
   at the same time.

   Nigel: Assumptions on my part:
   ... Separate AD and captions files
   ... No AD over dialogue so not a significant issue of overlap

   marisa: If the viewer needs to pause AD to read it on a braille
   display?

   Nigel: My assumption: that would also pause media.

   ericc: [nods]

   marisa: That's the trickiest use case I can think of

   Nigel: Me too

   atai: I'm not sure if immersive environments are in scope.
   ... A European project that IRT is involved with is exploring
   requirements for AD in 360º videos.
   ... I'm not sure if they implemented it, but one idea is to
   have some parts of the AD only
   ... activated if the user looks in a certain direction, so if
   this is happening in one document
   ... then there would be certain AD parts with the same timing
   but maybe not active at
   ... the same time.

   marisa: Great use case!
   ... Now a deaf-blind user in a 360º video is the trickiest use
   case in the world I can think of!

   ericc: That means in addition to a time range, in the case of a
   360º video you may also
   ... want to have an additional selector for the viewport in
   which it is active.

   markw: Or the location of the object it is associated with.

   atai: This is very similar to the subtitle use case we showed
   before where you stick
   ... subtitles to a location. You need the same location
   information for AD.

   markw: The user could have selections about the width of the
   viewport they want.

   Nigel: That's a great use case - can I suggest it's a v2 thing
   based on the solution for
   ... subtitles, which we also don't know yet?

   atai: I agree the solution for subtitles should apply here.
   That makes sense, but it would be
   ... good to discuss it and understand the dependencies.
   ... I will check with the people working on this. I don't know
   any technical group working
   ... on audio description so it would be a good forum for
   working on requirements.
   ... If they want to contribute something they can post it on
   the CG reflector.

   Nigel: Good plan.
   ... Summarising, I don't think I've heard any requirement for
   multiple descriptions to be
   ... active at the same time, within a single language.
   ... My next constraint question is:
   ... Do we need to set media time ranges (clipBegin and clipEnd)
   on embedded audio?
   ... TTML2 allows audio to be embedded, but in our
   implementation work we hit a snag.
   ... applying media fragment URIs to a data URL is tricky.
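   The snag can be made concrete: a temporal media fragment such as
   #t=3,7.5 names a time range, but applying it to a data: URL is
   awkward, whereas a player that has decoded the audio can simply start
   playback at an offset. A rough sketch; the parser below handles only
   plain npt seconds and is not a complete Media Fragments
   implementation.

```javascript
// Parse a simple temporal media fragment like "#t=3,7.5" or "#t=3"
// into begin/end seconds. Returns null if no temporal fragment found.
function parseTemporalFragment(fragment) {
  const m = /#t=(?:npt:)?([\d.]+)(?:,([\d.]+))?/.exec(fragment);
  if (!m) return null;
  return {
    begin: parseFloat(m[1]),
    end: m[2] !== undefined ? parseFloat(m[2]) : Infinity,
  };
}

// With Web Audio, the clipped range can then be applied at playback
// time, sidestepping the data URL problem (browser only):
//   const src = ctx.createBufferSource();
//   src.buffer = decodedAudio;
//   src.start(0, clip.begin, clip.end - clip.begin);
```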

   ericc: Embedding audio as text is a terrible idea.

   markw: Any reason other than the amount of data?

   ericc: You have to keep the text and the decoded audio in
   memory at the same time,
   ... which is additional overhead.
   ... Technically it should be straightforward to seek to a
   point in the audio.

   marisa: I don't want to implement it!

   ericc: It's terrible.

   atai: Is it then debatable to leave out this feature of
   embedded audio?

   Nigel: I think so, yes, the result would be that distribution
   of recorded audio would have
   ... to be additional files alongside the TTML2 file. That has
   an asset management impact,
   ... but it also seems like good practice.

   ericc: High level question: I talked with Ken Harenstein who
   does YouTube captions, last week,
   ... and he told me about 3PlayMedia. He said that from their
   research and from talking to
   ... users of audio descriptions and from talking to 3PlayMedia,
   it was his understanding that
   ... many users of audio descriptions prefer speech synthesis to
   pre-recorded because
   ... partly it allows them to set the speed like they're used to
   doing with screen readers
   ... and it made extended audio descriptions less disruptive
   because it reduces the likelihood
   ... of interrupting playback of the main resource. I wonder if
   you have heard that too and if
   ... it is true it seems that there should be information in a
   spec helping people who make
   ... these make the right kind.

   Nigel: TTML2 supports text to speech, and also players can
   switch off the audio
   ... and expose the text to screen readers instead to allow the
   user's screen reader to take
   ... over.

   marisa: I've heard that most screen readers speed up the
   speech.

   markw: I've heard it works better speeding up synthesised
   speech than recorded speech.

   marisa: Of course if there's no language support for text to
   speech then you may still
   ... need pre-recorded audio.

   atai: You may need to know how long the text to speech will
   take to author the rate correctly.

   Nigel: There's a whole other world of pain in terms of
   distributability of web voices for text to speech.

   ericc: I think the requirement is that the player pauses to
   allow for completion of the
   ... audio description, so it doesn't matter how long it takes.

   marisa: What if you're switching language of AD and some are
   more verbose than others?

   ericc: Yes, as long as the description accurately identifies
   the section of the media file
   ... that it describes then it is easy enough for the player to
   take care of, or at least it is the
   ... player's responsibility.

   markw: The player could do other things like tweaking the
   playback speed to fit.

   ericc: The Web Speech API doesn't allow access to predicting
   the duration of the speech.

   atai: Is player behaviour in scope for this document?

   ericc: Absolutely.
   ... It seems to me that it is because if you don't describe the
   behaviour of the player you
   ... are going to get different incompatible or
   non-interoperable implementations and that
   ... is an anti-goal.

   markw: You want to describe the space of possible player
   behaviours, we just need to
   ... provide the information.

   ericc: Yes, give guidelines to help implementers do the right
   thing, and people who create the descriptions.

   Nigel: I agree, this is somewhat informative relative to the
   document format, but for example
   ... our UX people suggested that users would want to direct AD
   text to a screen reader
   ... and switch off audio presentation sometimes, or at least be
   able to select that.

   marisa: Maybe have both audio and braille display to check
   spellings or do some other text-related processing.

   Nigel: Yes
   ... In terms of user preference for synthesised or pre-recorded
   speech, one data point
   ... I learned recently is that the intelligibility of
   synthesised speech degrades more quickly
   ... in the presence of ambient sounds than human speech. The
   reasons are not clear.

   markw: Suggests that some users would want to receive the AD in
   a separate earpiece
   ... from other audience members watching the same programme.

   Matt: I think this is like dubbing vs subtitling, there may be
   cultural reasons for preferences.
   ... Our experience is it is harder to automate variable reading
   rate descriptions, and we find
   ... that invaluable to squeeze a description into a short
   period or let it "breathe".
   ... It's probably down to historical experience.

   fbeaufort: I work at Google on the developer relations team.

   Nigel: Any other constraints or requirements?

   group: [silence]

TTML2 in more detail

   Nigel: [slide on Audio Model]
   ... I just added this to try to explain because I've found it
   can be tricky to get across to developers
   ... that there is an analogy with HTML/CSS and the audio model
   in TTML.

   markw: Players may or may not do this based on user preference,
   if for example someone
   ... is listening on a headset and there's main programme audio
   in the room the mixing
   ... preferences might change.
   ... [slide on the Web Audio Graph]
   ... This allows the audio mixing to happen with all the options
   that are needed in general
   ... in TTML2 - it may be that we only exercise a part of that
   solution space.
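   The audio model maps naturally onto a Web Audio graph: programme
   audio and description audio each feed a GainNode, and the programme
   gain is "ducked" while a description plays. Mixing levels are
   commonly specified in decibels, so a dB-to-linear conversion is
   useful; the node wiring in the comments is an illustrative sketch,
   not the BBC implementation.

```javascript
// Convert a gain in decibels to the linear multiplier used by
// Web Audio GainNode.gain (e.g. -6 dB is roughly 0.5).
function dbToLinear(db) {
  return Math.pow(10, db / 20);
}

// Illustrative mixing graph (browser only):
//   const programmeGain = ctx.createGain();
//   const descriptionGain = ctx.createGain();
//   programmeSource.connect(programmeGain).connect(ctx.destination);
//   descriptionSource.connect(descriptionGain).connect(ctx.destination);
//   // Duck the programme by 12 dB while a description is active:
//   programmeGain.gain.value = dbToLinear(-12);
```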

Proposed Solution

   Nigel: The solution that I'm proposing is a profile of TTML2
   ... [slide for Profile of TTML2]

   ericc: Also add that a UI should be provided for controlling
   the speed of audio descriptions

   Nigel: Yes
   ... The other things on this slide we already discussed.
   ... Is anyone thinking this is a great problem to solve but it
   should look completely different?

   ericc: Is it a goal to define a guide for how this should work
   in a web browser?

   Nigel: The TTML2 features are done in terms of Web Audio, Web
   Speech etc. so yes.
   ... The mixing might happen server side but the client side
   mixing options allow for a better
   ... range of accessible experiences.

   ericc: It seems to me that a really detailed guide to
   implementation would be the most useful thing.
   ... An explicit goal should be to help producers to create
   content in the right way, but also
   ... to help people that want to deliver that to know how to
   make it available to the people that need it.
   ... Not distribution, the playback experience.
   ... Nicely constructed audio descriptions are not useful unless
   the people that need them are
   ... able to consume them.

   Nigel: [nods]

   atai: It might be interesting to identify what is missing to
   get a good implementation in a browser
   ... environment.
   ... It might be interesting to hear how much browser
   communities are interested in that
   ... case. A possible way to do this would be to implement a
   javascript polyfill or something
   ... I'm not sure how much interest there is in native support.

   ericc: Both are extremely useful. I don't know anything about
   3PlayMedia but they have
   ... a javascript based player that uses text to speech API so
   we know that it is possible.
   ... Theirs is a commercial solution. We should have a
   ... description of ...
   ... and as a data point I was at a conference last week about
   media in the web and this was
   ... one of the breakouts, audio descriptions and extended
   ... audio descriptions. It was well attended and people in the
   ... room were very
   interested in coming up with a
   ... solution that browsers could implement natively.

   Nigel: I'd love to be in touch with those people.

Implementation Experience

   Nigel: BBC implemented a prototype to support TTML2 Rec track
   progression.

   [18]BBC implementation

     [18] https://bbc.github.io/Adhere/

   Nigel: The point here is that it is possible to do this with
   current browser technologies,
   ... even if there are some minor issues that I should raise as
   issues, like on Web Speech.
   ... Question: Any other implementation work, or people who
   would like to do that at this time?

   marisa: I would say no, we don't have the bandwidth but I'm
   keeping my eye on this for
   ... the long term. The use cases come up all the time from the
   APA group. I think it is
   ... on the horizon, but I can't commit to anything on the same
   timeline as this spec.

   atai: Does BBC plan to publish this software as a reference
   implementation?

   Nigel: I would say first we should publish as open source, and
   then allow for some
   ... scrutiny, and if people agree it's at that level then
   great. I don't think it is now.
   ... It would need more work.

   atai: The question is if the BBC could be motivated to provide
   it as a reference
   ... implementation. It would help if you have a complete
   reference implementation.

   Nigel: I would like to, but I don't think the code is good
   enough yet.
   ... I'm interested in other implementations too, for example it
   is possible that some
   ... participants in AD CG might make authoring tools.

   ericc: You should talk to 3Play also.

   Nigel: Yes, I will. It'd be great if they would join us here.

Roles, Tools, Timelines, Next Steps

   Nigel: In terms of tools, we have a GitHub repo w3c/adpt
   ... We have the reflector, and EBU has kindly offered to
   facilitate web meetings with their WebEx.
   ... [Next steps slide]

   atai: Regarding the next steps, to move over to WG and Rec
   track, does it necessarily have
   ... to end up in the TTWG? Could it be another group?
   ... Could it be somewhere else?
   ... To make sure the right set of people are involved.

   Nigel: I'm not dogmatic about this - it seems like the home of
   TTML is a good place for
   ... profiles of TTML, but if there's a better chance of getting
   to Rec doing it somewhere else
   ... then I don't mind where it happens.

   atai: One other idea: when the TTML2 feature set is there it
   may be useful to have a
   ... gap listing relative to IMSC 1.1 so that if people want to
   reuse implementations and
   ... start from IMSC 1.1 rather than TTML2 then they can see
   what they already have.

   ericc: Or which features they prefer not to use.

   Nigel: Because they had implementation difficulty?

   ericc: Yes, for example someone targeting IMSC 1.1 support, if
   you list the features that
   ... are only supported in one and not the other, it could
   ... help.

   Nigel: Of course the significant features in IMSC are about
   visual presentation and here
   ... we are interested in audio features, so the common core of
   timing is all that's really left.

Discussion and close

   Nigel: We've had good discussion all the way through, so thank
   you everyone.

   ericc: Defining this using those TTML2 features is interesting
   and it's good.
   ... It sets a fairly high bar to implement.

   Nigel: It took a couple of weeks to implement.

   ericc: It makes me wonder if it would be possible to have
   something that is more like a
   ... minor variation in a caption format.

   Nigel: I think that's what this is.

   ericc: Except for the ability to embed audio.

   Nigel: That maybe took about half a day to implement. We could
   remove it from scope.

   atai: It would be good to know what problems there are bringing
   this to a browser environment.

   ericc: That's true. At the most basic it seems that what we
   have is some text and a range
   ... of time that it applies to in another file.

   Nigel: I'm thinking of high production values where detailed
   audio mixing is needed.

   ericc: Is that something we need for the web?

   Nigel: I am aiming for a single open standard file format that
   content producers can use
   ... all the way through from content creation to broadcast and
   web use.

   Matt: I would agree.

   markw: Thinking about our chain, we create premixed versions
   and they seem quite high
   ... quality, so this might be worth considering.

   atai: Thinking about the history of TTML, it started out as an
   authoring format and then
   ... began to be used for distribution and playback, which led
   to IMSC. I understand the
   ... purpose for one file for the whole chain, that's perfect,
   it's ideal, we should just avoid the
   ... pitfalls.

   ericc: If the goal is to have native implementation in a
   browser it may be worth looking
   ... at the complexity with that goal in mind.
   ... If it is not a goal then that's fine, but if it is then
   keep that goal in mind.

   Nigel: I am not sure. It can be done with a polyfill but would
   browser makers like to support
   ... the primitives to allow that or to implement it natively?

   atai: The playback experience would be better natively.

   fbeaufort: If the playback was the same would you still want
   native implementation?

   Nigel: It would be great to avoid sending polyfill js to every
   page in that case, and it would
   ... make adoption easier if the page author just had to include
   a track in the video element
   ... and then it would play.
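   HTML already defines kind="descriptions" for text tracks on the
   video element, which is the natural attachment point described here,
   although today's browsers expose such tracks to script rather than
   voicing them natively. A sketch of a page script picking them out;
   plain objects stand in for TextTrack in the test.

```javascript
// Pick out audio-description text tracks from a video element's
// track list. HTML defines kind="descriptions" for this purpose.
function descriptionTracks(textTracks) {
  return Array.from(textTracks).filter(t => t.kind === "descriptions");
}

// Usage in a page (sketch, browser only):
//   <video src="programme.mp4">
//     <track kind="descriptions" srclang="en" src="ad.vtt">
//   </video>
//   const tracks = descriptionTracks(video.textTracks);
```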

   ericc: Your polyfill is about 50KB of unminified uncompressed
   js so it's not very big.

   Nigel: Thank you everyone! [adjourns meeting]

Summary of Action Items

Summary of Resolutions

   [End of minutes]

    Minutes manually created (not a transcript), formatted by
    David Booth's [19]scribe.perl version 1.154 ([20]CVS log)
    $Date: 2018/10/25 12:16:44 $

     [19] http://dev.w3.org/cvsweb/~checkout~/2002/scribe/scribedoc.htm

     [20] http://dev.w3.org/cvsweb/2002/scribe/




Received on Thursday, 25 October 2018 12:20:46 UTC

This archive was generated by hypermail 2.4.0 : Friday, 17 January 2020 19:03:45 UTC