Minutes from Media Timed Events Task Force call, 18 November 2019

Dear all,

The minutes from the Media Timed Events Task Force call on Monday 18th November are now available [1].

The main action items from this are to follow up with DASH-IF and with implementers to develop the DataCue API.

Kind regards,

Chris (Co-chair, W3C Media & Entertainment Interest Group)

[1] https://www.w3.org/2019/11/18-me-minutes.html


W3C
- DRAFT -
Media and Entertainment IG - Media Timed Events TF
18 Nov 2019

Agenda

Attendees

Present
    Kaz_Ashimura, Chris_Needham, Francois_Daoust, Gary_Katsevman, Greg_Freedman, Huaqi_Shan, Pierre_Lemieux, Larry_Zhao, Nigel_Megitt

Regrets

Chair
    Chris

Scribe
    kaz, cpn

Contents

    Topics
    Summary of Action Items
    Summary of Resolutions

<kaz> scribenick: kaz

Chris: Thanks for joining the MTE TF call.

-> https://docs.google.com/presentation/d/1xNAUePmhN0hFi6xwmnZh3nA9EuAoSKjPnlC95SN9ptk/edit Chris's slides

Chris: Goals, an update on the TF's achievements so far,
.... discussion during TPAC 2019, DASH-IF coordination.
.... [Goals]
.... Three main goals. First, native browser support for in-band cues; emsg is part of the MPEG-DASH spec and is used in CMAF.
.... Also application generated cues, for example MPD events,
.... WebVMT use case, data cue synchronized with the location on a map.
.... Improvement of synchronization of cue DOM events triggered on the media timeline.
.... [Current status]
.... The IG Note was published in May 2019, and we started work on the Explainer at the WICG.
.... [Still to do]
.... Propose a change to TextTrackCue to allow creating a cue with a known start time but no known end time.
.... Want to propose that web applications can set endTime to Infinity for that, and allow the endTime to be changed later.
.... Also recommend 10 ms timing accuracy for accurate caption rendering,
.... e.g., aligning captions with scene changes in the video.
.... We need to review the DataCue API draft; it needs broader review and input,
.... e.g., from DASH-IF requirements.
.... Their event model has on-receive and on-start timing, to allow the application to fetch and prepare any needed resources.
.... Not currently supported by existing TextTrack APIs.
.... What do we actually need to specify? The DataCue API, how to source emsg events, and changes to the HTML spec.
.... There's the existing "Sourcing in-band Media Resource Tracks..." document, could extend to describe emsg.
.... Or do we need a new separate document?
.... Would like to propose a registry with a set of format-specific specs.
.... We haven't got agreement about the approach yet, need to complete the DataCue explainer.
.... If we can do that, then the spec work can follow later.
.... In testing, I found some inconsistency in how TextTrackCue events are fired,
.... with variation between browsers. Maybe we should write some tests and file bugs against implementations.
.... [Media Timed Events at TPAC]
.... Here are links to the discussions we had at TPAC, including MEIG update report,
.... DataCue and "time marches on" in HTML breakout, next generation TextTrackCue breakout,
.... and TTWG/Media WG discussion. We decided in the Media WG to consider both TextTrackCue and DataCue
.... together, for overall architecture.
.... [Media Timed Events at TPAC (cont)]
.... Browser vendors supportive, overall.
.... One of the issues discussed was where the parsing of the in-band cues happens.
.... One point of view is that the user agent should do as much as possible and provide structured data to the web app, if the cue format is well known.
.... But emsg is a way to carry arbitrary data, so it's not possible for the UA to support all possible cues.
.... [MPEG-DASH emsg types]
.... Here are the existing defined DASH-specific cues; maybe we want native support for these,
.... leaving others to the web app to parse.
.... [Media Timed Events at TPAC (cont)] - revisited
.... What if the cue format is not known to the UA? Web application would know how to handle it, turn into structured data.
.... We got feedback on the IG Note, need to look at this again.
.... We can circulate the document back to the IG.
.... Should we combine effort with the next-gen TextTrackCue work?
.... [MPEG-DASH emsg types] - revisited
.... Mostly instructions to the media player.
.... Ad insertion cues, more expert input needed on this, how would they be handled via DataCue?
.... Open question about other event formats, can we make a list of the specific cue types that should be natively supported?
.... [WebKit supported timed metadata cues]
.... WebKit already has the DataCue API, with support for timed metadata events:
.... QuickTime User Data, QuickTime Metadata, iTunes metadata, MPEG-4 metadata, ID3 metadata.
.... DataCue could support standardised and vendor-specific cues.
.... WebKit's DataCue may provide what we need for emsg.
.... [Possible TextTrack architecture]
.... Cyril presented this at the Media WG, for TextTrack cue handling.
.... Media is parsed to extract metadata cues, which are passed to the web application,
.... and the application can pass them back to the UA for synchronized rendering.
.... Would this mechanism work for us, so design APIs around it?
.... Want to discuss this more generally with implementers and the people involved in generic TextTrackCue.
.... [DASH-IF]
.... I'm hoping we can have more discussion with DASH-IF.
.... They were involved in the early stage of this TF,
.... so now is a good time to sync up with each other.
.... [Next steps]
.... Updates to the IG note, review the DataCue API proposal,
.... develop the overall cue processing mechanism.
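To make the emsg sourcing question above concrete, here is an illustrative parser for a version-0 'emsg' box, following the field layout in ISO/IEC 23009-1. This is a sketch, not a production MP4 demuxer; the function name, the error handling, and the assumption that the buffer holds exactly one box are this example's own.

```javascript
// Illustrative parser for a version-0 DASH 'emsg' box (ISO/IEC 23009-1).
// Assumes `buffer` is an ArrayBuffer that starts at the box header.
function parseEmsg(buffer) {
  const view = new DataView(buffer);
  const size = view.getUint32(0);
  const type = String.fromCharCode(
    view.getUint8(4), view.getUint8(5), view.getUint8(6), view.getUint8(7));
  if (type !== 'emsg') throw new Error('not an emsg box');
  if (view.getUint8(8) !== 0) throw new Error('only version 0 handled here');
  let offset = 12; // past size, type, version, and flags

  // Read a NUL-terminated UTF-8 string field.
  const readString = () => {
    let end = offset;
    while (view.getUint8(end) !== 0) end++;
    const s = new TextDecoder().decode(
      new Uint8Array(buffer, offset, end - offset));
    offset = end + 1;
    return s;
  };

  const schemeIdUri = readString();
  const value = readString();
  const timescale = view.getUint32(offset); offset += 4;
  const presentationTimeDelta = view.getUint32(offset); offset += 4;
  const eventDuration = view.getUint32(offset); offset += 4;
  const id = view.getUint32(offset); offset += 4;
  const messageData = new Uint8Array(buffer, offset, size - offset);
  return { schemeIdUri, value, timescale, presentationTimeDelta,
           eventDuration, id, messageData };
}
```

A player sourcing these into DataCue-style objects would derive the cue's start time from presentationTimeDelta / timescale, relative to the segment, which is where the on-receive vs. on-start distinction mentioned above comes in.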

Kaz: What should we do next? Do we need more UCs/REQs to complete the IG note,
.... or should we ask stakeholders for review?

Chris: The latter, we have the use cases and requirements, needs stakeholder review.
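One of those requirements, the proposed TextTrackCue change to allow a cue with an unknown end time via endTime = Infinity, can be sketched as follows. Since TextTrackCue itself is a browser API, this models the cue as a plain object with a pure helper; the names are this sketch's own, not from any spec.

```javascript
// Sketch of the proposed unbounded-cue pattern: create a cue whose end is
// not yet known by setting endTime to Infinity, then update it later.
function makeCue(startTime, data) {
  return { startTime, endTime: Infinity, data };
}

// A cue is active when the media time has reached startTime and,
// if an end is known, has not yet passed endTime.
function isCueActive(cue, currentTime) {
  return currentTime >= cue.startTime && currentTime < cue.endTime;
}
```

When the application later learns the end, it would set, e.g., `cue.endTime = 12.5`, and the cue stops being active from that point on the media timeline.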

Kaz: Getting the right stakeholders will be key,
.... we need to invite additional experts, right?

Chris: Right, e.g., DASH-IF people, implementers.
.... Ideally, we need an editor both for the explainer and any specs,
.... so getting other contributors is important.

<cpn> scribenick: cpn

Kaz: Also get input from members of the Publishing IG?
.... They're interested in integrating media streaming, animation, etc.

<kaz> scribenick: kaz

Chris: I'm aware of the Synchronized Media for Publications CG,
.... they presented to the Media WG,
.... so would be interesting to invite them if they have a use case.

Kaz: I can invite some of them to the MEIG, and work on the MTE draft.

Chris: Yes, thank you.

Kaz: Regarding DASH-IF, maybe we could invite them to the MEIG main call instead?

Chris: That's a possible option, or I could join one of their calls as well.
.... All, please take a look at the docs linked from the slides.
.... My own time has been limited, so help would be welcome.

Kaz: I was wondering about the TextTrack architecture slide (page 10).
.... Any concrete discussion about event transaction sequence?

Chris: I think the intent is to allow correctly synchronised caption rendering, but I've not been involved in that part.

Kaz: OK, we could think about the event handling sequence after getting feedback to the current proposal.

Chris: Yes. What are the next steps for the generic TextTrackCue work?

Pierre: The right stakeholders aren't on this call today,
.... lots of enthusiasm at TPAC, though.

Chris: Yes, need to do something about that. I was wondering about the generic TextTrackCue?

Pierre: Do you mean the one implemented by Apple?

Chris: Yes, and the prototype.

Pierre: Yes, next step is to find a home for the spec and create a draft. I have not seen much progress so far.

Gary: There was an update at FOMS. Eric built out the prototype using an HTML DOM fragment,
.... looked good. It simplified a lot of code compared to the previous JSON implementation.
.... Google maybe not on board with the DOM fragment idea.

Pierre: The challenge is that Chrome and Edge look at their bug trackers,
.... and nobody is complaining about this topic there.
.... The spec draft is ready for wider review.

Gary: We've prototyped it; next is to get the spec out and get buy-in.

Pierre: The spec is ready, but hard to make progress if Google oppose the approach.

Gary: The main objection is about whether to use a DOM fragment or not, not so much the concept itself.

Pierre: What was the concern? Today the API accepts JSON, and you get back an HTML fragment.

Gary: With Eric's latest prototype, you'd just provide a DOM fragment rather than JSON.

Pierre: That's a big shift from TPAC.
.... There was strong concern about security from the browser folks, hence it wasn't pursued.
.... If you already have all the mechanics to pass HTML, why not just pass HTML?

Gary: Eric was able to get that working, with a few allowed HTML elements, div and span, and a couple of allowed attributes.
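A minimal sketch of the allowlist filtering Gary describes, modeling DOM nodes as plain objects so it stands alone; the exact allowed element and attribute sets here are assumptions for illustration, not the prototype's actual lists.

```javascript
// Allowlist filter in the spirit of the prototype described above:
// only div and span elements, and a small set of attributes, survive.
// The allowed sets below are assumptions, not the prototype's real lists.
const ALLOWED_TAGS = new Set(['div', 'span']);
const ALLOWED_ATTRS = new Set(['class', 'lang']);

// A node is either a text string or { tag, attrs: {...}, children: [...] }.
function sanitize(node) {
  if (typeof node === 'string') return node;      // text passes through
  if (!ALLOWED_TAGS.has(node.tag)) return null;   // drop disallowed elements
  const attrs = {};
  for (const [k, val] of Object.entries(node.attrs || {})) {
    if (ALLOWED_ATTRS.has(k)) attrs[k] = val;     // drop disallowed attributes
  }
  const children = (node.children || [])
    .map(sanitize)
    .filter((c) => c !== null);
  return { tag: node.tag, attrs, children };
}
```

The design point is the one Pierre raises: if the UA already has the mechanics to accept and constrain HTML, passing a (filtered) fragment directly avoids re-encoding the same structure as JSON.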

Pierre: OK, thanks for the update.

Chris: For next steps, we need to get feedback from the right people.
.... Should we continue having calls? It's in the calendar, but maybe not the right time.
.... I may need to ask more specific questions to implementers and stakeholders.
.... Ideally would like to have somebody to work together on this.
.... I'll follow that up.
.... Anything else for today?

(none)

Chris: Thanks

[adjourned]

Summary of Action Items
Summary of Resolutions

[End of minutes]
Minutes formatted by David Booth's scribe.perl version 1.152 (CVS log)
$Date: 2019/11/19 17:10:18 $

Received on Monday, 25 November 2019 12:35:12 UTC