Minutes from Media Timed Events Task Force call 21 September 2020

Dear all,

The minutes from yesterday's Media Timed Events / WICG DataCue API call are now available [1], and copied below.

There were some action items:

1. Chris to help Rob file browser issues for unbounded cue time
2. Chris and Iraj to follow up with DASH-IF
3. Chris and Eric to refine DataCue API proposal to use a hint for early event delivery

Kind regards,

Chris (Co-chair, W3C Media & Entertainment Interest Group)

[1] https://www.w3.org/2020/09/21-me-minutes.html

W3C
- DRAFT -
Media Timed Events TF
22 Sep 2020
Agenda

Attendees

Present
    Charles_Lo, Chris_Needham, Rob_Smith, Cyril_Concolato, Kaz_Ashimura, Mark_Vickers, Eric_Carlson, Franco_Ghilardi, Iraj_Sodagar, Yasser_Syed

Regrets

Chair
    Chris

Scribe
    cpn, kaz

Contents

Topics
    Agenda
    TextTrackCue end time
    TPAC plans
    Synchronized rendering

Summary of Action Items

Summary of Resolutions

<cpn> scribenick: cpn

# Agenda

Chris: Agenda: https://lists.w3.org/Archives/Public/public-web-and-tv/2020Sep/0005.html

# TextTrackCue end time

Chris: PR is ready, awaiting feedback from browser implementers

https://github.com/whatwg/html/issues/5297 Issue 5297

Rob: How to fill in the WHATWG issue template, and get feedback?

Eric: People frequently @-mention the people likely to have input. It also helps to file issues in the browser issue trackers, and link to those from the issue

Rob: So should I raise the PR and link to the filed bugs?

Eric: Helps to remind people it's an open issue, I don't expect push back, as there wasn't any when you discussed in the previous meeting. I think it's about getting people's attention, so they make the time to comment

Chris: I can help with filing the issues

# TPAC plans

Chris: Proposed joint meeting between this group, TTWG, and Media WG, to cover a few topics
.... Meeting planned for Thursday 15 Oct: https://www.w3.org/2011/webtv/wiki/TPAC_2020_meeting#Thursday_15_October_2020:_Timed_Text_WG_.2F_Media_WG_.2F_Media_.26_Entertainment_IG_Joint_Meeting

Rob: I'll propose a breakout on exporting video metadata for moving objects and sensors, joins up with OGC16 testbed work. Taking inband data in MISB format (an American military based standard for drones), has location specific information.
.... I've written a parser for the MPEG2-TS to extract the data to WebVMT. This makes it more accessible. Have a demo, the parser works. It takes significant time, but it shows the pain points of parsing the media to extract the data
.... Two aspects: moving objects, timed location sequence over a geospatial landscape, heading information. What's the API for exposing to the web? Also sensor data at different rates, what to do in between sample points? Discrete - should data be interpolated?
.... I'll raise GitHub issues at WebVMT, and discuss in a TPAC breakout. Anyone here is invited too, want input from media specialists, e.g., on advantages of inband vs out of band
.... I'll post details shortly.
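
[Editor's illustration, not part of the discussion] Rob's question about what to do between sample points can be made concrete with a minimal sketch. It assumes location samples sorted by strictly increasing media time, and compares linear interpolation with holding the most recent sample ("step", as might suit discrete data). The `LocationSample` shape and the `locationAt` function are hypothetical names for illustration and are not taken from WebVMT or MISB.

```typescript
// Hypothetical sample shape; field names are illustrative only.
interface LocationSample {
  time: number; // media time in seconds
  lat: number;  // latitude in degrees
  lng: number;  // longitude in degrees
}

type SampleMode = "linear" | "step";

// Returns the location at media time t, or null if t lies outside the
// sampled range. Assumes samples are sorted by strictly increasing time.
function locationAt(
  samples: LocationSample[],
  t: number,
  mode: SampleMode
): LocationSample | null {
  if (samples.length === 0) return null;
  if (t < samples[0].time || t > samples[samples.length - 1].time) return null;

  // Find the last sample whose time is at or before t.
  let i = 0;
  while (i + 1 < samples.length && samples[i + 1].time <= t) i++;
  const a = samples[i];

  // Step mode (or the final sample): hold the most recent value.
  if (mode === "step" || i + 1 === samples.length) {
    return { time: t, lat: a.lat, lng: a.lng };
  }

  // Linear mode: blend the two samples that bracket t.
  const b = samples[i + 1];
  const f = (t - a.time) / (b.time - a.time);
  return {
    time: t,
    lat: a.lat + f * (b.lat - a.lat),
    lng: a.lng + f * (b.lng - a.lng),
  };
}
```

Which mode is appropriate depends on the data: interpolating a continuous quantity such as position is usually reasonable, whereas interpolating a discrete value may not be meaningful.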

Eric: I'd be interested to see a description of how the data is packaged in the stream.

<RobSmith> https://webvmt.org/webvmt/blog

Chris: You mentioned MPEG2-TS, are they considering MP4?

Rob: It may be in progress, that's good feedback.

Kaz: Thank you Rob. This proposal is very interesting from a WoT viewpoint as well, would also like to suggest participating in the WoT meetings. I can message you separately.

<RobSmith> MISB 0601: https://www.gwg.nga.mil/misb/docs/standards/ST0601.16a.pdf

<RobSmith> MISB 0903: https://www.gwg.nga.mil/misb/docs/standards/ST0903.5.pdf

<RobSmith> 0601 covers platform & sensor locations, 0903 covers target tracking

# Synchronized rendering

Chris: What kind of app behaviours do we want in response to cue events?
.... One that comes up frequently is rendering of overlaid web content that's closely synchronized to the video.
.... If we use the cue enter or cuechange event, then any rendering by the application will occur later than the cue start time.
.... The DASH-IF have defined an 'on-receive' mode for this scenario, where the web app would have visibility of the cue ahead of the event start time.
.... This allows it to prepare any resources needed.
.... https://docs.google.com/presentation/d/1Oir_gRhleMSpR850KZlxnz20xnvYnJoNk-ZlsMVrbIY/edit (see slide 3 and slide 6)

<kaz> scribenick: kaz

(diagram on "Simplified MSE/EME pipeline")

Chris: Slide 6 explains "Possible DataCue pipeline (in-band metadata)"
.... Some API yet to be defined, presents the cue event or timed metadata to the web application
.... There would then be a scheduling API for the application to tell the browser when to render the content.
.... The goal is to bring the media and web rendering pipelines closer together, from a synchronization perspective.
.... Other proposals coming along could allow for better synchronized rendering.
.... For example Web Codecs moves the player into application JavaScript, with the rendering timing controlled from script.
.... The main limitation is that you would not be able to use any protected content (at the moment)
.... The DataCue API is a part of the overall solution - providing the event information to the web application,
.... but there is a missing part, which is synchronized rendering.

Eric: Back to the web app wanting to get early access to the cue, I don't want to add another event for this.
.... Most of the time it isn't needed, but you always have the extra overhead.
.... So I wonder about providing a hint for earlier delivery, instead of adding an extra event.
.... You could give a lead time. It won't always be possible to deliver the data early.
.... In WebKit, I get inband cues of all types, but don't have control - typically they are delivered just in time.

Rob: Another key thing is how much data you need to download, not just the time.
.... It depends on what device you use and the network connection.

Chris: In the prerecorded case, you know in advance.

Rob: You can skip forward, but difficult in live streaming.

Chris: Without this kind of approach, timing synchronization is best-effort.
.... Interested to hear opinions.

Iraj: I'm active in MPEG and MPEG-DASH.
.... The way we designed for MPEG-DASH was two different dispatching mechanisms, on-start and on-receive.
.... The client subscribes to receive events. DASH is just a delivery mechanism.
.... The Web application can subscribe to event streams.
.... Advantage is if the content author has advance knowledge, it can signal that the event will start in 5 seconds time.
.... The value is that you don't need to process the event. In the case where the web uses on-start, it could be too late.
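
[Editor's illustration, not part of the discussion] The two dispatch mechanisms Iraj describes can be sketched as follows. This is a simplified model under stated assumptions, not the DASH-IF specification or any player's API: `EventDispatcher`, `TimedEvent`, and all method names are hypothetical, each stream is assumed to have a single subscription with one mode, and media time is passed in explicitly rather than read from a media element.

```typescript
type DispatchMode = "on-receive" | "on-start";

// Hypothetical event shape for illustration.
interface TimedEvent {
  streamId: string;  // identifies the event stream the event belongs to
  id: string;
  startTime: number; // media time at which the event becomes active, seconds
}

type Handler = (event: TimedEvent, mediaTime: number) => void;

class EventDispatcher {
  private subs = new Map<string, { mode: DispatchMode; handler: Handler }>();
  private pending: TimedEvent[] = [];

  // One mode per stream: a later call replaces the earlier subscription.
  subscribe(streamId: string, mode: DispatchMode, handler: Handler): void {
    this.subs.set(streamId, { mode, handler });
  }

  // Called when the client parses an event from the media at `mediaTime`.
  receive(event: TimedEvent, mediaTime: number): void {
    const sub = this.subs.get(event.streamId);
    if (!sub) return; // nobody subscribed to this stream
    if (sub.mode === "on-receive" || event.startTime <= mediaTime) {
      // on-receive: hand over immediately, possibly ahead of startTime.
      // on-start: the event has already started, so dispatch without delay.
      sub.handler(event, mediaTime);
    } else {
      // on-start: hold the event until playback reaches its start time.
      this.pending.push(event);
    }
  }

  // Called as playback advances; dispatches held events that are now due.
  tick(mediaTime: number): void {
    const due = this.pending.filter((e) => e.startTime <= mediaTime);
    this.pending = this.pending.filter((e) => e.startTime > mediaTime);
    for (const e of due) {
      this.subs.get(e.streamId)?.handler(e, mediaTime);
    }
  }
}
```

In this model, an event that starts at 15 s but is parsed at 10 s reaches an on-receive subscriber with 5 s of lead time to prepare resources, while an on-start subscriber sees it only once playback reaches 15 s.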

Eric: The client can subscribe to more than one stream?

Iraj: Yes, for any event stream, it's either on-start or on-receive, but not both.
.... The problem with a hint is that I don't know how far ahead, so may not always be possible.

Eric: I can see how that could be useful,
.... but don't think it would make sense to have a different event.

Cyril: I would agree with Eric, there's an analogy with CPB removal delay, for handling video frames, it seems similar.
.... I don't understand why we need an additional event in this case.

Iraj: When you say "hint", what do you mean, Eric?

Eric: I'm imagining some attribute on a TextTrack, a number that tells the UA that the web app wants to see the events early, if possible.
.... Rather than having two different events, having one specific event and an additional attribute as a hint.
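
[Editor's illustration, not part of the discussion] One way to read Eric's suggestion is sketched below. No such attribute exists on TextTrack today: `earlyDeliveryHint`, `HintedTrack`, `ParsedCue`, and `deliveryTime` are all hypothetical names, and the sketch models only the timing arithmetic, assuming the hint is a lead time in seconds that the UA honours on a best-effort basis.

```typescript
// Hypothetical: stands in for a per-track hint attribute, which is not
// part of any current specification.
interface HintedTrack {
  earlyDeliveryHint: number; // requested lead time in seconds; 0 = at cue start
}

// Hypothetical view of a cue as seen by the UA's media pipeline.
interface ParsedCue {
  startTime: number;   // cue start, media time in seconds
  availableAt: number; // media time at which the UA has parsed the cue data
}

// Media time at which the cue would be surfaced to script.
function deliveryTime(track: HintedTrack, cue: ParsedCue): number {
  // Negative hints are treated as "no lead time requested".
  const requested = cue.startTime - Math.max(0, track.earlyDeliveryHint);
  // Best-effort: a cue cannot be delivered before its data is available.
  return Math.max(requested, cue.availableAt);
}
```

With a hint of 5 s, a cue starting at 20 s whose data is parsed at 10 s would be delivered at 15 s. If the data only arrives at 18 s, delivery happens at 18 s, which reflects the point that early delivery won't always be possible.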

Chris: We're overloading the "event" terminology: there is the timed event data in the media, and the JavaScript cue enter/exit events.

Eric: I don't think it would make sense to have a global property, so per-track.

Chris: Is there a similarity here with the proposed new TextTrackCue API, where the UA does the rendering rather than the web app, for captions?

Eric: I just mean it would set the hint on the data cue track.

Chris: So with the hint, we'd be changing the behaviour of the "time marches on" steps.

Cyril: The hint could be set in the media itself.

Eric: In that case could be a reflected attribute.
.... The value of the text track attribute would be set initially by the content.

<cpn> scribenick: cpn

Kaz: Can you think of a more detailed example, where the hint could be used?
.... There are so many various possibilities and cases for this, so maybe we need to think of which examples need such an advanced mechanism?

<kaz> scribenick: kaz

<cpn> https://github.com/WICG/datacue/blob/master/requirements.md

Chris: We have captured requirements described above,
.... Including SCTE cues which would be used for client-side ad insertion, to switch media playback,
.... as well as the synchronized overlay rendering we've been talking about.
.... I'd like to clarify the use cases for the client-side ad insertion case, how would an application respond?
.... Does anybody have time to help me work on that ahead of the TPAC joint meeting?

Iraj: DASH-IF event TF can help, yes.

Chris: Goal is to update the description by Oct 15.

Kaz: The next call won't be 19th?

Chris: Oct 15 will be the joint call, and then Oct 19 is the MEIG meeting.
.... The next MTE TF call would be Nov 16, as a follow up to the TPAC discussion.

Eric: I have limited time but happy to help via email.

Chris: Thanks!

[adjourned]

[End of minutes]
Minutes formatted by David Booth's scribe.perl version 1.152 (CVS log)
$Date: 2020/09/22 11:51:59 $

Received on Tuesday, 22 September 2020 16:52:43 UTC