RE: Minutes from Media Timed Events Task Force call, 18 February 2018

Thanks Chris,

I would have added a point about the synchronisation discussion had I been able to attend: the idea of "sync to next frame" sounds appealing but breaks down if the video frame rate is too low. I think that's an argument already made in the document. In that case synchronising to a time is what is actually needed.

This isn't even an edge case, by the way: there are adaptive streaming profiles that go down to 6.25fps for video, while the audio continues at whatever sample frequency is used, 44.1kHz I guess. If you quantise subtitle times authored against 25fps to those frame boundaries then you get back to the +/- 250ms problem! So in that case it's the wrong answer.

There may be a middle ground if there is a way to request presentation frames more frequently than the encoded video frame rate - the actual screen refresh rate will typically be higher than the encoded video frame rate, so hooking into that would sidestep the quantisation issue.
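
To make that concrete, here is a rough sketch of what I have in mind (untested; the element id and cue data are invented): poll currentTime from requestAnimationFrame, so subtitle updates are driven by the display refresh rate rather than by the encoded video frame rate.

    // Rough sketch, untested: drive subtitle updates from requestAnimationFrame,
    // polling the media element's currentTime, so update opportunities come at
    // the display refresh rate rather than the encoded video frame rate.
    const video = document.querySelector('video');
    const subtitleEl = document.getElementById('subtitles'); // invented id
    const cues = [{ start: 1.04, end: 3.48, text: 'Hello' }]; // seconds, authored against 25fps
    let current = null;

    function tick() {
      const t = video.currentTime; // a media time, not a frame index
      const active = cues.find(c => t >= c.start && t < c.end) || null;
      if (active !== current) {
        current = active;
        subtitleEl.textContent = active ? active.text : '';
      }
      requestAnimationFrame(tick);
    }
    requestAnimationFrame(tick);

The point is that currentTime is a position on the media timeline, not a frame index, so nothing here forces quantisation to 160ms frame boundaries.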

kind regards,

Nigel



________________________________________
From: Chris Needham [chris.needham@bbc.co.uk]
Sent: 20 February 2019 10:05
To: public-web-and-tv@w3.org
Subject: Minutes from Media Timed Events Task Force call, 18 February 2018

Dear all,

The minutes from Monday's Media Timed Events Task Force call are available [1], and copied below.

Kind regards,

Chris (Co-chair, W3C Media & Entertainment Interest Group)

[1] https://www.w3.org/2019/02/18-me-minutes.html

--

W3C
- DRAFT -
Media and Entertainment IG - Media Timed Events TF
18 Feb 2019

Agenda

Attendees

Present
    Kaz_Ashimura, Ali_C_Begen, Chris_Needham, Francois_Daoust, Rob_Smith, Steve_Morris

Regrets
    Nigel_Megitt

Chair
    Chris

Scribe
    kaz, tidoust

Contents

Topics
    Media Timed Events document review
    Synchronization
    Next steps

Summary of Action Items
Summary of Resolutions

<kaz> scribenick: kaz

Chris: I'll give an update on our use cases/requirements document for publication,
.... and review synchronization gap analysis.
.... Unfortunately Nigel can't join today but he has sent some input, which I want to review.

# Media Timed Events document review

https://w3c.github.io/me-media-timed-events/ draft

Chris: Reviewing recent changes, based on feedback from Mark.
.... In the Introduction I used the text from the explainer document.
.... Most recently, got comment from Francois about adding "media timed events" to Terminology.
.... Nigel sent feedback that it's not just metadata events, but also accessibility related events,
.... such as timed text and audio description. He has just raised an issue.

https://github.com/w3c/me-media-timed-events/issues GH issues

Chris: I've reworded some of the use cases.
.... 3.1 Dynamic content insertion has been added, important driver for DataCue.
.... 3.2 Audio stream with titles and images, hasn't changed really.
.... 3.3 Control messages for media streaming clients, I renamed this section to make it more consistent
.... I plan to remove the Editor's note (See also this issue against...)

https://github.com/w3c/webmediaguidelines/issues/64 issue 64

Chris: There's some useful text in that issue about deep parsing that we could add.

<scribe> ACTION: Chris to remove Editor's note and add text regarding deep parsing to the document.

Chris: 3.4 Subtitle and caption rendering. I removed detail here to focus this.

https://github.com/w3c/me-media-timed-events/issues/36 Nigel's issue 36

Chris: I would address Nigel's issue in this section.

<scribe> ACTION: Nigel/Chris to add word rate to section 3.4 from issue 36.

Chris: 3.5 Synchronized map animations hasn't changed recently.
.... 3.6 Media analysis visualization, maybe we don't need this use case,
.... it's quite niche, but shows the utility of a generic DataCue API.
.... Can remove if it doesn't add anything beyond the other use cases.

<scribe> ACTION: Review media analysis visualization use case and decide whether to include.

Chris: 3.7 Presentation of auxiliary content in live media.
.... This is the one Giri put forward, Mark commented that it wasn't clear.
.... We should get back to Giri and ask for clarification.
.... This seems to be related to interruptions in a live media stream,
.... could happen at any time, but not necessarily scheduled by the content provider.

<scribe> ACTION: Chris to contact Giri to clarify section 3.7.

Chris: 4. Related industry specifications.
.... I added CMAF as the first section here.
.... [Reviews the related specifications]
.... 5. Gap analysis. This is from feedback from Apple in the WICG issue on WebKit implementation

https://discourse.wicg.io/t/media-timed-events-api-for-mpeg-dash-mpd-and-emsg-events/3096 WICG discussion

Chris: The document right now is quite emsg-centric; we don't cover much of the range of cues that WebKit supports.
.... Something for the next stage.
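
For context, a minimal sketch of how in-band timed metadata can surface to an application today, assuming the UA exposes it as a text track of kind "metadata"; the shape of the cue object itself (e.g. a DataCue payload) is engine-dependent, which is exactly the gap under discussion:

    // Sketch only: observe in-band timed metadata exposed as a 'metadata' text track.
    // Whether and how a UA maps emsg or HLS timed metadata to such a track, and the
    // payload fields on each cue, vary by engine.
    const video = document.querySelector('video');

    video.textTracks.addEventListener('addtrack', (event) => {
      const track = event.track;
      if (track.kind !== 'metadata') return;
      track.mode = 'hidden'; // deliver cue events without any default rendering

      track.addEventListener('cuechange', () => {
        for (let i = 0; i < track.activeCues.length; i++) {
          const cue = track.activeCues[i];
          // startTime/endTime are positions on the media timeline;
          // the payload (e.g. a DataCue value) is engine-dependent.
          console.log('metadata cue active at', cue.startTime, cue);
        }
      });
    });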

# Synchronization

Chris: Most significant change recently is about synchronization.
.... I added 5.2 Synchronization of text track cue rendering.
.... The requirement for a 20ms delivery time is a goal that Nigel put forward.
.... If the nominal frame rate is 25 FPS and you can trigger events to within half a frame,
.... there's a chance of achieving frame accuracy, although it's dependent on the execution time of the code.
.... There isn't a way to guarantee frame-accurate alignment at the moment, but this change could get us a long way.
.... If we can ensure the cue is triggered more closely to its position on the media timeline,
.... this may be good enough for a lot of use cases.

Francois: I wonder if it's the right requirement. This is prescribing a solution.
.... Actually, you might want the event in time to process it, but you still don't know what frame is being displayed.
.... You could imagine a mechanism that ties the firing of cues to requestAnimationFrame,
.... so when you get the event you get an opportunity to apply changes to the next frame being rendered.
.... rAF gives you a way to hook into the next rendered frame.
.... Perhaps that's possible with media too. There's a question about which frame is being rendered right now; see the previous discussion:

https://github.com/w3c/media-and-entertainment/issues/4 MEIG issue on frame accurate seeking

Francois: Should this document be more generic? Section 3.4 shows what's needed.
.... Maybe having it within 20ms is enough, or maybe it's not really the right requirement.
.... When you see the event from an application perspective, you don't have a guarantee that the next video frame will be shown at the next browser frame.
.... I think this is more about the recommendations in section 6. It's not just a case of stating 20ms for event delivery, some synchronisation mechanism is needed.
.... Not too worried about solutions now. Could be adding a timestamp to show the media timeline position related to wall clock time, or another mechanism to tie it to rAF.
.... But the 20ms requirement should be there as it's the minimum we need.
.... I suggest dropping the recommendation from 5.2, and add something to say there needs to be some synchronization between DOM content rendering by the application and the underlying media player.

<scribe> ACTION: Chris remove recommendation from 5.2 and add detail on DOM/media synchronized rendering.
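
To illustrate the timestamp idea Francois mentions, a rough sketch (illustrative names only, not a proposed API): pair a sampled media time with a wall-clock time, then extrapolate between samples using playbackRate.

    // Sketch of the "media time + wall clock" pairing: sample currentTime together
    // with performance.now(), then estimate the media position at any later instant
    // by extrapolating with playbackRate. Names are illustrative, not a proposed API.
    const video = document.querySelector('video');
    let sample = { mediaTime: video.currentTime, wallTime: performance.now() };

    video.addEventListener('timeupdate', () => {
      sample = { mediaTime: video.currentTime, wallTime: performance.now() };
    });

    function estimatedMediaTime() {
      if (video.paused) return video.currentTime;
      const elapsedSeconds = (performance.now() - sample.wallTime) / 1000;
      return sample.mediaTime + elapsedSeconds * video.playbackRate;
    }

    // e.g. call estimatedMediaTime() inside a requestAnimationFrame callback to
    // decide which cue should be visible in the frame about to be composited.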

Chris: I agree. I studied time marches on, and wanted to capture what I'd learned.
.... Relating to the HbbTV issue from a few years ago.
.... Two different ways to handle cues, possible to miss short-duration cues,
.... as you may have cues which never appear in activeCues.
.... I have a question about how WebKit handles this, as the video element can accept an HLS manifest URL.
.... So the handling of the timed metadata is done by the UA, more than the application.
.... What kind of constraints are there?
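
For illustration, the two cue-handling approaches Chris contrasts above, as a sketch (the 20ms cue and identifiers are invented):

    // Sketch of the two cue-handling approaches discussed above.
    const video = document.querySelector('video');
    const track = video.addTextTrack('metadata');
    const shortCue = new VTTCue(10.0, 10.02, 'blink-and-you-miss-it'); // 20ms cue, invented
    track.addCue(shortCue);

    // 1. Event-driven: the time marches on steps fire enter/exit even for cues
    //    whose whole interval fell between two runs of the algorithm.
    shortCue.onenter = () => console.log('cue entered');
    shortCue.onexit = () => console.log('cue exited');

    // 2. Polling activeCues (e.g. on timeupdate): a cue this short may never be
    //    present in activeCues at the moment the handler runs, so it can be missed.
    video.addEventListener('timeupdate', () => {
      console.log('active cues right now:', track.activeCues.length);
    });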

Rob: I wonder how specifying a 20ms requirement in a non-real-time system will stand up,
.... particularly on mobile, will there be the processing power to be able to do that?
.... Something short-lived could be missed.
.... Would the requirement need to be relaxed, so it's a target of 20ms, or best effort? Making it a requirement raises concerns.
.... Even a desktop machine is not a real-time architecture.

Chris: Good point. I think these constraints led to the existing wording in HTML,
.... so user agents could have flexibility in terms of timing.

Rob: It's good to have target numbers in there, but would it be better to classify events as low-priority (250ms cue) vs high-priority (20ms),
.... so you know which ones to service first.

Francois: The actual requirement in section 6 is to have something synchronized to the next media frame.
.... If you have the events in advance, the UA could schedule the synchronization points in advance.
.... The implementer feedback is likely to be that it can't be achieved on certain platforms.
.... Not a reason to drop the requirement, though. If you put it as a target it wouldn't change anything.
.... The goal is to have a discussion with implementers, to say we need more and ask how we can achieve it.

Chris: Recommendation section 6.6 focuses on a specific solution, maybe we can write the underlying requirement.

<scribe> ACTION: Reword recommendation 6.6 to emphasise rendering synchronisation.

<tidoust> scribenick: tidoust

Kaz: If we need more feedback about actual requirements, maybe we should get back to the service providers of HbbTV, ATSC, and HybridCast that have broadcasters and content providers and ask them about their own event handling system requirements, settings or guidelines.

<kaz> scribenick: kaz

Chris: Yes, we do need feedback that we're going in the right direction with the recommendations here.
.... This document is nearly complete. We can invite people for wider reviews.

Rob: To synchronize to the next frame, which is the intent, is a good way to put it.
.... If your processor is busy doing other things, it would be too busy to render the next frame as well.
.... So it marks a point where it should check the cues and generate those events.
.... It doesn't matter how much the delay is, it could be a minute of delay worst case, but it would keep the synchronization as per the intent.

Steve: I agree, that's a really good point.

<tidoust> [It may not be the same processor doing media rendering and DOM rendering in practice, but I agree ;)]

Chris: I added a section "5.3 Synchronized rendering of web resources".
.... This talks about using timeupdate or requestAnimationFrame and polling the media element time.

Francois: I have a comment on the title; the same mechanism can be used for metadata cues.
.... I suggest one section "Synchronized rendering of web resources" with two subsections: one for TextTrack and the other for timeupdate and requestAnimationFrame.
.... You can use that mechanism for synchronization too.

Chris: Good suggestion, thank you.

<scribe> ACTION: Chris to restructure sections 5.2 and 5.3 into one section with subsections.

Francois: The HTML spec says that timeupdate isn't intended for synchronisation or cue processing.

<tidoust> https://html.spec.whatwg.org/multipage/media.html#best-practices-for-metadata-text-tracks:event-media-timeupdate Note on timeupdate in HTML spec

Chris: I should add a reference to that.

<scribe> ACTION: Chris to reference advice in HTML not to use timeupdate for synchronised rendering.
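
For reference, a tiny sketch showing why: HTML only requires timeupdate to fire somewhere between roughly every 15ms and every 250ms during playback, so its granularity is UA-dependent and too coarse for frame-level synchronisation.

    // Sketch: log the interval between timeupdate events. HTML only bounds the
    // firing rate loosely (roughly every 15ms to 250ms during playback), so the
    // observed granularity is UA-dependent and unsuitable for frame-level sync.
    const video = document.querySelector('video');
    let last = performance.now();

    video.addEventListener('timeupdate', () => {
      const now = performance.now();
      console.log(`timeupdate after ${(now - last).toFixed(1)} ms, ` +
                  `currentTime = ${video.currentTime.toFixed(3)} s`);
      last = now;
    });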

Chris: You sent a PR?

https://github.com/w3c/me-media-timed-events/pull/35 PR 35

Francois: I think we can close it, it doesn't change the purpose, but it does indicate time marches on could use clarification.
.... Step 6 isn't really the right place to set the rate at which time marches on is run. We should clarify the rate at which it happens, in this part:

[[

The only requirements in HTML are:

"When the current playback position of a media element changes (e.g. due to playback or seeking), the user agent must run the time marches on steps" (see requirement), which essentially requires the rate at which the time marches on steps run to match the rate at which the current playback position changes; and

"When a media element is potentially playing [...], its current playback position must increase monotonically at the element's playbackRate units of media time per unit time of the media timeline's clock" (see requirement), which sets the speed at which the current playback position increases, but not the rate at which it gets updated.

]]

Francois: This also affects how often currentTime is updated. It increases, but there's no specific requirement for when this should happen.

Chris: We've discussed following up on frame accurate seeking.
.... Do we need to capture this into our current document, or should we leave it to that follow up?

Francois: I'm asking myself as well. This document becomes a natural place to do it.
.... Or move it to another document; I don't have a strong opinion about that.

Chris: We have two goals: one is to show the motivation for DataCue, combined with a requirement for synchronization driven by timed text.
.... For DataCue, we haven't really gone into timing requirements for the content replacement use case.
.... I want to get to the point where this document is ready for first publication.
.... Maybe we can continue this discussion and then update the IG note, or treat it as a separate document.
.... I want to have something ready to open up the discussion with browser vendors about DataCue, as a next step.

Francois: Would it be easier to split it? The two aspects are orthogonal.

Chris: My preference is keeping all the content in one place, acknowledging that we have two goals.
.... We have enough in this document to complete the explainer.

# Next steps

Chris: So next step is to put the DataCue specific parts into the explainer, to go to WICG.
.... And maybe we can continue discussion about synchronization, updating this document.
.... The outcome could be separate standards work, e.g. taking the time marches on question to WHATWG HTML.
.... We may want to consider other approaches for synchronization than changing time marches on.
.... This document is almost done. Once we've addressed the comments from today, plus those from Nigel,
.... we can do the request for publication as a first draft IG Note.
.... Then I can focus on the explainer document. Rob has updated the draft, so I can add extra content.
.... Maybe we can come back to the synchronization discussion later.
.... Any objections?

<tidoust> +1

Rob: Sounds good. The WICG activity could explore several threads, including synchronization.
.... Those two things (DataCue and sync) don't have to be mutually exclusive. Starting with DataCue would naturally lead to discussion of synchronization.

Chris: Thank you. I'll do as much as I can, then send to browser implementers.
.... Next call on 3rd Monday in March, March 18th.

[adjourned]

Summary of Action Items
[NEW] ACTION: Chris remove recommendation from 5.2 and add detail on DOM/media synchronized rendering.
[NEW] ACTION: Chris to contact Giri to clarify section 3.7.
[NEW] ACTION: Chris to reference advice in HTML not to use timeupdate for synchronised rendering.
[NEW] ACTION: Chris to remove Editor's note and add text regarding deep parsing to the document.
[NEW] ACTION: Chris to restructure sections 5.2 and 5.3 into one section with subsections.
[NEW] ACTION: Nigel/Chris to add word rate to section 3.4 from issue 36.
[NEW] ACTION: Review media analysis visualization use case and decide whether to include.
[NEW] ACTION: Reword recommendation 6.6 to emphasise rendering synchronisation.

Summary of Resolutions
[End of minutes]
Minutes formatted by David Booth's scribe.perl version 1.152 (CVS log)
$Date: 2019/02/19 17:57:56 $


Received on Wednesday, 20 February 2019 10:41:44 UTC