
Minutes from Media Timed Events Task Force call, 16 July 2018

From: Chris Needham <chris.needham@bbc.co.uk>
Date: Tue, 17 Jul 2018 09:29:52 +0000
To: "public-web-and-tv@w3.org" <public-web-and-tv@w3.org>
Message-ID: <590FCC451AE69B47BFB798A89474BB363E1498A5@bgb01xud1006>
Dear all,

The minutes from yesterday's Media Timed Events Task Force call are available [1], and copied below. Many thanks to Giri for chairing the call, and Francois for helping to scribe.

There was one action item:

- Mark to investigate sharing of emsg use cases and requirements from CTA WAVE

Kind regards,

Chris (Co-chair, W3C Media & Entertainment Interest Group)

[1] https://www.w3.org/2018/07/16-me-minutes.html

--
W3C
- DRAFT -
Media and Entertainment IG - Media Timed Events TF

	16 Jul 2018
Agenda

Attendees

Present
	Kaz_Ashimura, Chris_Needham, Giri_Mandyam, Andreas_Tai, Francois_Daoust, Mark_Vickers
Regrets

Chair
	Giri_Mandyam

Scribe
	cpn, tidoust

Contents

Topics
	Agenda
	Open issues
	Next call

Summary of Action Items

Summary of Resolutions

<cpn> scribenick: cpn

Agenda

Giri: Thanks everyone for attending.
.... Let's go through the status of the document.
.... Apologies for the timing confusion in the email, but 8AM Pacific is the right time, 3rd Monday of the month.
.... [reviews the agenda]

<kaz> Agenda for today

Giri: Following Mark's presentation on the CTA in the last IG call, would be good to get an update on other SDOs
.... The Qualcomm people are away at MPEG, so may not be possible today
.... Any additions for today's agenda?

Chris: We should also cover the frame accurate seeking issue in the M&E GitHub.
.... https://github.com/w3c/media-and-entertainment/issues/4

Open issues

Giri: OK. Looking at the open issues against the document: https://github.com/w3c/me-media-timed-events/issues
.... There's a note to add Terminology to the document.

Chris: What terminology should we define?

<kaz> issue 5 on terminology

Giri: There are in-band and out-of-band events, and also the media timeline
.... Are there others?

Chris: I don't think so. I'm happy to do some of this myself.

Mark: Can I suggest that where possible we use existing terminology from W3C, MPEG, or elsewhere?

Giri: That's a good point, yes.
.... In-band and out-of-band events are in the DASH spec
.... Media timeline is in the MSE spec.
.... Issue 3 relates to work in DASH-IF, from a request from 3GPP to DASH-IF
.... https://github.com/w3c/me-media-timed-events/issues/3
.... I think there's been some issue with getting a copy of this work. It's in the CTA whitepaper that Mark referenced
.... Members may be aware of this. When documents are in the DASH-IF or CTA repos, how can we get access to them from a W3C perspective?
.... If we take this to the WICG, we'll need to make people aware, otherwise solutions will come up from elsewhere.

Mark: Which is the whitepaper?

Giri: It's "event messages in WAVE", CTA may not be ready to distribute it, as it's work in progress.
.... Could CTA send to W3C under liaison?

Mark: I'll look into that.

<scribe> ACTION: Mark_Vickers to investigate sharing a CTA event message whitepaper with W3C

Giri: On the issue of DataCue API, we had a lot of good discussion. But what do we want to give as input to WICG?
.... We have a description of the emsg structure, and how HbbTV handles it.

<inserted> issue 2 - DataCue or a more specific API for DASH and emsg events

<kaz> Media Timed Events draft - section 4.1.1 DASH and ISO BMFF emsg events

Giri: We don't have to go into solutions, point to some potential approaches, let WICG handle it.
.... Other info needed?
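[The emsg structure referred to above is the DASH event message box from ISO/IEC 23009-1, described in section 4.1.1 of the draft. As a rough illustration of the fields a DataCue-style API would need to surface, here is a minimal, non-conforming parser sketch for the body of a version 0 emsg box; the function name and error handling are illustrative only:]

```python
import struct

def parse_emsg_v0(box_payload: bytes) -> dict:
    """Parse the payload of a version 0 'emsg' box (bytes after size/type).

    Field layout per ISO/IEC 23009-1: version(1) + flags(3),
    scheme_id_uri and value as null-terminated UTF-8 strings, then
    timescale, presentation_time_delta, event_duration, id (u32 each),
    followed by opaque message_data.
    """
    if box_payload[0] != 0:
        raise ValueError("only version 0 handled in this sketch")
    pos = 4  # skip version + flags

    def read_cstring(buf: bytes, start: int):
        end = buf.index(b"\x00", start)
        return buf[start:end].decode("utf-8"), end + 1

    scheme_id_uri, pos = read_cstring(box_payload, pos)
    value, pos = read_cstring(box_payload, pos)
    timescale, pt_delta, duration, event_id = struct.unpack_from(
        ">IIII", box_payload, pos)
    pos += 16
    return {
        "scheme_id_uri": scheme_id_uri,
        "value": value,
        "timescale": timescale,
        "presentation_time_delta": pt_delta,
        "event_duration": duration,
        "id": event_id,
        "message_data": box_payload[pos:],
    }
```

[The scheme_id_uri / value pair is what an application would use to decide whether it understands a given event; the platform itself treats message_data as opaque.]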

Chris: An agreed set of use cases, maybe the ones so far aren't enough.

Giri: We could close the last two issues.

Mark: I would recast this not as DataCue versus a more specific API, but rather: are there any use case or requirement gaps that could not be solved through DataCue?
.... I think it's OK to present emsg as a requirement, but we shouldn't come up with an alternative API here.

Giri: Looking at issue #6 (use cases) https://github.com/w3c/me-media-timed-events/issues/6
.... We'll be able to improve this when we have the DASH-IF or CTA documents
.... I was hoping the people working on the MPEG effort would be able to contribute to this.

<kaz> Media Timed Events draft - 2. Use cases

Giri: I wrote some text to contribute to section 2.2 - https://github.com/w3c/me-media-timed-events/pull/8
.... I mentioned rendering of social media feeds, banner ads, accessibility assets not addressed by current mechanisms (based on caption tracks)

<kaz> latest changes

Chris: TAG invited MPEG people to a call to discuss embedding of web resources. Will report back when that happens.

Giri: Maybe my additions should go to 2.3?
.... [discussion of where it best fits]
.... The metadata cue could contain a URL to be requested and rendered
.... The DASH-IF and CTA papers describe in-band timing information. I wonder if this is something that should be standardised, or remain proprietary?

Andreas: A question about the use case for rendering captions. You mentioned large print rendering of captions. Do you see DataCue as a way to render captions, apart from WebVTT?

Giri: You'd like to have the media player render the event. The emsg is a general data carriage mechanism. You'd define a protocol within emsg to make that occur. That's how I'd interpret it.
.... With a TTML or WebVTT track, the TextTrackCues identify themselves as captions. With emsg, that would have to be defined on top.
.... The reason I mentioned large print captions: if TTML isn't suitable for that, an emsg could refer to a large print resource.
.... If anyone would like to contribute to improve the use cases, please do!
.... If I can get the DASH-IF presentation for next time, we can cover their use cases.
.... I have a PR for subtitle timing accuracy. https://github.com/w3c/me-media-timed-events/pull/9
.... I actually derived this from the BBC Subtitle Guidelines http://bbc.github.io/subtitle-guidelines/
.... Is this a suitable requirement?

<tidoust> scribenick: tidoust

Chris: I think these guidelines are for content authoring. The information that you pulled out is more to do with presentation, e.g. subtitles should not anticipate speech by more than 1.5 seconds.
.... This document contains useful links to EBU-TT-D, where Annex E talks about when things are rendered in the media timeline, and how timing should be preserved in the face of frame rate adjustments. Rendering is expected to happen "as close as possible" to the authored timestamps.

<kaz> EBU-TT-D file

Chris: The implication there is that when you're authoring the captions, you can do things precisely, with the player being able to play things in sync.
.... This suggests a more stringent timing requirement than what you define in the pull request. I'm actually wondering about linking to EBU-TT-D.

<cpn> https://tech.ebu.ch/files/live/sites/tech/files/shared/tech/tech3380.pdf

Andreas: The question in the thread opened by Francois is whether we need frame accuracy for subtitle rendering. This has been discussed for HbbTV as well, for example. I wonder how hard this requirement is.

<cpn> https://github.com/w3c/media-and-entertainment/issues/4

Andreas: How do we judge this requirement on actual TV screens today?

Chris: My feeling is that it is not so stringent but I'm not an expert there.

Andreas: We have made some tests for DVB. We also discussed with our editorial partners in Germany, Austria, and Switzerland, and they did not report frame-accuracy requirements.
.... This may differ per country, but I do not see that people have been chasing TV manufacturers about accuracy of captioning / teletext rendering.

Giri: We can say that it is just a loose requirement. There are probably other events that require stricter synchronization.

Chris: Referring to the frame accuracy discussion on GitHub, the user @daiz discusses avoiding overlap of scene changes with subtitles. He wants to be able to align subtitles on scene changes, which suggests frame accuracy is needed.

<cpn> https://github.com/w3c/media-and-entertainment/issues/4#issuecomment-396762643

Mark: I think there is an important distinction here. Traditionally, even for analog TV, things came in-band, so fairly accurate. How long the device takes to process that is indeed a requirement.
.... I think we need to preserve timing from the signal to the application as accurately as possible, but how long the application takes to display it is a separate issue.

Kaz: I personally agree with Mark. On the other hand, it's a bank holiday in Japan today, and we don't have experts from Japan. So we might want to ask them as well for opinions/feedback later.

Andreas: Thanks for the distinction, Mark. Nonetheless, I think it makes sense to check how accurate current devices are. Our experience shows that they are not frame-accurate at all. I'm not saying that's good, but we can take this as an indication that there may not be a strong incentive to get that frame accuracy.

Giri: This is something we struggled with in ATSC as well. The device may make the events available accurately, but the application may take time to process it.
.... One workaround is to trigger the event earlier, and let the application do fine adjustments.
.... Another workaround is allowing the application to react immediately to the event in a synchronous manner.
.... In general, it's not an easy problem to solve. It's more difficult when the signal is received by a set-top box and distributed to various devices on the home network.
.... I think I'm kind of agreeing with Andreas that if you can avoid a frame accuracy requirement, then that would make things easier.
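[The early-trigger workaround Giri describes can be sketched as follows: the platform fires each event to the application some lead time ahead of its media timestamp, and the application waits out the remainder against the media clock before acting. The names and the lead value are illustrative, not from the minutes:]

```python
def dispatch_schedule(events, lead_s=0.5):
    """Plan early dispatch of media timed events.

    `events` is a list of (media_time_s, payload) pairs. Returns
    (fire_time_s, remaining_delay_s, payload) triples: the platform
    fires each event `lead_s` early, and the application sleeps the
    remaining delay against the media clock for fine adjustment.
    """
    out = []
    for t, payload in sorted(events):
        fire = max(0.0, t - lead_s)  # cannot fire before the media starts
        out.append((fire, t - fire, payload))
    return out
```

[The trade-off is that a larger lead gives slow applications more headroom to process the event, at the cost of the application having to do its own final alignment against the media clock.]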

Mark: If it's the application that displays the captions, then that's not really an API issue. The scope of the API is only to send the events in a very timely manner.

Giri: One thing to discuss is the ability to send data to an application in as close to a synchronous manner as possible. That would serve as input to WICG to design a solution. I'll add that to the use cases.

<MarkVickers> FCC caption timing requirements: "(ii) Synchronicity. Captioning shall coincide with the corresponding spoken words and sounds to the greatest extent possible, given the type of the programming. Captions shall begin to appear at the time that the corresponding speech or sounds begin and end approximately when the speech or sounds end. Captions shall be displayed on the screen at a speed that permits them to be read by viewers." https://www.ecfr.gov/c[CUT]

<MarkVickers> https://www.ecfr.gov/cgi-bin/text-idx?SID=72eb5a624e8dc043293819a5663dff41&node=47:4.0.1.1.6.1.1.1&rgn=div8=47

<MarkVickers> Please note I just did a quick search for FCC requirements. There may be more numerical requirements elsewhere...

MarkVickers: I just pasted FCC requirements. They don't include numbers.

Next call

<Zakim> kaz, you wanted to confirm the next call will be held on August 20th

Kaz: Next call will be August 20th, right?

Giri: Yes, it is.

<kaz> [adjourned]

Summary of Action Items
[NEW] ACTION: Mark_Vickers to investigate sharing a CTA event message whitepaper with W3C
 
Summary of Resolutions
[End of minutes]
Minutes formatted by David Booth's scribe.perl version 1.147 (CVS log)
$Date: 2018/07/16 21:41:08 $
Received on Tuesday, 17 July 2018 09:31:21 UTC
