Summary of Media & Entertainment IG meeting at TPAC 2019

Dear all,

The minutes from the Media & Entertainment Interest Group meeting during TPAC 2019 are available [1].

Here's a summary of the main outcomes from the IG meeting and some of the media-related breakout sessions at TPAC.

# Hybridcast

Ikeo-san (NHK) presented an update on Hybridcast, focusing on the control API [2] and media timed events. They are working on adding security through the HTTPS in Local Network CG, and are also interested in the Open Screen Protocol being developed by the Second Screen CG, which includes a security layer. Ikeo-san also showed use cases for media timed events in emergency notifications.

NHK are developing a Web of Things Thing Description [3] for the API, and showed a demo of this during the breaks. This is something for the M&E IG to follow up with the Web of Things Interest Group.

# Media Timed Events Task Force

Chris Needham (BBC) presented a report from the Media Timed Events Task Force, which has published a v1 draft use case and requirements document [4].

Some feedback from the meeting was to be explicit on the scope (e.g., in relation to broadcast emergency events), and to make more specific recommendations. The Task Force will review the document based on this feedback.

There was a breakout session "DataCue and Time Marches on in HTML" that continued the discussion on the DataCue API and synchronised event triggering [5]. There, it was suggested to add a note to the HTML spec to clarify that "time marches on" is expected to run whenever the media playback position changes, and to develop test cases for the triggering of cue enter and exit events. The breakout also discussed the pros and cons of having cue types parsed by the user agent or by the web application, e.g., user agents could parse well-known cue types while handing others to the application.
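As an illustration of the kind of expectation such test cases could check, here is a minimal sketch of which cues would enter or exit when the playback position changes. It is a simplified model, not the HTML "time marches on" algorithm itself: the `Cue` interface and the `cueTransitions` function are hypothetical names, and missed cues (cues that start and end between two positions) are deliberately ignored.

```typescript
// Hypothetical, simplified model of a cue: only its identity and time range.
interface Cue {
  id: string;
  startTime: number; // seconds
  endTime: number;   // seconds
}

// A cue is treated as active when startTime <= position < endTime.
function isActive(cue: Cue, position: number): boolean {
  return cue.startTime <= position && position < cue.endTime;
}

// Compare the active state of each cue at the previous and current playback
// positions, and report which cues would get an enter or exit event.
function cueTransitions(
  cues: Cue[],
  previous: number,
  current: number
): { entered: string[]; exited: string[] } {
  const entered: string[] = [];
  const exited: string[] = [];
  for (const cue of cues) {
    const was = isActive(cue, previous);
    const now = isActive(cue, current);
    if (!was && now) entered.push(cue.id);
    if (was && !now) exited.push(cue.id);
  }
  return { entered, exited };
}

const exampleCues: Cue[] = [
  { id: "a", startTime: 1, endTime: 3 },
  { id: "b", startTime: 2, endTime: 5 },
];
const firstStep = cueTransitions(exampleCues, 0, 2.5);
const secondStep = cueTransitions(exampleCues, 2.5, 4);
```

In this sketch, moving from 0s to 2.5s reports both cues as entered, and moving on to 4s reports cue "a" as exited. A real test would instead observe the enter and exit events fired by the user agent.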

When the Task Force has updated the use case and requirements document, it will circulate it to the IG.

In a joint meeting between the Timed Text WG and Media WG, Cyril presented an architecture for media-timed event processing, including both DataCue and the Generic Text Track Cue [6]. The conclusion of the joint meeting was that the M&E IG should look at the architecture aspects, with DataCue in WICG for in-band and application-generated events, and a new WICG project for the Generic Text Track Cue.

I propose that the Media Timed Events Task Force remains open to continue this work.

# CTA WAVE

John Riviello (Comcast) gave an update on the CTA WAVE project.

The WAVE Content Specification for 2019 is adding CMAF encoding for DASH-HLS interoperability. John showed the DASH-IF validator, which can test content against WAVE requirements.

The WAVE Web Media API Snapshot is jointly published by CTA and W3C as a CG report [7]. It focuses on the four most widely adopted user agent code bases: Chromium, Edge (to be replaced by Chromium Edge), Firefox, and WebKit. The 2019 update to this document removes CSS Profiles, which includes a TV profile.

One item mentioned was possible standardisation of Type 1 playback (i.e., browser-native MPEG-DASH and HLS players). If there's sufficient interest, the IG would be happy to host further discussion.

# Frame accurate seeking and rendering

Francois Daoust (W3C) gave a summary of some of the most active of the M&E IG's GitHub issues [8, 9].

Use of rational numbers for timestamps would be useful, to help identify and address specific video frames. A solution could be to work with ECMA TC39 to define a rational type in JavaScript. Also discussed were gapless dynamic content insertion, and the suggestion to consider the seeking and rendering aspects separately from each other.
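To illustrate why rational timestamps help with frame addressing, here is a minimal sketch using plain integer pairs. The `Rational` type and the `rational`, `frameTime` and `frameAt` functions are hypothetical helpers written for this example; they are not part of any existing or proposed API. A frame rate such as 30000/1001 fps cannot in general be represented exactly as floating point seconds, but a numerator/denominator pair maps each frame to an exact time and back.

```typescript
// Hypothetical rational value: num / den, stored as integers in lowest terms.
interface Rational {
  num: number;
  den: number;
}

function gcd(a: number, b: number): number {
  while (b !== 0) {
    [a, b] = [b, a % b];
  }
  return Math.abs(a);
}

// Build a normalised rational (lowest terms, positive denominator).
function rational(num: number, den: number): Rational {
  if (den === 0) throw new RangeError("denominator must be non-zero");
  const g = gcd(num, den);
  const sign = den < 0 ? -1 : 1;
  return { num: (sign * num) / g, den: (sign * den) / g };
}

// Exact presentation time, in seconds, of the given frame index
// for a frame rate of rate.num / rate.den frames per second.
function frameTime(frame: number, rate: Rational): Rational {
  return rational(frame * rate.den, rate.num);
}

// Index of the frame shown at time t: floor(t * rate).
// Assumes the integer products stay below Number.MAX_SAFE_INTEGER.
function frameAt(t: Rational, rate: Rational): number {
  return Math.floor((t.num * rate.num) / (t.den * rate.den));
}

const ntsc = rational(30000, 1001); // 29.97 fps
const t = frameTime(1001, ntsc);    // exact start time of frame 1001
const frame = frameAt(t, ntsc);     // maps back to the same frame index
```

With this representation, the start time of every frame round-trips to its own frame index, which is the property needed for frame accurate seeking. A built-in rational type, as discussed, would avoid the safe-integer limit assumed here.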

Discussion naturally led to the next topic:

# Professional media workflows on the web

Pierre Lemieux (Sandflow) presented an introduction to cloud-based media production workflows [10], and explained how the browser is becoming the platform for editing and producing media content, highlighting several gaps in the web platform: sample accurate playback, HDR and wide gamut colour support, and improved subtitle and caption support.

Following this and the previous discussion on frame accurate seeking and rendering, the participants agreed that it would be useful to start an activity to explore this area and document use cases and API gaps.

The IG co-chairs propose creating a document that describes the API gaps on the web platform for media production use cases.

# 360 Video

Samira Hirji (Microsoft) asked for input from content providers on experiences producing 360 video, and discussed the need for standardisation, e.g., of extensions to the HTML <video> element to support projection data. We followed up this discussion in the "Standardizing 360 Video" breakout [11, 12].

There was agreement on the need for 360 video support, e.g., through a dedicated layer type in WebXR that can accept a video. There wasn't a consensus expressed around adding such extensions to the <video> element. While native support would require effort from multiple browser vendors (and existing support is still painful for web developers), a good middle ground could be to provide a custom 360 video element, as a way to collect feedback and inform a future standard. Mozilla has already created a custom element [13] that we could potentially improve upon as we learn more (for example, to add subtitle support).

To follow up, there is now an open GitHub issue in the Immersive Web CG proposals repo [14]. Samira is interested in hearing feedback from providers of 360 video content, to make sure that the end-to-end process is understood. This could include what hardware and software they use to create and stitch the videos, the reasons behind any of their decisions or choices, and any pain points or concerns specific to spherical video. Please add comments to the GitHub issue [14].

Andreas Tai (IRT) raised the topic of captioning for 360 video and VR experiences, and the difficulty of selecting the right venue and bringing the right people together to progress this topic. The main requirement is to have subtitles that are always in the field of view. Features are needed in TTML to allow content authors to indicate where in 3D space the audio source is coming from.

Josh O'Connor (W3C) referred the group to the Web Accessibility Initiative's accessibility user requirements for XR [15]. A productive discussion was had in the "Standardizing 360 Video" breakout [12], where one recommendation was to implement a library to help inform the need for web API standards. It was also noted that WebXR is developing a DOM overlay API that could be used for caption rendering. An Immersive Captions CG has now been created [16], which is also discussing this issue.

# Bullet chatting

Presentations from Song Xu (China Mobile) and Michael Li (DWANGO) introduced the bullet chatting / danmaku video comment overlay use case. The topic was discussed further in a breakout session [17] and in a joint meeting between the Chinese IG and Timed Text Working Group, which concluded that most of the required capabilities are supported in TTML2, and that further gap analysis is required [18].

Since TPAC, a Bullet Chatting CG has been created [19]. The IG is interested in working with the CG on the gap analysis.

# Second Screen CG/WG joint meeting

Mark Foltz (Google) gave an introduction to the Presentation API [20], Remote Playback API [21], and Open Screen Protocol [22].

An interesting use case being considered for a future iteration is enabling web applications to generate their own media and present it to a connected display, e.g., for gaming.

We discussed synchronization of content playback on a second screen, and how the pairing and authentication parts of Open Screen Protocol could be more widely useful (e.g., for the Hybridcast TV control API).

The Second Screen CG is not currently looking into playback of EME protected content via the Remote Playback API and Open Screen Protocol. This is a topic where IG members could voice their support if this capability is needed, and help with the development of the protocol.

Feedback to the Second Screen CG is welcome on new requirements and use cases, and on the Open Screen Protocol's extension mechanism.

# Timed Text Working Group joint meeting

In addition to the discussion of captioning in 360 video mentioned above, at the joint meeting with the Timed Text Working Group, Gary Katsevman (Brightcove) gave a progress update on WebVTT [23], which is progressing towards REC status, although some features are at risk.

Nigel Megitt (BBC) raised the question of whether MSE should support timed text in addition to audio and video, a topic to follow up with the Media WG.

# Breakout sessions

There were a good number of media-related breakout sessions during the unconference day. You can find notes from all the breakout sessions at [24], and a few are highlighted below:

* Efficient audio/video processing
  https://www.w3.org/2019/09/18-mediaprocessing-minutes.html

* Next generation TextTrackCue
  https://www.w3.org/2019/09/18-textcueapi-minutes.html

* Web Codecs
  https://www.w3.org/2019/09/18-webcodecs-minutes.html

* Image formats
  https://www.w3.org/2019/09/18-images-minutes.html

* HTML 3D Element & Native glTF
  https://www.w3.org/2019/09/18-html-3d-minutes.html

* WebGPU
  https://www.w3.org/2019/09/18-webgpu-minutes.html

Finally, a reminder that our next conference call is planned for November 5th. I'll share the agenda next week.

Kind regards,

Chris (for the IG co-chairs)

[1] https://www.w3.org/2019/09/16-me-minutes.html
[2] https://www.w3.org/2011/webtv/wiki/images/d/d3/RecentAchievementHybridcast_TPAC20190916.pdf
[3] https://w3c.github.io/wot-thing-description/
[4] https://w3c.github.io/me-media-timed-events/
[5] https://www.w3.org/2019/09/18-datacue-minutes.html
[6] https://www.w3.org/2019/09/19-mediawg-minutes.html#item01
[7] https://w3c.github.io/webmediaapi/
[8] https://github.com/w3c/media-and-entertainment/issues/4
[9] https://www.w3.org/2019/Talks/TPAC/frame-accurate-sync/
[10] https://www.w3.org/2011/webtv/wiki/images/8/88/M%2Be-web%2Bpro-workflows.pdf
[11] https://onedrive.live.com/view.aspx?resid=8491594C071C674!22022&ithint=file%2cpptx&authkey=!AGNUMyEwLTNLRgs
[12] https://www.w3.org/2019/09/18-360video-minutes.html
[13] https://blog.mozvr.com/custom-elements-for-the-immersive-web/
[14] https://github.com/immersive-web/proposals/issues/55
[15] https://www.w3.org/WAI/APA/wiki/Media_in_XR
[16] https://www.w3.org/community/immersive-captions/
[17] https://www.w3.org/2019/09/18-bulletchat-minutes.html
[18] https://www.w3.org/2019/09/20-tt-minutes.html#x32
[19] https://www.w3.org/community/bullet-chatting/
[20] https://w3c.github.io/presentation-api/
[21] https://w3c.github.io/remote-playback/
[22] https://webscreens.github.io/openscreenprotocol/
[23] https://www.w3.org/TR/webvtt1/
[24] https://w3c.github.io/tpac-breakouts/sessions.html

Received on Tuesday, 22 October 2019 16:02:20 UTC