[me] minutes - 1 February 2018

available at:
  https://www.w3.org/2018/02/01-me-minutes.html

also as text below.

Thanks a lot for chairing the call and taking these minutes, Chris!

Kazuyuki

---

   [1]W3C

      [1] http://www.w3.org/

                               - DRAFT -

                 Media & Entertainment IG monthly call

01 Feb 2018

   [2]Agenda

      [2] https://lists.w3.org/Archives/Member/member-web-and-tv/2018Feb/0000.html

Attendees

   Present
          Kazuhiro_Hoya, Geun_Hyung_Kim, Steve_Morris,
          Giri_Mandayam, Mohammed_Dadas, David_Evans, Will_Law,
          Tatsuya_Igarashi, Francois_Daoust, Paul_Jessop,
          Kazuyuki_Ashimura, Chris_O'Brien, Colin_Meerveld,
          George_Sarosi, Chris_Needham, Louay_Bassbouss

   Regrets

   Chair
          Chris

   Scribe
          Chris, Kaz

Contents

     * [3]Topics
         1. [4]DASH Eventing and HTML5
     * [5]Summary of Action Items
     * [6]Summary of Resolutions
     __________________________________________________________

   <cpn> Scribe: Chris

   <cpn> Scribenick: cpn

DASH Eventing and HTML5

   <kaz>
   [7]https://www.w3.org/2011/webtv/wiki/images/a/a5/DASH_Eventing
   _and_HTML5.pdf Giri's slides (Member-only)

      [7] https://www.w3.org/2011/webtv/wiki/images/a/a5/DASH_Eventing_and_HTML5.pdf

   <kaz> [Introduction]

   Giri: This is a brief intro to ongoing work in MPEG, and what
   we've done in ATSC
   ... There are 2 types of events we deal with in DASH
   ... DASH is adaptive streaming over HTTP, designed to leverage
   HTTP for streaming media, live or on-demand
   ... Media Source Extensions and Encrypted Media Extensions, as
   well as the audio and video media tags deal with this
   ... Interactivity events, absolute or relative time
   ... DASH defines two ways to deliver events: in the MPD, the
   manifest XML file that describes the segments in the streaming
   service
   ... Then there are in-band events, in an emsg box in the ISO
   BMFF media track
   ... ISO BMFF is a packaging format defined by MPEG, the most
   popular format of DASH packaging.
   ... There are other forms, WebM being popular also
   ... Issue with synchronisation, media playback should be
   handled by the native media player
   ... There are two things needing synchronisation: the media
   player and the web page in the browser context
   ... emsg eventing is a more dire situation, not supported by
   browsers
   ... in the MSE byte stream format registry, there's no
   requirement for a browser implementation
   ... only custom browsers deal with emsg data, not mainstream
   browsers
   ... this was problematic in designing ATSC
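
   As an illustration of the in-band path described above, here is
   a minimal sketch of a parser for a version-0 emsg box, following
   the field layout in ISO/IEC 23009-1. The scheme URI and payload
   in the synthetic box at the end are made up for illustration.

```python
# Sketch: parsing a version-0 DASH 'emsg' box (ISO/IEC 23009-1) from raw bytes.
import struct

def parse_emsg(box: bytes) -> dict:
    """Parse one version-0 emsg box and return its fields."""
    size, box_type = struct.unpack(">I4s", box[:8])
    assert box_type == b"emsg", "not an emsg box"
    assert box[8] == 0, "only version 0 handled in this sketch"
    pos = 12  # skip size, type, version, flags
    scheme_end = box.index(b"\x00", pos)
    scheme_id_uri = box[pos:scheme_end].decode()
    pos = scheme_end + 1
    value_end = box.index(b"\x00", pos)
    value = box[pos:value_end].decode()
    pos = value_end + 1
    timescale, delta, duration, event_id = struct.unpack(">IIII", box[pos:pos + 16])
    return {
        "scheme_id_uri": scheme_id_uri,
        "value": value,
        "time": delta / timescale,          # presentation time offset, seconds
        "duration": duration / timescale,   # event duration, seconds
        "id": event_id,
        "message_data": box[pos + 16:size],
    }

# Build a tiny synthetic emsg box (hypothetical scheme/payload) to exercise it.
payload = (b"urn:example:2018\x00" + b"1\x00"
           + struct.pack(">IIII", 1000, 2500, 500, 7) + b"hi")
emsg = struct.pack(">I4s", 12 + len(payload), b"emsg") + b"\x00\x00\x00\x00" + payload
```

   Running `parse_emsg(emsg)` recovers the event timing and the raw
   binary payload that would otherwise be dropped by browsers that
   ignore emsg.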

   <kaz> [How does HTML5 handle DASH events today?]

   Giri: this is just my opinion, not authoritative
   ... HTML has TextTrackCue with an identifier, text string,
   start and end time, and a payload
   ... There can be metadata cues
   ... If you have a DASH event on the transmitter side, this
   could transcode in-band events into text track cues, and
   present them in the text track
   ... Here's an example from the WebVTT spec
   ... There's a separation between cues to be handled by the
   player and those to be handled by the application
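
   The separation Giri describes could be sketched as a dispatch on
   the event's scheme. The application scheme URI below is a
   hypothetical placeholder; urn:mpeg:dash:event:2012 is DASH's
   MPD-validity-expiration scheme, which a player would typically
   consume itself rather than surface to the app.

```python
# Sketch: routing events to the player or the application by scheme_id_uri.
# "urn:example:ads:2018" is a made-up app-level scheme for illustration.
PLAYER_SCHEMES = {"urn:mpeg:dash:event:2012"}  # consumed internally (MPD refresh)

def route_event(scheme_id_uri: str) -> str:
    """Decide which layer should consume an event with this scheme."""
    return "player" if scheme_id_uri in PLAYER_SCHEMES else "application"

def dispatch(event: dict, on_app_event) -> None:
    """Hand application-level events to the app; let the player keep the rest."""
    if route_event(event["scheme_id_uri"]) == "application":
        on_app_event(event)
```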

   <kaz> [HTML5 Handling of Text Track Cues]

   Giri: In HTML5, the video track allows for track specific event
   handlers, oncuechange event
   ... There was a proposal for DataCues with binary payloads
   ... Browser vendor support is non-existent AFAICT
   ... There's a 4 year old discussion on the Chromium mailing
   list
   ... HbbTV has also identified problems with short duration
   cues, where cues may expire before the UA could handle them
   ... There's a specific problem in ATSC where we try to minimise
   channel acquisition
   ... i.e., start playback as quickly as possible on channel
   change
   ... There's a danger with mid-cues if delays are introduced
   ... If the user just acquires a channel, cues may be missed
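
   The short-duration-cue problem Giri raises can be sketched with
   some illustrative figures (not taken from the minutes): a cue
   whose end time passes before the UA's handling latency elapses
   can never be acted on by the app.

```python
# Sketch: why short-duration cues can expire before the app sees them.
# Times are in seconds; the values here are illustrative only.

def actionable_cues(cues, now, handler_latency):
    """Cues whose end time is still ahead once handler latency is paid."""
    return [c for c in cues if c["end"] > now + handler_latency]

# A viewer acquires the channel at t=10s; the UA takes ~500 ms to surface cues.
cues = [
    {"id": 1, "start": 10.0, "end": 10.2},  # 200 ms cue: expires too soon
    {"id": 2, "start": 12.0, "end": 15.0},  # longer cue: still actionable
]
still_usable = actionable_cues(cues, now=10.0, handler_latency=0.5)
```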

   <kaz> [ATSC 3.0 Approach]

   Giri: ATSC 3.0 defined two playback models: the application
   media player (AMP) and the receiver media player (RMP)
   ... AMP is a standard HTML/JS app, such as dash.js
   ... This is suitable for certain kinds of devices, without an
   integrated receiver, taking advantage of a standard browser
   context
   ... Then there's the RMP. This is colocated with the AMP, and
   rendering takes place in conjunction with the receiver.
   ... Control of the RMP is done over WebSockets
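
   Assuming a JSON-RPC 2.0 style interface over the WebSocket (the
   method name and parameters below are hypothetical placeholders,
   not taken from the ATSC specs), shaping such a control request
   might look like:

```python
# Sketch: building a JSON-RPC 2.0 request such as an app might send to the
# RMP over a WebSocket. Method name and params are hypothetical.
import json

def make_rpc_request(method: str, params: dict, request_id: int) -> str:
    """Serialise a JSON-RPC 2.0 request as a WebSocket text-frame payload."""
    return json.dumps({
        "jsonrpc": "2.0",
        "method": method,
        "params": params,
        "id": request_id,
    })

request = make_rpc_request("rmp.subscribeEvents",
                           {"schemeIdUri": "urn:example:events:2018"}, 1)
```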

   <kaz> [ATSC 3.0 Event Handling]

   Giri: As far as event handling is concerned, the AMP runs in
   the browser context, although emsg isn't supported in most
   browsers
   ... This is a problem for the AMP. For the RMP, as it's
   integrated, there's room for customisation
   ... The RMP can convey event data to the app over WebSockets
   ... Both methods have latency in event handling
   ... We don't see perfect solutions here in ATSC

   <kaz> [Event Retrieval in ATSC 3.0]

   Giri: This diagram is from ATSC. It's not synchronous. We
   discussed having event subscription
   ... We believe this is HTML5 compatible, even though we're not
   using the HTML video tag

   <kaz> [Emerging Approach]

   Giri: To address some of these issues, MPEG has started work on
   carriage of web resources via ISO BMFF
   ... It's a joint proposal from Cyril Concolato at Netflix and
   Thomas Stockhammer
   ... It allows for direct rendering, so not dependent on the
   application. This could take care of some of the perf issues I
   mentioned
   ... We can't force a broadcaster to write an app per service,
   can be done by the content author
   ... It's work in progress

   <kaz> [Conclusion]

   Giri: If the player has an integrated runtime, it's possible
   to deal with the events directly
   ... MPEG is considering approaches
   ... That completes my overview

   Igarashi: Thank you Giri for the presentation
   ... You mentioned discussion with browser vendors, what is the
   issue there, why don't they support event cues?

   Giri: It's the emsg that isn't supported. We're considering it
   for broadcast media, and I guess they are thinking more about
   online media
   ... emsg was also controversial in MPEG, not too many
   proponents
   ... not popular from a content author's point of view

   Will: emsg is gaining prominence through its adoption in CMAF
   ... We have a strong preference for a usable emsg
   implementation in browsers
   ... The SourceBuffer is the appropriate place to extract the
   data
   ... We've started a discussion with Google, Microsoft, and
   Apple on this

   Giri: I fully expect CTA WAVE to be involved in this. It would
   be great if we can get a report from them on preferred
   approaches

   Igarashi: It's good news that CTA WAVE is considering how to
   handle emsg in HTML5
   ... Does the HTML cue API need changes to support emsg, or is
   it just an implementation issue?

   Will: emsg can hold binary payloads and TextTrack cues are
   text, so you'd need to encode, e.g. with base64, so we need a
   way to expose arbitrary binary payloads
   ... Is there broader interest from the M&E IG in emsg events,
   and what's the preferred method to deliver events to the JS
   layer?
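
   The base64 workaround Will mentions can be sketched as a simple
   round trip: the binary emsg payload is encoded into cue text on
   the way in and decoded back to bytes by the application.

```python
# Sketch: round-tripping a binary emsg payload through a text-only cue
# via base64, as mentioned in the discussion.
import base64

def payload_to_cue_text(payload: bytes) -> str:
    """Encode an arbitrary binary payload as ASCII text for a cue."""
    return base64.b64encode(payload).decode("ascii")

def cue_text_to_payload(text: str) -> bytes:
    """Recover the original binary payload in the application."""
    return base64.b64decode(text)

original = bytes(range(8))  # arbitrary binary data, not valid text
cue_text = payload_to_cue_text(original)
```

   The obvious cost is the ~33% size overhead of base64 and the
   extra decode step, which is why a native binary cue (DataCue)
   keeps coming up.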

   Giri: We don't really have a way to handle typed data with
   TextTrack cues
   ... With broadacst media, we worry about exploding with track
   data,
   ... e.g, the init segment has to be frequently repeated so that
   players can start playing quickly

   Will: Mark Vickers, who's in CTA WAVE, was involved in the
   DataCue work. Can DataCue be revitalised?

   Francois: You mentioned synchronisation needs with event
   handling. Right now in HTML5, the timeline for media playback
   isn't necessarily the one that the app sees
   ... What are the synchronisation requirements there?
   ... What kinds of cues are used in practice? What are some good
   examples needing precise sync?

   Giri: In smart TVs, we're doing more app logic for
   personalisation, e.g., ad campaigns. We want to customise to the
   device owner, the consumer.
   ... This means that client logic is needed, and ad media needs
   to be available and ready when the cue appears
   ... If there's uncertainty about how the UA surfaces event
   data, and as the time references aren't perfectly aligned,
   there may be issues with the actual insertion
   ... This was also a problem in TextTrack cues with several
   hundred milliseconds latency, you could miss an ad-insertion
   cue and get dead air. This is something TV manufacturers and
   broadcasters want to avoid.

   Francois: I have another question about binary data. TextTrack
   cues don't support this, and DataCues aren't implemented. What
   is binary data used for?

   Giri: It's for any other data that needs to be time-aligned,
   that's typed, e.g., JPEG images, or simple audio files that are
   related to the media being played
   ... Anything where you don't want to deal with the round trip
   time of requesting the resource, so you want it in-band.

   Igarashi: MPEG-DASH defines emsg as a generic format. If a
   service has a specific use, it may also choose to use emsg.
   ... In terms of frame-accurate eventing, as Francois said I
   don't see any specific requirement. Ad insertion won't be
   achieved at the app level, it's more at the system or rendering
   level.
   ... Some broadcasters may want to synchronise web content
   with the media, e.g., information about a soccer player during
   a game.
   ... I see these as rarer applications. Accuracy to only about
   100 ms is needed, not frame accuracy, for broadcast systems.

   Giri: The W3C specs don't guarantee 100 ms accuracy, something
   that HbbTV complained about.
   ... There are other issues than UA latency that result in
   missing cues. Hence the MPEG work, which should take some of
   the uncertainty out of processing the events.
   ... Frame accuracy isn't critical, but 500 ms isn't good
   either.

   Igarashi: I think 300 ms is enough in most cases.

   Giri: In my time at ATSC, I haven't seen an accurate timeline
   specified, from the time the cue is introduced in the
   transmission infrastructure to when the client must complete
   its logic.
   ... That could be good for this group to do, no-one else is
   looking at this from an HTML5 point of view.

   Kaz: Would it make sense to invite CTA WAVE to give an update?

   <kaz> scribenick: kaz

   Chris: I have discussed that with Mark
   ... He said he'd prefer to wait until after NAB in April, so
   maybe for our monthly call in May?

   Kaz: tx for your clarification

   Chris: What should the next steps be in this interest group?

   <scribe> scribenick: cpn

   Will: The IG brings lots of real world use cases
   ... If we can specify emsg event handling, timing requirements,
   in addition to what's coming from CTA

   Igarashi: I agree, also how emsg are used for services
   ... We should discuss how emsg can be used for broadcast
   systems, other requirements

   <kaz> scribenick: kaz

   Chris: We have an unfortunate schedule overlap with TTWG, who
   also meet on Thursday afternoons
   ... This topic is clearly in their area of interest, so I want
   to discuss together with them.
   ... I know that TTWG have a general issue regarding TTML
   browser implementations, and a proposed solution is passing
   more responsibility to the app layer, with an extended
   TextTrack API.
   ... I'd like to move the time of this call to avoid the
   schedule overlap, so that we can share information with the
   TTWG people. But I'm not sure yet which slot to move to.
   ... It could be moved to a Tuesday or Wednesday at a similar
   time.
   ... I will try to identify a better slot based on people's
   availability.
   ... Also, we can gather the kinds of use cases and
   requirements around synchronization and timing.
   ... We could start a comparison on the wiki, etc.
   ... Maybe everything is covered by the CTA's work, but would
   like to see input from the wider Web community
   ... For example, during the breakout session at TPAC, there was
   mention of requirements for synchronising
   ... web content with audiobooks. This is another group we may
   contact to see if we cover all their requirements.

   <tidoust> [8]Synchronized Multimedia for Publications CG

      [8] https://www.w3.org/community/sync-media-pub/

   Chris: I can take an action item to do that.

   Kaz: Maybe we can start some work on gathering use cases and
   requirements on the wiki or GitHub?

   Chris: This would be useful, also with input from TTWG.
   ... But, it would be good to have an initial proposal for
   people to respond to.
   ... Also use cases coming from the media industry, as Igarashi
   mentioned.
   ... Unless any other points for today, we can adjourn the call.
   ... Thanks, Giri!
   ... And thank you to all for attending.
   ... As a reminder, it would be good to hear from you about
   topics for future calls. Please get in touch.

   [adjourned]

Summary of Action Items

Summary of Resolutions

   [End of minutes]
     __________________________________________________________


    Minutes formatted by David Booth's [9]scribe.perl version
    1.147 ([10]CVS log)
    $Date: 2018/02/02 10:21:34 $

      [9] http://dev.w3.org/cvsweb/~checkout~/2002/scribe/scribedoc.htm
     [10] http://dev.w3.org/cvsweb/2002/scribe/

Received on Friday, 2 February 2018 14:19:16 UTC