Re: [media-and-entertainment] Media Source Extensions

Please also see the FOMS 2018 notes on MSE discussions about vNext feature incubations; many focused on "low-latency" / "live" / "buffer management" / "GC interop improvements":

Session 1: http://www.foms-workshop.org/foms2018/pmwiki.php/Site/MSE
Session 2: http://www.foms-workshop.org/foms2018/pmwiki.php/Site/LL-MSE

My high-level summary:

**Media Source Extensions (MSE) (quite a lot covered; main points:)**
* Status update
  * MSE vNext:
    * WICG setup for incubations, tracking issues on main MSE github
    * changeType: first vNext incubation (Chrome, Firefox, Safari), YT using it for seamless AV1<->VP9 adaptations
  * REC MSE updates: upcoming deprecations
    * multiple tracks in a SourceBuffer whose append mode is set to ‘sequence’
    * possibly createObjectURL (though that’s unlikely to happen, since it would be a breaking change)
  * MediaError.message was still unknown to some
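As a rough illustration of the changeType incubation mentioned above, here is a minimal sketch of a feature-detected codec switch. `trySwitchCodec` and `SourceBufferLike` are hypothetical names used only for this example, and the support check is injected so the sketch does not depend on DOM typings; in a browser one would presumably pass `(t) => MediaSource.isTypeSupported(t)`.

```typescript
// Minimal structural type (an assumption for this sketch) so the code
// compiles without DOM typings. changeType is optional because not every
// implementation ships it.
interface SourceBufferLike {
  changeType?: (type: string) => void;
}

// Hypothetical helper: switches the SourceBuffer to `newType` if changeType
// exists and the type is supported. Returns true only if the switch was made.
function trySwitchCodec(
  sb: SourceBufferLike,
  newType: string,
  isTypeSupported: (type: string) => boolean,
): boolean {
  if (typeof sb.changeType !== "function") return false; // no changeType support
  if (!isTypeSupported(newType)) return false; // container/codec not playable
  sb.changeType(newType); // subsequent appends use the new type
  return true;
}
```

The caller would then append an initialization segment for the new codec. Calling changeType while the SourceBuffer is still updating is an error, so a real player would wait for the pending append to finish first.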

* Upcoming/in-progress work on MSE vNext:
  * MSE-in-workers, including mechanism for improving MSE setup latency (“sourceopen”)

* Open ended discussion focused heavily on MSE vNext improvements around:
  * “Live” “Low latency” “Gap-skipping” “buffer/GC mgmt” related features 
    * These dovetail with, but attendees didn’t want them tied directly to, the media element latencyHint/jitter buffer control API (we're looking to specify something like that soon instead of relying on implementation-specific liveness parsing heuristics)
    * A SourceBuffer API for specifying a GC mode (e.g. aggressive GC) could enable "infinite GOP" playback. Roughly:
      * "default" = the currently implemented, implementation-specific heuristic
      * "aggressive" = allow GC/eviction to occur throughout playback (not just at the synchronous appendBuffer() or remove() points) and allow eviction of *anything* prior to the current demuxer/playback head (including the current GOP's keyframe). Apps would be responsible for seeking responsibly (for example, being aware that seek-to-current-time in this mode might stall). This enables "infinite GOP" scenarios, with the associated reduction in network bandwidth jank, for streaming media containing just 1 keyframe followed by non-keyframes.
      * “LL-CMAF?” = a third kind of MSE GC mode that preemptively evicts while playing (not just at appendBuffer() or remove() times), but doesn't evict anything from the start of the current GOP forwards. Such a mode would support CMAF-low-latency-type players, which have frequent keyframes, and would help reduce memory pressure from already-played media (and also allow the element's decoders to be suspended in scenarios like background tabs, unlike the more "aggressive" mode mentioned above). Apple/Safari promoted this mode in particular.
    * A way to specify a cap on SourceBuffer resources (time or bytes TBD)
    * A SourceBuffer API (tbd) for specifying a gap fudge factor (e.g. all gaps < 100 ms must coalesce and not be reflected as gaps in SourceBuffer.buffered) could reduce interop issues.
    * MediaSource (or perhaps HTMLMediaElement) API (tbd) for modifying playback behavior across surviving gaps:
      * Default: v1 MSE behavior: stall at buffered range intersection gaps. Reflect those gaps in media element buffered ranges.
      * Play-through: play silence for missing audio and show most current video frame. Once client media clock reaches next audio (or video) frame, play that audio (or video). Essentially, try to preserve the media timeline (no auto-seek). If app seeks to a position in such a gap, allow it, and play silence/(no?) video frame until clock reaches end of gap. Reflect no gap in media element buffered ranges.
      * Fancy-seek: Hide gaps. If no audio and no video are available for the current media time, seek to the earliest of the next available buffered audio or video and resume playback from there. Open question: give the app some event indicating that this "fancy-seek" has occurred? Reflect no gap in media element buffered ranges (though gaps should be apparent in the intersection of activeSourceBuffers' buffered ranges). Seeking into such a gap (whether in one or both of A/V) should be allowed and not stall (and then auto-fancy-seek from there if the target had no A/V?)
      * Don't engage these modes automatically from the media element "low latency" hint API (tbd) because apps might want to *not* involve a jitter buffer, but still retain v1 MSE buffering behavior.
      * mediaElement.buffered should show what is expected w.r.t. playback stalls. sourceBuffer.buffered should show what is actually buffered (not hiding gaps)
  * Better debugging/information:
    * “What is the timestamp/resolution/codec/bitrate/profile/etc of what is playing right now?” (possibly via MediaPlaybackQuality? or MCAPI?)
    * “What is the earliest PTS of this GOP’s presentation interval?” (For apps to use to guide their own explicit version of something like the "LL-CMAF?" gc mode, above.)
    * “What media was actually removed by my call to SourceBuffer.remove()?”
    * “Can I haz appended media tags and a way to retrieve a timeline showing where those tags are?”
    * “Can I haz promises with MSE?”
    * Chrome: hot-link from devtools to specific media-internals player log
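To make the gap fudge factor and "fancy-seek" ideas above more concrete, here is a minimal sketch that models buffered ranges as plain arrays rather than TimeRanges objects. None of this is a specified API: `Range`, `coalesceGaps`, and `fancySeekTarget` are hypothetical names, and the 100 ms value in the usage note is just the example figure from the notes.

```typescript
// A buffered interval [start, end) in seconds. Arrays of ranges are assumed
// to be sorted and non-overlapping, as TimeRanges are.
type Range = { start: number; end: number };

// Merge neighbouring ranges whose separating gap is shorter than `fudge`
// seconds, so small gaps are not reported. The input is not mutated.
function coalesceGaps(ranges: Range[], fudge: number): Range[] {
  const out: Range[] = [];
  for (const r of ranges) {
    const last = out[out.length - 1];
    if (last !== undefined && r.start - last.end < fudge) {
      last.end = Math.max(last.end, r.end); // absorb the small gap
    } else {
      out.push({ ...r }); // copy so callers keep their original objects
    }
  }
  return out;
}

// "Fancy-seek" target: if neither audio nor video is buffered at `now`,
// return the earliest start of any later buffered audio or video range.
// Returns null when no seek is needed or nothing later is buffered.
function fancySeekTarget(audio: Range[], video: Range[], now: number): number | null {
  const covers = (rs: Range[]) => rs.some((r) => r.start <= now && now < r.end);
  if (covers(audio) || covers(video)) return null;
  const starts = [...audio, ...video].map((r) => r.start).filter((s) => s > now);
  return starts.length > 0 ? Math.min(...starts) : null;
}
```

For example, `coalesceGaps(ranges, 0.1)` would hide a 50 ms gap but keep a 500 ms one, and `fancySeekTarget(audio, video, t)` would only propose a jump when both tracks are missing at `t`.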

**Many of these already have associated MSE github issues. Work will be ramping up on specifying solutions to these.**

p.s. I wish I had been able to make it to TPAC this year to meet f2f with folks there, too, regarding MSE vNext feature ideas, proposals, and incubations. Please file issues or follow up on existing ones in the main MSE github (https://github.com/w3c/media-source/issues) to help us gain traction on reaching ergonomic APIs that improve usability and interoperability of MSE.

-- 
GitHub Notification of comment by wolenetz
Please view or discuss this issue at https://github.com/w3c/media-and-entertainment/issues/6#issuecomment-432860926 using your GitHub account

Received on Wednesday, 24 October 2018 23:22:22 UTC