[Minutes] Media WG Call - 2022-11-08

Hi all,

The minutes of the last Media WG call are available at:
https://www.w3.org/2022/11/08-mediawg-minutes.html

... and copied as raw text below. Some precise technical points around 
MSE append window behaviors went too fast for the scribe to both 
understand and capture. Feel free to complete as needed or raise 
comments on related GitHub issues!

Thanks,
Francois.

-----
Media WG
08 November 2022

    [2]Agenda. [3]IRC log.

       [2] 
https://github.com/w3c/media-wg/blob/main/meetings/2022-11-08-Media_Working_Group_Teleconference-agenda.md#agenda
       [3] https://www.w3.org/2022/11/08-mediawg-irc

Attendees

    Present
           Chris Needham, Eric Carlson, Francois Daoust, Frank
           Liberato, Greg Freedman, Jean-Yves Avenard, Jer Noble,
           Joey Parrish, Karl Tomlinson, Mark Watson, Matt
           Wolenetz, Sushanth Rajasankar

    Regrets
           -

    Chair
           Chris Needham

    Scribe
           cpn, tidoust

Contents

     1. [4]Agenda bashing
     2. [5]MSE Interoperability/implementation conformance issues
          1. [6]Gapless audio support feature detection
          2. [7]Relax timing constraint of initial HAVE_METADATA
             and later HAVE_CURRENT_DATA
          3. [8]API for interoperable gap playback/tolerance
          4. [9]Reflections on MSE in Worker and WG review process
     3. [10]Media Capabilities editors

Meeting minutes

   Agenda bashing

    cpn: Main topic is MSE Interoperability/implementation
    conformance issues, including perhaps reflections on changes
    that needed to be made to MSE in Workers
    … Also a topic on Media Capabilities towards the end of the call

   MSE Interoperability/implementation conformance issues

     Gapless audio support feature detection

    <ghurlbot> [11]Issue 37 Support sample accurate audio splicing
    using timestampOffset/appendWindowStart/appendWindowEnd
    (Melatonin64) feature request, needs author input,
    TPAC-2022-discussion

      [11] https://github.com/w3c/media-source/issues/37

    Matt_Wolenetz: When the coded frames of two audio fragments
    overlap in presentation time, I believe the spec describes a
    cross-fade. Chrome used to have that, but the widespread
    proliferation of incorrect timing and codec formats
    … meant that we now truncate.
    … We couldn't perceive any benefit with cross-fade.
    … Meanwhile, the Chromium implementation allows frame-accurate
    splicing.
    … In order to do that, you must not drop everything that was
    decoded from both audio frames.
    … And perform trimming.
    … The issue is whether we can standardize this feature:
    frame-accurate post-decode splicing of audio frames
    … And also whether we can add feature detection for whether this
    is supported by the browser

    Jean-Yves: In Firefox, the data will be trimmed once decoded,
    with more accuracy than the 1024-sample frame granularity.

    Matt_Wolenetz: You need to feed more than just the frame so
    that at the splice point, this is stable.
    … Yes, that is what I'm talking about.

    Jean-Yves: Multiple encodings use different flags. You end up
    with conflicts, e.g. with ADTS.
    … Is it something that will be within MSE, like a message to
    send, or contained within the binary format that we append?

    Matt_Wolenetz: In terms of specs, yes. In terms of the binary
    format, the decoded sequence needs to be appended in a
    well-formed sequence by the app, so that the splicing point
    gets clearly set.
    … The app needs to add enough of the B frames so that the
    splicing can happen smoothly.

    Jean-Yves: The way MSE is designed is that it's always dealing
    with compressed frames rather than decoded frames.

    Matt_Wolenetz: In Chromium, we check append calls for positions
    where splicing might happen, especially for audio.
    … And we keep track of them.
    … We decode across the splice point faster than real-time so
    that we can adjust things. That enables the frame-accurate
    gapless audio scenario.
    … It's been in Chromium for a long time.
    … We haven't had the cross-fade feature in Chromium as a
    result.
    … [more technical details not captured by scribe]
    … Not all implementations are required to have
    faster-than-real-time decoding capabilities, which may make it
    a concern for some devices and platforms.
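
    [A rough, illustrative sketch of the append-window pattern being
    described, using the existing appendWindowStart/appendWindowEnd/
    timestampOffset controls. The MIME type, priming duration and
    track length below are assumptions, not values from the call.]

      // Illustrative only: splice two audio tracks without a gap by
      // trimming encoder priming/padding with the append window.
      const video = document.querySelector('video') as HTMLVideoElement;
      const mediaSource = new MediaSource();
      video.src = URL.createObjectURL(mediaSource);

      mediaSource.addEventListener('sourceopen', () => {
        const sb = mediaSource.addSourceBuffer('audio/mp4; codecs="mp4a.40.2"');
        const primingSec = 1024 / 44100; // assumed priming: one AAC frame
        const trackDurationSec = 10.0;   // assumed audible length per track

        function appendTrack(bytes: BufferSource, startSec: number) {
          // Shift the segment so its first audible sample lands at startSec,
          // and let the append window drop the priming/padding samples.
          sb.timestampOffset = startSec - primingSec;
          sb.appendWindowEnd = startSec + trackDurationSec;
          sb.appendWindowStart = startSec;
          sb.appendBuffer(bytes);
        }
        // The next track is appended after 'updateend', with startSec
        // advanced by the previous track's trimmed duration.
      });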

    jernoble: Back to cross-fade, I don't think WebKit does it
    either. Firefox?

    Jean-Yves: I don't think there is cross-fading either.

    Matt_Wolenetz: The spec says: if you don't do cross-fade, you
    need to drop. We don't do either, since we do frame-accurate
    splicing.

    Jean-Yves: I have only seen frame-accurate splicing in demo
    apps, not real ones.

    Matt_Wolenetz: There used to be a Google gapless music app.

    jernoble: WebKit will mark windows and overlaps, with samples
    to throw away. That has led to some discontinuities.
    … I wonder if there's something that we can do in the case
    where the append is smaller than the codec window.
    … Even a microsecond gap is a gap. You'll hear a click when that
    happens.
    … We have gotten feedback from the hardware team, as some of the
    recent hardware is pretty sensitive about that.
    … That's what the cross-fade was intended to address.

    Matt_Wolenetz: You can have both frame-accurate and cross-fade.

    karlt: The normative text says you need to drop the whole block,
    and then a non-normative note says you can do something else.
    … It seems that all of the implementations keep blocks; perhaps
    that should be what the normative part of the spec says.

    Matt_Wolenetz: I agree.
    … We don't know whether TV devices keep or drop the frames.

    Jean-Yves: No implementation ever stops on gaps.

    cpn: So we've got consistent behavior. There's a question about
    TV devices. Is that something that we should take up with TV
    people?
    … That would provide input on the feature detection part of the
    feature.

    Jean-Yves: Should we consider doing that feature detection as
    part of Media Capabilities instead of adding a mechanism in MSE
    that currently does not exist?
    … Media Capabilities has a way to test if MSE supports a
    particular system.

    Matt_Wolenetz: Question on software decoder. Media Capabilities
    covers more than MSE. If MSE can answer the question for
    itself, it seems preferable to do it in MSE.
    … The whole collection of issues that we're discussing are
    priority 2. That's because they're hard to test, hard to find
    the right behavior, etc.
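
    [A minimal sketch of the two detection routes mentioned. The
    decodingInfo() call is the existing Media Capabilities API; the
    splice-accuracy fields in the comments are hypothetical and do
    not exist in any spec or browser today.]

      async function detectGaplessSupport(): Promise<boolean> {
        const info = await navigator.mediaCapabilities.decodingInfo({
          type: 'media-source',
          audio: {
            contentType: 'audio/mp4; codecs="mp4a.40.2"',
            channels: '2',
            bitrate: 128000,
            samplerate: 44100,
          },
        });
        // Hypothetical option A: a new field on the Media Capabilities
        // result, e.g. (info as any).spliceAccurate === true.
        // Hypothetical option B: a flag surfaced by MSE itself, e.g.
        // (MediaSource as any).canSpliceFrameAccurately?.('audio/mp4').
        return info.supported; // today this only says the config decodes
      }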

    jernoble: If the use case is just "gapless labeling", that
    doesn't seem like a super compelling use case to me.

    Mark_Watson: Gapless audio splicing is also useful in a number
    of other scenarios.
    … What is happening today is not ideal, but it's better to
    leave it as it is in order not to introduce differences in
    behavior.

    jernoble: Seems to match what Will Law has been asking for some
    time.
    … Making things observable without changing the behavior.

    Mark_Watson: It depends on whether we're talking about exposing
    more information about what browsers are already doing today. If
    so, then yes, that's useful, though I somewhat already know
    that; it's still better to know whether that behavior is
    guaranteed.
    … Or whether we're talking about exposing information about a
    behavior that changed in browsers.

    Matt_Wolenetz: My assumption was that Chromium was the only
    browser doing frame-accurate splicing, but that seems wrong. So
    feature detection was meant to detect new browsers supporting
    the feature, but that may not be needed anymore.
    … For lots of splicing scenarios, it depends on the codec and
    within the codecs whether you're at the start or in the middle
    of a frame.
    … It doesn't seem that it would be a breaking change if an
    implementation starts doing this.
    … but it might be useful to tell the app that a browser
    supports this feature upfront.
    … If there are multiple behaviors allowed by the spec, it's
    probably better to have a mechanism to detect the feature.

    cpn: It sounds like next step is around testing.

    Matt_Wolenetz: Yes, we seem to all be saying that we're doing
    some form of frame-accurate splicing.

     Relax timing constraint of initial HAVE_METADATA and later
     HAVE_CURRENT_DATA

    <ghurlbot> [12]Issue 275 Consider relaxing timing of initial
    HAVE_METADATA transitioning (wolenetz) agenda,
    TPAC-2022-discussion

      [12] https://github.com/w3c/media-source/issues/275

    <ghurlbot> [13]Issue 215 Spec is too rigid on requiring initial
    HAVE_CURRENT_DATA transition occur synchronously within coded
    frame processing (possibly ditto for HAVE_METADATA and init
    segment received processing) (wolenetz) interoperability,
    TPAC-2022-discussion

      [13] https://github.com/w3c/media-source/issues/215

    Matt_Wolenetz: In Chromium, we have a separate thread that
    handles decoding, buffering and so on.
    … To avoid blocking the main thread for some actions, we don't
    block the scheduling and delivery of events while we wait for
    readyState transitions.
    … In the segment parser loop, I think, you're supposed to wait
    until HAVE_METADATA. We don't do that in Chrome. I'd like to
    relax the constraints to avoid undue blocking of the main
    thread.

    jernoble: That seems reasonable to me.
    … Especially for workers.

    cpn: Last time we discussed, there was a question about
    gathering information on what different implementations
    actually do. Do we need that? It seems to me that if we're just
    relaxing the constraints, we can go ahead.

    Matt_Wolenetz: The question is whether it will bother
    applications if more implementations do what Chrome already
    does.
    … You may get duration information faster than ready
    information. That can surprise apps, but then Chrome has been
    doing that forever.
    … There have been some timing hiccups in the WPT tests.
    … It sounds like it may be something to propose in a PR.
    … If you can see any kind of regression that this may create,
    please raise it.
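
    [An illustrative sketch of the app-facing assumption this
    touches; the video element and SourceBuffer are placeholders
    from an assumed earlier MSE setup.]

      declare const sourceBuffer: SourceBuffer; // assumed existing buffer
      const media = document.querySelector('video') as HTMLVideoElement;

      sourceBuffer.addEventListener('updateend', () => {
        // Fragile: assumes the readyState transition happened
        // synchronously during coded frame processing. With the relaxed
        // timing (and in Chromium today), HAVE_METADATA may only be
        // reached slightly later.
        if (media.readyState >= HTMLMediaElement.HAVE_METADATA) {
          media.play();
        }
      });

      // More robust: react to the media element's own event, which fires
      // whenever the transition actually happens.
      media.addEventListener('loadedmetadata', () => {
        media.play();
      });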

    cpn: Checking in with some of the major player libraries might
    be useful.
    … So next step on this is a PR.

     API for interoperable gap playback/tolerance

    <ghurlbot> [14]Issue 160 Support playback through unbuffered
    ranges, and allow app to provide buffered gap tolerance
    (davemevans) feature request, TPAC-2022-discussion

      [14] https://github.com/w3c/media-source/issues/160

    Matt_Wolenetz: My time on MSE is limited right now.
    … As part of the HLS native implementation in Chrome that we're
    building on top of MSE concepts, we need to implement an
    internal API for interoperable gap playback/tolerance.
    … The problem right now is that we have some stalls happening
    based on various tolerance settings.

    jernoble: Some of the issues around gaps should be handled by
    modifying the output timeline, which is something we do for
    small gaps.
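
    [A rough sketch of the app-level gap jumping that player
    libraries do today, which an interoperable tolerance API could
    replace; the 0.1 s tolerance is an arbitrary assumption.]

      const GAP_TOLERANCE_SEC = 0.1; // assumed tolerance, app-defined today

      function jumpSmallGap(media: HTMLMediaElement): void {
        const t = media.currentTime;
        for (let i = 0; i < media.buffered.length; i++) {
          const start = media.buffered.start(i);
          // If the next buffered range begins just after the playhead,
          // treat it as a small gap and seek past its edge.
          if (start > t && start - t <= GAP_TOLERANCE_SEC) {
            media.currentTime = start + 0.001;
            return;
          }
        }
      }

      // Typical hook-up: check whenever playback stalls waiting for data.
      // mediaElement.addEventListener('waiting', () => jumpSmallGap(mediaElement));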

    Matt_Wolenetz: Now there are also out-of-order audio codecs.
    … This API came in discussions recently at FOMS, which I was
    unable to attend.

    jernoble: Every time people attempt to implement HLS in MSE, we
    discover new features that MSE is missing, so interested in
    your exploration.

     Reflections on MSE in Worker and WG review process

    Matt_Wolenetz: The issue was encountered late in the process,
    based on comments from Mozilla
    … the transition from MediaSourceHandle to a property.
    … I made the assumption that [missed] could be retained on the
    property. That was not true. We ended up with a regression.
    … What would be good would be more thorough reviews.
    … The more eyes on such features, the better.
    … No one can claim to be an expert on the whole web platform.
    … Internally, we've been reflecting on binding generators that
    could warn us when there are problems.
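
    [For context, a minimal sketch of the MSE-in-Workers handle
    property under discussion. File names are placeholders, and the
    worker-side typing is loosened since MediaSourceHandle may not
    be in default TypeScript libs.]

      // --- worker.ts (dedicated worker) ---
      const mediaSource = new MediaSource();
      const handle = (mediaSource as any).handle; // MediaSourceHandle
      // The handle is transferable (and can only be transferred once).
      postMessage({ handle }, [handle]);
      mediaSource.addEventListener('sourceopen', () => {
        // addSourceBuffer()/appendBuffer() proceed in the worker as usual.
      });

      // --- main.ts ---
      const worker = new Worker('worker.js');
      const video = document.querySelector('video') as HTMLVideoElement;
      worker.onmessage = (e) => {
        video.srcObject = e.data.handle; // attach via srcObject, not object URL
      };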

    cpn: I see there's a TAG issue that you raised about this

    [15]TAG design principle issue

      [15] https://github.com/w3ctag/design-principles/issues/400

    Matt_Wolenetz: That's about it for MSE. I will be focusing on
    the HLS implementation in MSE.
    … Getting pre-emptive media source in the meantime would be
    great to unblock MSE in iOS.

    cpn: What would you like to do about remaining issues?

    Matt_Wolenetz: It depends on how much incoming feedback and
    comments we get on these.

    cpn: OK, we'll check with you before next call.

   Media Capabilities editors

    cpn: Mounir and Chris moved on to other things. So we need new
    editors.
    … I think Vi from Microsoft is the only one still around.

    Jean-Yves: I can help with some of that.

    cpn: Thanks for the offer. I'll send the call around to see if
    anyone is willing to help you.
