[Minutes] Media WG Teleconference - 2021-09-28

Hi all,

The minutes of yesterday's call, focused on the ISO BMFF Byte Stream 
Format specification, are available at:

...


Media WG Teleconference
28 September 2021

    Present
           Bernard Aboba, Chris Needham, Cyril Concolato, Eric
           Carlson, Francois Daoust, Jan-Ivar Bruaroey, Jer Noble,
           Mark Watson, Matt Wolenetz, Peng Liu, Yuhao Fu

    Chair
           Chris, Jer



     1. TPAC planning: WebRTC WG / Media WG joint meeting
     2. MSE Bytestream Format

Meeting minutes

   TPAC planning: WebRTC WG / Media WG joint meeting

    Chris: Topics for WebRTC joint meeting?

    Jer: Media capabilities and harmonising with RTC

    Jan-Ivar: Main topic is MediaCapture Transform or its
    replacement, about exposing real-time audio and video between
    MediaStreamTrack and JS
    … Now it seems will be Streams based. Some issues open around
    … Specific to WebCodecs are video frame lifetime, close and
    clone methods
    … GC cleanup. If you do a tee to branch a stream into two,
    there's no cloning by default
    … When one branch closes the video frame it stops working from
    the other. Want to discuss also audio issues
    … We invited the Audio WG to also join

    Bernard: TPAC is a good time for overviews or relationships
    between things
    … So the overall direction we're going with media, using
    Streams to create pipelines to process media
    … Future of content protection. S-Frame in WebRTC is an
    encrypted content form, doesn't work with WebCodecs
    … Questions about transports. Looking at the overview, what's
    missing, how does it all fit together for developers?
    … We get lots of questions from developers. How to render:
    WebGPU, Canvas, etc. Any recommendations?
    … Coherent view on Workers

    Chris: Anything on WebTransport specifically?

    Bernard: WT supports workers, RTCDataChannel doesn't, but
    that's proposed, it's an extension spec. PeerConnection doesn't
    support workers
    … So lots of different things, good to look at the overall
    picture. What are we not doing?

    Jan-Ivar: Opportunity to look at the overall picture. One
    approach is to stand up an alternative to WebRTC using the
    other APIs, looking at how that fits together and what does and
    doesn't work

    Bernard: Developers ask how all the parts fit together. Could
    present what we think the overview is, see where there's
    agreement. Does it make sense?

    Jan-Ivar: We'll make slides in WebRTC WG. Media WG can
    contribute to that
    … A question could be: If a source and sink are stream-based,
    why not also the bit in the middle?
    … Why use streams instead of promises for media capture?
    … Why isn't encode promise based?
    … The streams model would be able to handle it.

    Bernard: We could present the story, issues, questions people
    ask. Streaming and RTC are converging, low latency streaming
    … Using EME with WebRTC doesn't work today. Could put some
    examples in

    Jer: Can you come up with the overview?

    Bernard: That's for the first hour, then audio for the second

   MSE Bytestream Format

    Cyril: I opened 4 issues
    … Generally, the intent was to read and understand the spec,
    check for conflicts with other specs, if any
    … And check interop across browsers. I also wrote some unit
    tests, hand-editing MP4 files and feeding to an MSE based
    … I found some surprising results
    … I'm migrating the tests to WPT
    … [6]Issue #4 ftyp box

       [6] https://github.com/w3c/mse-byte-stream-format-isobmff/issues/4

    Cyril: Wording in the BSF spec says the ftyp box is part of the
    init segment, and the UA should run the error algorithm
    … if there's a mismatch. It seems to ask a lot from browsers
    … Requires the browser to validate. In my tests, I found the
    ftyp box is ignored. If I use an init segment with a moof box
    it's fine
    … I can put anything in the ftyp box and everything is accepted
    … My understanding of what's implemented today is that we
    shouldn't say anything about the ftyp box
    … Just say an init segment is just a moof box with some
    … There's a similar statement about the segment type box that
    could be safely ignored
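
    The difference between the mismatch check described in the BSF
    wording and the behaviour Cyril observed can be sketched as
    follows. This is a minimal illustration, not any browser's
    code: SUPPORTED_BRANDS and accepts_ftyp are hypothetical names,
    and the payload layout (major brand, minor version, then
    compatible brands) is the ISO BMFF File Type Box layout.

```python
import struct

# Hypothetical set of brands a user agent supports (illustration only).
SUPPORTED_BRANDS = {b"iso6", b"cmfc"}


def parse_ftyp_payload(payload: bytes):
    """Split an ftyp payload into (major_brand, minor_version, compatible_brands)."""
    if len(payload) < 8 or (len(payload) - 8) % 4:
        raise ValueError("malformed ftyp payload")
    major, minor = struct.unpack_from(">4sI", payload, 0)
    compatible = [payload[i:i + 4] for i in range(8, len(payload), 4)]
    return major, minor, compatible


def accepts_ftyp(payload: bytes, strict: bool) -> bool:
    """strict=True models the brand check; strict=False models ignoring brands."""
    major, _minor, compatible = parse_ftyp_payload(payload)
    if not strict:
        return True  # brands are read but never used to reject the segment
    return all(brand in SUPPORTED_BRANDS for brand in [major, *compatible])
```

    With strict=False, a payload carrying arbitrary brands is
    accepted, which matches the test results reported above; with
    strict=True the same payload is rejected.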

    Jer: Wasn't there a move in the MEIG to have a restrictive ftyp
    that would throw errors in cases where extra boxes would be
    ignored. If we remove this, will there be a request to add it

    ChrisN: The CMAF BSF discussion

    Cyril: I argued against that. If all browsers implement the
    same thing, why take it out?

    Jer: I agree. It's that they didn't have a separate MIME type,
    so wanted to use ftyp
    … I think if we can parse the file, we should, and not throw
    errors. Being relaxed in error handling is consistent with the
    web approach
    … I'm fine with adding it to the list of boxes to ignore
    … But concerned that others will object

    Cyril: I doubt browsers will verify conformance to brands.
    Browsers and players try to do their best with the content

    Jer: Making WPTs would answer these questions. Update the spec
    to match browser behaviour

    Cyril: If a box contains a compatible brand the UA doesn't
    support, it should fail. That's the opposite of what ISO BMFF

    Jer: I wonder if what's written was the opposite of the intent

    Matt: I agree that the ftyp box is superfluous in Chrome
    … Is there a case where folks want to play streams but they
    want capability detection based on ftyp?
    … As no browsers filter on this, we should remove it from the
    spec. It's in the list of boxes that should be skipped in

    Cyril: Next issue, [7]support for edit lists

       [7] https://github.com/w3c/mse-byte-stream-format-isobmff/issues/5

    Cyril: The spec currently says browsers must support one type
    of edit list
    … An offset edit list, which offsets the composition time when
    you have B-frames
    … The edit list maps the non-zero composition time to a
    presentation time of zero
    … The spec is silent on other types of edit lists
    … Can you use fractional or zero rate? Can you use empty or
    multiple entries in edit list?
    … Rare to have interop in those tests. Mostly, browsers ignore
    unsupported edit lists. I think it should fire an error
    … Could lead to A/V sync issues
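
    The offset edit list described above amounts to subtracting the
    edit's media time from each composition time. A minimal sketch,
    assuming a single edit whose media_time equals the first
    composition time; the function name and the sample values
    (timescale 30000, frame duration 1001) are illustrative only.

```python
from fractions import Fraction


def presentation_time(composition_time: int, edit_media_time: int,
                      timescale: int) -> Fraction:
    """Map a composition time (media timescale units) to seconds on the
    presentation timeline, given the media_time of a single offset edit."""
    return Fraction(composition_time - edit_media_time, timescale)


# With B-frames the first displayed frame has a non-zero composition time;
# an offset edit with the same media_time maps it to presentation time zero.
first_frame = presentation_time(2002, 2002, 30000)
second_frame = presentation_time(3003, 2002, 30000)
```

    If the edit list is ignored, the same frames would start at
    2002/30000 s instead of zero, which is the kind of offset that
    can show up as an A/V sync issue.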

    Matt: I have a concern about raising a decode or parse error on
    content that previously played successfully
    … Could be a note for clarification on which edit lists have
    interoperable support, and others would be ignored

    Cyril: I tested fractional rates, empty edit list (should fill
    the timeline with a gap)
    … Maybe deprecation first, then removal in a future edition

    Matt: Are there components of these edit lists used with other
    parts of MSE, timestampOffset, playbackRate, so MSE couldn't
    afford applications polyfilling

    Jer: Hard to polyfill with muxed tracks

    Matt: Do we have any stats on existence of these kinds of edit

    Cyril: There's one that must be supported

    Matt: So a note to say the others should be ignored by

    Cyril: I'd prefer to say "may be" ignored, and content
    providers "should not" use

    Matt: Makes sense, also gather data

    Jer: Other uses? Offset is needed for B-frames. What about
    multiple playback rates, other use cases?

    Cyril: Empty edits could be used, when you want to align audio
    and video, you can either remove some video content to start at
    the audio start
    … or say the audio has a gap, and the player should play video
    without audio until audio starts
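
    The audio-gap option can be sketched as a leading empty edit on
    the audio track. In ISO BMFF an edit list entry with media_time
    set to -1 is an empty edit; the EditEntry type and the
    leading_gap_seconds helper below are hypothetical names used
    only for illustration.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass
class EditEntry:
    segment_duration: int  # in movie timescale units
    media_time: int        # in media timescale units; -1 marks an empty edit


def leading_gap_seconds(entries, movie_timescale: int) -> Fraction:
    """Total duration of the empty edits that precede the first media edit."""
    gap = 0
    for entry in entries:
        if entry.media_time != -1:
            break
        gap += entry.segment_duration
    return Fraction(gap, movie_timescale)


# Audio starts half a second after video: one empty edit, then the media.
audio_edits = [EditEntry(segment_duration=500, media_time=-1),
               EditEntry(segment_duration=0, media_time=0)]
audio_gap = leading_gap_seconds(audio_edits, movie_timescale=1000)
```

    A player honouring the empty edit would play video alone for
    audio_gap seconds; one ignoring it would start the audio
    immediately.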

    Jer: Another option, if we find that these are being used in
    the wild, could add a "should" statement for the empty edits,
    or others with valid use cases

    Matt: We don't have enough data now. If there are use cases
    that can't be solved ergonomically in the MSE API, can address
    at a later time
    … If we don't see people complain that playback isn't working,
    should we then add telemetry?

    Cyril: I think it's important to document what content creators
    can rely on

    Matt: So documenting that they may be ignored, and that content
    providers should not use them. File a GitHub issue so people
    can reply to bring it to our attention

    Cyril: Next is #6, [8]support for unknown boxes. Boxes accepted
    and ignored
    … Not sure what is meant by valid top-level boxes

       [8] https://github.com/w3c/mse-byte-stream-format-isobmff/issues/6

    Matt: If you put an out of order box, such as a moof before a
    moov. The spec handles that, but are there other cases?

    Cyril: ... That's the [9]next issue on the number and order of

       [9] https://github.com/w3c/mse-byte-stream-format-isobmff/issues/7

    Cyril: I tested an unkn box

    Matt: Is that in the spec, how do we know it's a top-level box?

    Cyril: From where it's placed in the stream

    Matt: If it's not defined as a top-level box in the normative

    Mark: All the boxes that the ISO spec says are allowed to
    appear at the top level

    Cyril: Anyone can add other boxes at the top level if they want

    Matt: If we need to bind the MSE spec more closely to ISO BMFF,
    we can

    Cyril: Is the intent to ignore unknown boxes at the top level?

    Matt: It could indicate the stream is malformed

    Cyril: Concerned it doesn't scale. Each time ISOBMFF spec
    changes, you'd have to change implementation
    … It has happened before

    Jer: If a box not defined at top level is found at top level,
    could throw an error. But it's OK to skip an unknown box

    Cyril: Just consume the bytes and continue parsing
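
    "Consume the bytes and continue" can be sketched as a walk over
    top-level box headers that steps over any type it does not
    recognise. This is an illustration under stated assumptions,
    not the spec's algorithm: KNOWN_TOP_LEVEL is a hypothetical
    allow-list, and make_box and split_known_and_unknown are helper
    names invented for the example. The header layout (32-bit size,
    four-character type, optional 64-bit largesize when size is 1)
    is the ISO BMFF one.

```python
import struct

# Hypothetical allow-list, for illustration only.
KNOWN_TOP_LEVEL = {b"ftyp", b"styp", b"moov", b"moof", b"mdat", b"sidx"}


def iter_top_level_boxes(data: bytes):
    """Yield (type, payload_offset, payload_size) for each top-level box."""
    offset = 0
    while offset < len(data):
        if offset + 8 > len(data):
            raise ValueError("truncated box header")
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:  # a 64-bit largesize follows the type
            if offset + 16 > len(data):
                raise ValueError("truncated largesize")
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:  # box extends to the end of the data
            size = len(data) - offset
        if size < header or offset + size > len(data):
            raise ValueError("box size out of range")
        yield box_type, offset + header, size - header
        offset += size  # consume the whole box, known or not


def make_box(box_type: bytes, payload: bytes = b"") -> bytes:
    """Build a box with a 32-bit size field (test helper)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


def split_known_and_unknown(data: bytes):
    """Return (known types, skipped types) in stream order."""
    kept, skipped = [], []
    for box_type, _offset, _size in iter_top_level_boxes(data):
        (kept if box_type in KNOWN_TOP_LEVEL else skipped).append(box_type)
    return kept, skipped


stream = make_box(b"moov") + make_box(b"unkn", b"\x00" * 4) + make_box(b"moof")
kept, skipped = split_known_and_unknown(stream)
```

    Because skipping only needs the size field, no change to the
    parser is required when new box types are defined. The sketch
    takes the whole buffer in memory, so it does not address the
    concern about very large boxes in an incremental append.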

    Matt: Could there be a malformed stream that causes the
    implementation to hold onto large blocks of data?

    Jer: Seems like an implementation detail

    Matt: Some implementations may see 2 gigabytes as too large and
    couldn't skip
    … We have quota exceeded mechanism. Just thinking through
    implementation based considerations
    … In terms of API usage, there was one user-defined box,
    proposed by the BSF, but that source was unaware of
    pre-existing top level boxes they could have used, JS level

    Cyril: The unkn box is one I invented
    … ISO BMFF recently introduced compressed boxes (gzip). The
    sidx can be replaced with !sdx, defined in ISO BMFF

    Cyril: Let's discuss how to make the spec changes
    … Should I open an issue for each problem, then a PR. Or
    propose a rewrite as a draft and review the whole thing?

    Matt: Depends on scale. If just a few, discuss as one-offs
    … Design principles about not wanting to regress

    Cyril: What about small issues? I'd like to be able to rewrite
    the text and review as a whole
    … Agree on the intent of the issues, then make a PR

    Jer: Seems reasonable to close a number of issues in one PR

    Matt: An issue per item sounds good, and a PR that addresses

    Cyril: The BSF is a Note. Why is it not on the Rec track? If
    there are tests and implementations can be compliant to it, why
    a Note?

    Matt: We focused on testing MSE itself and not so much the
    BSFs. An implementation must support *a* BSF, so
    implementations may support different ones
    … Allowed more flexibility at the time

    Francois: Another reason it's a note relates to patents. It
    more directly relates to codecs, so didn't need to ask about
    the royalty free patent policy

    ChrisN: What about Process 2021, which provides a structure for
    registries and entries?

    Matt: WebCodecs has taken a similar approach to MSE for
    registries. Anything we can learn from that?

    Francois: Similar reasons, would make sense to have them as Rec
    track specs

    Matt: It's been easy to propose and support new entries fairly

    Cyril: I think it's fine if the entries aren't all at same
    maturity level

    Matt: I'd need to check with colleagues on that

    Cyril: I created WPT for this spec. It may need some more work.
    Is there a link between WPT and this WG?

    Matt: Existing tests were bound to the API itself, not so much
    a specific BSF. The tests are mostly testing the API for a
    supported format
    … Testing BSF format more deeply is good, put into a subfolder

    Cyril: I'll start a PR. How do you detect an error? Buffer
    range, error event, etc?

    Matt: There's a proposed introspection API that could help with

    Cyril: Thank you

    Matt: FPWD of MSE v2 and short name

    Francois: It'll be published on Thursday, no need for a new CfC

    Matt: Update the SoTD?

    Francois: Yes, feel free to do that. In future we'll switch to
    automatic publishing to /TR

    Chris: Second Screen WG would like a joint meeting to talk
    through some issues around capability detection
    … Will schedule that for an upcoming call, possibly next time?


