- From: Francois Daoust <fd@w3.org>
- Date: Wed, 09 Nov 2022 09:42:36 +0000
- To: "public-media-wg@w3.org" <public-media-wg@w3.org>
Hi all,
The minutes of the last Media WG call are available at:
https://www.w3.org/2022/11/08-mediawg-minutes.html
... and copied as raw text below. Precise technical points around MSE
append window behaviors went too fast for the scribe to both understand
and capture. Feel free to complete as needed or raise comments on the
related GitHub issues!
Thanks,
Francois.
-----
Media WG
08 November 2022
[2]Agenda. [3]IRC log.
[2]
https://github.com/w3c/media-wg/blob/main/meetings/2022-11-08-Media_Working_Group_Teleconference-agenda.md#agenda
[3] https://www.w3.org/2022/11/08-mediawg-irc
Attendees
Present
Chris Needham, Eric Carlson, Francois Daoust, Frank
Liberato, Greg Freedman, Jean-Yves Avenard, Jer Noble,
Joey Parrish, Karl Tomlinson, Mark Watson, Matt
Wolenetz, Sushanth Rajasankar
Regrets
-
Chair
Chris Needham
Scribe
cpn, tidoust
Contents
1. [4]Agenda bashing
2. [5]MSE Interoperability/implementation conformance issues
1. [6]Gapless audio support feature detection
2. [7]Relax timing constraint of initial HAVE_METADATA
and later HAVE_CURRENT_DATA
3. [8]API for interoperable gap playback/tolerance
4. [9]Reflections on MSE in Worker and WG review process
3. [10]Media Capabilities editors
Meeting minutes
Agenda bashing
cpn: The main topic is MSE Interoperability/implementation
conformance issues, including perhaps reflections on the changes
that needed to be made to MSE in workers
… Also a topic on Media Capabilities towards the end of the call
MSE Interoperability/implementation conformance issues
Gapless audio support feature detection
<ghurlbot> [11]Issue 37 Support sample accurate audio splicing
using timestampOffset/appendWindowStart/appendWindowEnd
(Melatonin64) feature request, needs author input,
TPAC-2022-discussion
[11] https://github.com/w3c/media-source/issues/37
Matt_Wolenetz: When the coded frames of two audio fragments
overlap in presentation time, I believe the spec describes a
cross-fade. Chrome used to have that, but the widespread
proliferation of incorrect timing and codec formats
… meant that we now truncate.
… We couldn't perceive any benefit with cross-fade.
… Meanwhile, the Chromium implementation allows frame-accurate
splicing.
… In order to do that, you must not drop everything that was
decoded from both audio frames
… and perform trimming.
… The issue is whether we can standardize this feature:
frame-accurate post-decoding audio splicing,
… and also whether we can add feature detection for whether
this is supported by the browser.
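For context, a minimal sketch of how an app might drive this kind of
sample-accurate splice with the SourceBuffer attributes named in the
issue (timestampOffset, appendWindowStart, appendWindowEnd); the
segment names, timing values and helper functions are illustrative
only:

    const audio = document.querySelector('audio')!;
    const mediaSource = new MediaSource();
    audio.src = URL.createObjectURL(mediaSource);

    mediaSource.addEventListener('sourceopen', async () => {
      const sb = mediaSource.addSourceBuffer('audio/mp4; codecs="mp4a.40.2"');

      // First track: keep only samples up to 120.5s of its timeline.
      sb.timestampOffset = 0;
      sb.appendWindowStart = 0;
      sb.appendWindowEnd = 120.5;
      sb.appendBuffer(await fetchSegment('track1.m4s'));
      await updateEnd(sb);

      // Second track: shift it so its first kept sample lands at 120.5s;
      // encoder padding/priming samples fall outside the window and are
      // trimmed with whatever accuracy the implementation provides.
      sb.appendWindowEnd = Infinity;
      sb.appendWindowStart = 120.5;
      sb.timestampOffset = 120.5;
      sb.appendBuffer(await fetchSegment('track2.m4s'));
      await updateEnd(sb);
    });

    async function fetchSegment(url: string): Promise<ArrayBuffer> {
      return (await fetch(url)).arrayBuffer();
    }

    function updateEnd(sb: SourceBuffer): Promise<void> {
      return new Promise((resolve) =>
        sb.addEventListener('updateend', () => resolve(), { once: true }));
    }

The interoperability question discussed below is how accurately
implementations trim at those window boundaries, and whether an app
can detect that upfront.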
Jean-Yves: In Firefox, the data will be trimmed once decoded,
with more accuracy than the 1048-sample granularity.
Matt_Wolenetz: You need to feed more than just the frame so
that things are stable at the splice point.
… Yes, that is what I'm talking about.
Jean-Yves: Multiple encodings use different flags. You end up
with conflicts, e.g. with ADTS.
… Is it something that will be within MSE, like a message to
send, or contained within the binary format that we append?
Matt_Wolenetz: In terms of specs, yes. In terms of the binary
format, the decoded sequence needs to be appended in a
well-formed sequence by the app, so that the splicing point
gets clearly set.
… The app needs to add enough of the B frames so that the
splicing can happen smoothly.
Jean-Yves: The way MSE is designed is that it's always dealing
with compressed frames rather than decoded frames.
Matt_Wolenetz: In Chromium, we check append calls for positions
where splicing might occur, especially for audio.
… And we keep track of them.
… We decode across the splice point faster than real-time so
that we can adjust things. That enables the frame-accurate
gapless audio scenario.
… It's been in Chromium for a long time.
… We haven't had the cross-fade feature in Chromium as a
result.
… [more technical details not captured by scribe]
… Not all implementations are required to have
faster-than-real-time decoding capabilities, which may make it
a concern for some devices and platforms.
jernoble: Back to cross-fade, I don't think WebKit does it
either. Firefox?
Jean-Yves: I don't think there is cross-fading either.
Matt_Wolenetz: The spec says: if you don't do cross-fade, you
need to drop. We don't do either, since we do frame-accurate
splicing.
Jean-Yves: I have only seen frame-accurate splicing in demo
apps, not real ones.
Matt_Wolenetz: There used to be a Google gapless music app.
jernoble: WebKit will mark windows and overlaps, with samples
to throw away. That has led to some discontinuities.
… I wonder if there's something that we can do in the case
where the append is smaller than the codec window.
… A microsecond is a gap. You'll hear a click when that
happens.
… We have gotten feedback from the hardware team, as some of the
recent hardware is pretty sensitive about that.
… That's what the cross-fade was intended to address.
Matt_Wolenetz: You can have both frame-accurate and cross-fade.
karlt: The normative text says you need to drop the whole block,
and then a non-normative note says you can do something else.
… It seems that all of the implementations keep the blocks;
perhaps that should be what the normative part of the spec says.
Matt_Wolenetz: I agree.
… We don't know whether TV devices keep or drop the frames.
Jean-Yves: No implementation ever stops on gaps.
cpn: So we've got consistent behavior. Question about TV
devices. Is that something that we should take up with TV people?
… That would provide input on the feature detection part of the
feature.
Jean-Yves: Should we consider that feature detection as part of
media capabilities instead of adding a mechanism in MSE that
currently does not exist?
… Media Capabilities has a way to test if MSE supports a
particular system.
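For reference, a Media Capabilities query for a media-source
configuration currently looks roughly like this; nothing in the
result reports gapless/frame-accurate splicing support, which is the
capability under discussion:

    async function supportsMseAudio(): Promise<boolean> {
      const info = await navigator.mediaCapabilities.decodingInfo({
        type: 'media-source',
        audio: {
          contentType: 'audio/mp4; codecs="mp4a.40.2"',
          channels: '2',
          bitrate: 128000,
          samplerate: 44100,
        },
      });
      // info exposes supported / smooth / powerEfficient today;
      // a splicing-related field would be a new addition.
      return info.supported;
    }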
Matt_Wolenetz: Question on software decoder. Media Capabilities
covers more than MSE. If MSE can answer the question for
itself, it seems preferable to do it in MSE.
… The whole collection of issues that we're discussing is
priority 2. That's because they're hard to test, hard to find
the right behavior for, etc.
jernoble: If the use case is just "gapless labeling", that
doesn't seem like a super compelling use case to me.
Mark_Watson: Gapless audio splicing is also useful in a number
of other scenarios.
… What is happening today is not ideal, but it's better to
leave it as it is in order not to introduce differences in
behavior.
jernoble: Seems to match what Will Law has been asking for some
time.
… Making things observable without changing the behavior.
Mark_Watson: It depends on whether we're talking about exposing
more information about what browsers are already doing today;
if so, that's useful, though I somewhat already know that, even
if it's better to know whether that behavior is guaranteed.
… Or whether we're talking about exposing information about a
behavior that has changed in browsers.
Matt_Wolenetz: My assumption was that Chromium was the only
browser doing frame-accurate splicing, but that seems wrong. So
feature detection was meant to detect new browsers supporting
the feature, but that may not be needed anymore.
… For lots of splicing scenarios, it depends on the codec and
within the codecs whether you're at the start or in the middle
of a frame.
… It doesn't seem that it would be a breaking change if an
implementation starts doing this.
… but it might be useful to tell the app that a browser
supports this feature upfront.
… If there are multiple behaviors allowed by the spec, it's
probably better to have a mechanism to detect the feature.
cpn: It sounds like next step is around testing.
Matt_Wolenetz: Yes, we seem to all be saying that we're doing
some form of frame-accurate splicing.
Relax timing constraint of initial HAVE_METADATA and later
HAVE_CURRENT_DATA
<ghurlbot> [12]Issue 275 Consider relaxing timing of initial
HAVE_METADATA transitioning (wolenetz) agenda,
TPAC-2022-discussion
[12] https://github.com/w3c/media-source/issues/275
<ghurlbot> [13]Issue 215 Spec is too rigid on requiring initial
HAVE_CURRENT_DATA transition occur synchronously within coded
frame processing (possibly ditto for HAVE_METADATA and init
segment received processing) (wolenetz) interoperability,
TPAC-2022-discussion
[13] https://github.com/w3c/media-source/issues/215
Matt_Wolenetz: In Chromium, we have a separate thread that
handles decoding, buffering and so on.
… To avoid blocking the main thread for some actions, we don't
block the update and delivery of event scheduling while we wait
for readyState transitions.
… In the segment parsing loop, I think you're supposed to wait
until HAVE_METADATA. We don't do that in Chrome. I'd like to
relax the constraints so as not to block the main thread
unduly.
jernoble: That seems reasonable to me.
… Especially for workers.
cpn: Last time we discussed this, there was a question about
gathering information on what different implementations
actually do. Do we need that? It seems to me that if we're just
relaxing the constraints, we can go ahead.
Matt_Wolenetz: The question is whether it will trip up
applications if more implementations do what Chrome already
does.
… You may get duration information faster than ready
information. That can surprise apps, but then Chrome has been
doing that forever.
… There have been some timing hiccups in the WPT tests.
… It sounds like it may be something to propose in a PR.
… If you can see any kind of regression that this may create,
please raise it.
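To illustrate the app-visible side, a small sketch of the defensive
pattern this implies; sourceBuffer, initSegment and startPlayback
stand in for app code:

    declare const sourceBuffer: SourceBuffer;  // the app's SourceBuffer
    declare const initSegment: ArrayBuffer;    // app-provided init segment
    declare function startPlayback(): void;    // app-defined

    const video = document.querySelector('video')!;

    sourceBuffer.addEventListener('updateend', () => {
      // After appending the init segment, mediaSource.duration may
      // already be known here, but the HAVE_METADATA transition is
      // not guaranteed to have happened synchronously.
      if (video.readyState >= HTMLMediaElement.HAVE_METADATA) {
        startPlayback();
      } else {
        video.addEventListener('loadedmetadata', startPlayback, { once: true });
      }
    }, { once: true });

    sourceBuffer.appendBuffer(initSegment);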
cpn: Checking in with some of the major player libraries might
be useful.
… So the next step on this is a PR.
API for interoperable gap playback/tolerance
<ghurlbot> [14]Issue 160 Support playback through unbuffered
ranges, and allow app to provide buffered gap tolerance
(davemevans) feature request, TPAC-2022-discussion
[14] https://github.com/w3c/media-source/issues/160
Matt_Wolenetz: My time on MSE is limited right now.
… As part of the HLS native implementation in Chrome that we're
building on top of MSE concepts, we need to implement an
internal API for interoperable gap playback/tolerance.
… The problem right now is that we have some stalls happening
based on various tolerance settings.
jernoble: Some of the issues around gaps should be handled by
modifying the output timeline, which is something we do for
small gaps.
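For context, a sketch of the kind of app-side workaround the issue
aims to standardize or replace; the tolerance value is an arbitrary
example:

    const video = document.querySelector('video')!;
    const GAP_TOLERANCE_S = 0.1; // arbitrary example value

    video.addEventListener('waiting', () => {
      const { buffered, currentTime } = video;
      for (let i = 0; i < buffered.length; i++) {
        const start = buffered.start(i);
        // If a buffered range begins just ahead of the playhead,
        // assume a small gap and jump over it rather than stalling.
        if (start > currentTime && start - currentTime <= GAP_TOLERANCE_S) {
          video.currentTime = start;
          break;
        }
      }
    });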
Matt_Wolenetz: Now there are also out-of-order audio codecs.
… This API came up in discussions recently at FOMS, which I was
unable to attend.
jernoble: Every time people attempt to implement HLS in MSE, we
discover new features that MSE is missing, so interested in
your exploration.
Reflections on MSE in Worker and WG review process
Matt_Wolenetz: The issue was encountered at a late phase, based
on comments from Mozilla, with the transition from
MediaSourceHandle to a property.
… I made the assumption that [missed] could be retained on the
property. That was not true. We ended up with a regression.
… What would be good would be more thorough reviews.
… The more eyes on such features, the better.
… No one can claim to be an expert on the whole web platform.
… Internally, we've been reflecting on binding generators that
could warn us when there are problems.
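For reference, the shape that eventually shipped, with the handle
exposed as a property on the worker's MediaSource and transferred to
the main thread (file names are illustrative):

    // worker.ts
    const mediaSource = new MediaSource();
    const handle = mediaSource.handle;  // MediaSourceHandle
    // The handle is transferable; the MediaSource stays in the worker.
    postMessage({ handle }, [handle]);

    // main.ts
    const worker = new Worker('worker.js');
    worker.onmessage = (e: MessageEvent) => {
      const video = document.querySelector('video')!;
      video.srcObject = e.data.handle;  // attach the worker's MediaSource
    };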
cpn: I see there's a TAG issue that you raised about this:
[15]TAG design principle issue
[15] https://github.com/w3ctag/design-principles/issues/400
Matt_Wolenetz: That's about it for MSE. I will be focusing on
the HLS implementation on top of MSE.
… Getting pre-emptive media source in the meantime would be
great to unblock MSE in iOS.
cpn: What would you like to do about remaining issues?
Matt_Wolenetz: It depends on how much incoming feedback and
comments we get on these.
cpn: OK, we'll check with you before next call.
Media Capabilities editors
cpn: Mounir and Chris moved on to other things. So we need new
editors.
… I think Vi from Microsoft is the only one still around.
Jean-Yves: I can help with some of that.
cpn: Thanks for the offer. I'll send a call around to see if
anyone else is willing to help you.