- From: Dominique Hazael-Massieux <dom@w3.org>
- Date: Tue, 25 Apr 2023 17:13:41 +0200
- To: "public-webrtc@w3.org" <public-webrtc@w3.org>
Hi,
The minutes of our meeting on April 18 2023 are available at:
https://www.w3.org/2023/04/18-webrtc-minutes
with the YouTube recording at https://www.youtube.com/watch?v=VkBeQdbVjWs
Thanks Henrik for scribing!
Dom
WebRTC WG 2023-04-18
[2]Agenda.
[2] https://www.w3.org/2011/04/webrtc/wiki/April_18_2023
Attendees
Present
-
Regrets
Dom, Harald
Chair
Bernard, Jan-Ivar
Scribe
Henrik
Contents
1. [3]PR 147: Add RTCRtpEncodingParameters.codec to change the
active codec (Florent)
2. [4]PR Stats/751: RTX and FEC stats are incomplete (Fippo)
3. [5]Issue 146: Exposing decode errors / SW fallback as an
Event (Bernard)
4. [6]Issue 170: Incompatible SVC Metadata (Bernard)
5. [7]Issue 93: MediaStreamTrack audio delay/glitch capture
stats (Henrik)
6. [8]PR 173 adding presentationTime to
RTCEncodedVideoFrameMetadata (Tony)
7. [9]WebRTC Combines Media and Transport
8. [10]WebRTC ICE improvements
9. [11]playoutDelay (Jan-Ivar)
10. [12]Issue 39: Solve user agent camera/microphone
double-mute
11. [13]Summary of resolutions
Meeting minutes
Recording: [14]https://www.youtube.com/watch?v=VkBeQdbVjWs
[14] https://www.youtube.com/watch?v=VkBeQdbVjWs
Date: April 18, 2023
Slideset: [16]https://lists.w3.org/Archives/Public/www-archive/
2023Apr/att-0001/WEBRTCWG-2023-04-18.pdf
[16]
https://lists.w3.org/Archives/Public/www-archive/2023Apr/att-0001/WEBRTCWG-2023-04-18.pdf
[17]PR 147: Add RTCRtpEncodingParameters.codec to change the active
codec (Florent)
[17] https://github.com/w3c/webrtc-extensions/pull/147
[18][Slide 12]
[18]
https://lists.w3.org/Archives/Public/www-archive/2023Apr/att-0001/WEBRTCWG-2023-04-18.pdf#page=12
Florent: We failed to mention that this API could also be used
for audio. We intend it to apply to audio as well. If you have
any objections, please mention them now.
Jan-Ivar: Simulcast on audio is probably not on anyone’s table.
The use case for audio here is to not have to negotiate, just
want to clarify that.
If people use this API, what codec can they choose from? Can
they go outside what was negotiated?
Florent: No, only what is negotiated. Any negotiated codec
supported by the user agent.
Peter: If someone asks for something that is not negotiated,
what happens?
Florent: An exception is thrown, it’s part of the PR.
RESOLUTION: No objection, let’s include audio.
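As a minimal sketch of what the proposed API would enable: switching the active send codec without renegotiation, using the per-encoding codec field from PR 147. The field is a proposal, not a shipped API, and the error behavior for un-negotiated codecs is modeled on what was said above.

```javascript
// Sketch: switch the active send codec without renegotiation, using the
// proposed RTCRtpEncodingParameters.codec field (webrtc-extensions PR 147).
// "sender" is assumed to be an RTCRtpSender. Only codecs present in the
// negotiated parameters may be chosen; anything else throws, matching the
// behavior Florent describes in the PR.
function switchSendCodec(sender, mimeType) {
  const params = sender.getParameters();
  const codec = params.codecs.find(
    c => c.mimeType.toLowerCase() === mimeType.toLowerCase());
  if (!codec) throw new TypeError(`${mimeType} was not negotiated`);
  for (const encoding of params.encodings) {
    encoding.codec = codec; // the proposed field
  }
  return sender.setParameters(params);
}
```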
[19]PR Stats/751: RTX and FEC stats are incomplete (Fippo)
[19] https://github.com/w3c/webrtc-stats/issues/751
[20][Slide 13]
[20]
https://lists.w3.org/Archives/Public/www-archive/2023Apr/att-0001/WEBRTCWG-2023-04-18.pdf#page=13
Fippo: Outbound has RTX, not inbound. Inbound has FEC, outbound
does not except in provisional stats. Proposal: to add metrics
for both sides and merge once there is an implementation.
Jan-Ivar: No objection, but the slide is asking for more than
what was just discussed. Is that correct? WebRTC stats is in
Candidate Recommendation; if metrics are not implemented we put
them in the provisional spec. Are you providing implementations?
Fippo: Yes I’m providing implementations.
Jan-Ivar: No objection.
RESOLUTION: No objection, Fippo to make PRs: merge on main spec
if implementation is provided or provisional spec if no
implementation is provided.
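To illustrate the asymmetry Fippo describes, a hedged sketch that sums the repair-traffic counters from a getStats() report. retransmittedPacketsSent (outbound) and fecPacketsReceived (inbound) exist in webrtc-stats today; the mirrored counters are the proposed/provisional additions, so they are guarded as possibly absent.

```javascript
// Sketch: summarize RTX/FEC repair traffic from an RTCStatsReport-like map.
// The mirrored counters (fecPacketsSent on outbound-rtp, retransmitted
// packets on inbound-rtp) are proposed or provisional and may be missing,
// hence the ?? 0 defaults.
function summarizeRepairTraffic(report) {
  const summary = { rtxSent: 0, rtxReceived: 0, fecSent: 0, fecReceived: 0 };
  for (const stats of report.values()) {
    if (stats.type === 'outbound-rtp') {
      summary.rtxSent += stats.retransmittedPacketsSent ?? 0;
      summary.fecSent += stats.fecPacketsSent ?? 0; // provisional
    } else if (stats.type === 'inbound-rtp') {
      summary.fecReceived += stats.fecPacketsReceived ?? 0;
      summary.rtxReceived += stats.retransmittedPacketsReceived ?? 0; // proposed
    }
  }
  return summary;
}
```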
[21]Issue 146: Exposing decode errors / SW fallback as an Event
(Bernard)
[21] https://github.com/w3c/webrtc-extensions/issues/146
[22][Slide 14]
[22]
https://lists.w3.org/Archives/Public/www-archive/2023Apr/att-0001/WEBRTCWG-2023-04-18.pdf#page=14
Bernard: In Media WG a useful distinction was made between data
errors and resource problems. HW encoders are more restrictive
than SW. Nothing to show the user but the developer has to
debug the bitstream.
If there is a resource problem we may want to prompt the user.
Application might re-acquire resources.
That’s my recommendation here. What do people think about
distinguishing these two?
Youenn: We already support SW fallback today, so that seems
good as-is. I don’t see any reason to notify the app. For a
resource problem it seems similar to “intense CPU usage”. In
Safari we have a UI to show this, so again there I don’t see
what the application can do to help the user that the browser
cannot already do instead.
Jan-Ivar: Thumbs up.
Florent: There is also the case of the SW decoder not being
able to decode the data. That also needs to be handled.
HW to SW fallback: what if application wants to change codec
instead?
Jan-Ivar: It’s unclear to me if the only option is to fall back
to SW. We should be clear about what problem we’re trying to
solve.
Henrik: The distinction makes sense, but I don’t see how this
slide solves the original problem of letting the app change
codec if it doesn’t get HW. If you’re saying SW fallback solves
the problem, I don’t agree.
Bernard: An event that distinguishes between the two is the
proposal, app observable. In the Media working group we want to
get this information out of WebCodecs. Just to be clear this is
under investigation, it’s not perfect, but Eugene is trying to
address this.
Jan-Ivar: Because of the privacy issue I think we [are
hesitant]. (Scribe: not sure what was said here.)
Bernard: The benefit of this approach is that we don’t need
more info than this. It’s two bits of information.
Youenn: The privacy concern is real.
RESOLUTION: Waiting for Media Working Group.
[23]Issue 170: Incompatible SVC Metadata (Bernard)
[23] https://github.com/w3c/webrtc-encoded-transform/issues/170
[24][Slide 15]
[24]
https://lists.w3.org/Archives/Public/www-archive/2023Apr/att-0001/WEBRTCWG-2023-04-18.pdf#page=15
[25][Slide 16]
[25]
https://lists.w3.org/Archives/Public/www-archive/2023Apr/att-0001/WEBRTCWG-2023-04-18.pdf#page=16
[26][Slide 17]
[26]
https://lists.w3.org/Archives/Public/www-archive/2023Apr/att-0001/WEBRTCWG-2023-04-18.pdf#page=17
[27][Slide 18]
[27]
https://lists.w3.org/Archives/Public/www-archive/2023Apr/att-0001/WEBRTCWG-2023-04-18.pdf#page=18
Bernard: What metadata do you need to handle things? Dropping
spatial layers is easy but adding them back is not easy because
of the way SVC works. Let me show a slide: diagram that shows
how a frame may not be decodable if a dependent frame was not
received for some reason. The same problem can happen if there
is a dependency between multiple frames, the first frame could
be lost. Need an unbroken chain of dependencies.
Bernard: Another problem is if a mobile device can receive all
frames but may not be able to decode them because the frames
are too big to handle.
Bernard: The receiver needs to be able to decode the frames
quickly, and to tell whether a received frame is necessary for
a desired resolution.
[28][Slide 19]
[28]
https://lists.w3.org/Archives/Public/www-archive/2023Apr/att-0001/WEBRTCWG-2023-04-18.pdf#page=19
[29][Slide 20]
[29]
https://lists.w3.org/Archives/Public/www-archive/2023Apr/att-0001/WEBRTCWG-2023-04-18.pdf#page=20
Bernard: We have a PR for EncodedVideoFrameMetadata for SVC
(see slide and link to detailed proposal).
… Peter investigated state in Chromium for WebCodecs.
Peter: Not all codecs implement all of this. I don’t think
there was anything in there for chains.
Bernard: Question is why this stuff is in encoded transform if
it has not been implemented? Should we remove them?
Florent: Temporal and spatial index is implemented but it might
depend on the codec implementation. E.g. it could be in the VP8
frame descriptor but not in the AV1 bitstream but if you
provide it you should be able to get most of this. That doesn’t
mean there are no bugs. I am working on some tests for SVC in
Chrome.
Bernard: So we have this incompatibility between WebCodecs and
WebRTC metadata. How to make these compatible?
SUMMARY: Needs further investigation?
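A sketch of the kind of consumer this metadata has: an encoded transform that drops frames above a target temporal layer. The temporalIndex field name is exactly the kind of thing under discussion here, so treat it as tentative.

```javascript
// Sketch: decide whether to forward an encoded frame based on its
// (tentatively named) temporalIndex metadata field. Frames with no
// temporal index are treated as the base layer.
function shouldForward(metadata, maxTemporalIndex) {
  return (metadata.temporalIndex ?? 0) <= maxTemporalIndex;
}

// Used in a WebRTC encoded transform, this would filter the frame stream:
function makeTemporalLayerFilter(maxTemporalIndex) {
  return new TransformStream({
    transform(frame, controller) {
      if (shouldForward(frame.getMetadata(), maxTemporalIndex)) {
        controller.enqueue(frame);
      }
    },
  });
}
```

Note that, per Bernard's slides, dropping layers like this is the easy direction; re-adding them requires the dependency metadata to be intact.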
[30]Issue 93: MediaStreamTrack audio delay/glitch capture stats
(Henrik)
[30] https://github.com/w3c/mediacapture-extensions/issues/93
[31][Slide 21]
[31]
https://lists.w3.org/Archives/Public/www-archive/2023Apr/att-0001/WEBRTCWG-2023-04-18.pdf#page=21
Henrik: The following audio capture metrics were recently added
to getStats. But issues were filed and the metrics have been
marked Feature at risk. The following metrics were added (see
slide). The metrics are only applicable if the media source
represents an audio capture device. They allow calculating
important quality metrics not available elsewhere such as
glitches in audio or average capture delay.
[32][Slide 22]
[32]
https://lists.w3.org/Archives/Public/www-archive/2023Apr/att-0001/WEBRTCWG-2023-04-18.pdf#page=22
Henrik: The issue in [33]w3c/webrtc-stats#741 is that GUM is
frequently used without WebRTC, so we should talk about audio
frames rather than audio samples. The stats spec actually means
audio frames, as clarified in the terminology section. And
lastly, it was not clear why audio may be dropped, so we need
to clarify that this happens when audio is not processed in a
timely manner from the real-time audio device.
[33] https://github.com/w3c/webrtc-stats/issues/741
Henrik: So the main issue, other than clarifications, is that
the metrics are in getStats() but they may be useful outside of
WebRTC. The proposal is to move them to MediaStreamTrack. This
is similar to how we recently added video capture stats to
track.getFrameStats(). The second point is name bikeshedding,
if we should use the getFrameStats, rename it or add a new
method specific to audio capture stats.
Youenn: Makes sense. No strong opinion on names.
Jan-Ivar: Correct direction. I’m not sure why we didn’t just
call this track.getStats().
RESOLUTION: Moving them to MediaStreamTrack makes sense, Henrik
to write a PR.
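The derived quality metrics the slides mention might be computed like this. All field names (totalFrames, droppedFrames, totalCaptureDelay) are placeholders; the final shape and names depend on Henrik's PR.

```javascript
// Sketch: derive glitch and delay quality metrics from a hypothetical
// audio capture stats object on MediaStreamTrack. Field names are
// placeholders, not spec'd names; totalCaptureDelay is assumed to be in
// seconds, summed over all captured frames.
function audioCaptureQuality(stats) {
  return {
    // Fraction of capture frames dropped (glitches).
    dropRatio: stats.totalFrames ? stats.droppedFrames / stats.totalFrames : 0,
    // Average capture delay per frame, converted to milliseconds.
    averageCaptureDelayMs: stats.totalFrames
      ? (stats.totalCaptureDelay / stats.totalFrames) * 1000
      : 0,
  };
}
```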
[34]PR 173 adding presentationTime to RTCEncodedVideoFrameMetadata
(Tony)
[34] https://github.com/w3c/webrtc-encoded-transform/pull/173
[35][Slide 23]
[35]
https://lists.w3.org/Archives/Public/www-archive/2023Apr/att-0001/WEBRTCWG-2023-04-18.pdf#page=23
Tony: The PR suggests adding presentationTime because the two
timestamps don’t match.
Bernard: Any objections?
Jan-Ivar: We’re overall supportive, but there is a comment
about naming bikeshedding.
Bernard: We’re heading down a road where we’re recreating
WebCodecs.
Jan-Ivar: No objection that we need it but name needs to be
figured out.
Tony: Let’s continue on the PR.
WebRTC Combines Media and Transport
[36][Slide 26]
[36]
https://lists.w3.org/Archives/Public/www-archive/2023Apr/att-0001/WEBRTCWG-2023-04-18.pdf#page=26
[37][Slide 27]
[37]
https://lists.w3.org/Archives/Public/www-archive/2023Apr/att-0001/WEBRTCWG-2023-04-18.pdf#page=27
Bernard: WebRTC combines media and transport. Good and bad.
Good if it does what you want, but bad if you’re trying to do
different things like:
… * Some things can’t be expressed in SDP.
… * Difficult to support bleeding edge codecs
… What if we could separate media and transport?
[38][Slide 28]
[38]
https://lists.w3.org/Archives/Public/www-archive/2023Apr/att-0001/WEBRTCWG-2023-04-18.pdf#page=28
Peter: Tightly coupled, but what if we could cut in two? Then
the app can sit between “media” and “transport” parts and
tweak. See slide.
… The left part looks a lot like WebCodecs. The right part:
RtpTransport?
… RTP sender API is “stream of packets in; stream of packets
out on wire”. RTP receiver is “stream of packets received on
the wire, stream of packets out”. RtpTransport.
… Here is how we could allow constructing these independently
of a PeerConnection: (slide).
… This would allow the app to bring its own packetization,
jitter buffer and rate adaptation.
… This is similar to WebTransport with datagrams. But it can
also be P2P and latency-sensitive congestion control.
RtpTransport would solve the following use cases:
… NV07, NV08, NV09, NV15, (non-consensus ones) NV38, NV39,
NV40,
… See slides with WebIDL.
… If we wanted to do jitter buffers there is more we could do,
but focus right now is RtpTransport. Is this a good direction
to go, giving the application more controls for transport?
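Nothing in this proposal exists yet, but the “bring your own packetization” use case Peter mentions can be illustrated with a self-contained sketch: splitting an encoded frame into RTP-sized payloads that the app would hand to a hypothetical RtpTransport packet stream. The 1200-byte MTU budget is an assumption.

```javascript
// Sketch: custom packetization of an encoded frame's bytes into RTP-sized
// payloads. "packets" here are plain objects standing in for whatever a
// hypothetical RtpTransport send stream would accept.
function packetize(frameBytes, maxPayload = 1200) {
  const packets = [];
  for (let offset = 0; offset < frameBytes.length; offset += maxPayload) {
    packets.push({
      payload: frameBytes.subarray(offset, offset + maxPayload),
      // Marker bit set on the last packet of the frame, as in RTP video.
      marker: offset + maxPayload >= frameBytes.length,
    });
  }
  return packets;
}
```

The jitter buffer and rate adaptation pieces Peter lists are far harder to sketch; as noted below, doing them well is the real cost of this level of control.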
Jan-Ivar: The NV use cases kind of pre-date WebTransport. I
think a lot of these use cases have already been met. Some of
these are not actually use cases but requirements for use
cases. I feel like I’ve already objected to this.
Bernard: No; the NV use cases labeled “non-consensus” don’t
have consensus, but the other ones do have WG consensus.
Jan-Ivar: There are now two implementations of WebTransport.
I’d like to push back. It seems a bit early to discuss WebIDL.
Youenn: We see a lot of people trying to push a lot of stuff.
There is a desire to have an ICE transport. I think it is the
right approach. Are there sufficient use cases to warrant this?
That is a good question, and we should spend some time there.
But if we can gather enough use cases…
Florent: Regarding overlap with WebTransport, I don’t believe
it will work in most cases that involve peer to peer, since
WebTransport only works between browser and server, not browser
and browser.
Randell: I think this is interesting. I have concerns similar
but not identical to Jan-Ivar. It does seem easy for something
to violate bandwidth constraints. Perhaps there would be
mechanisms in place to block that somehow. It does smack me as
being very low level and it is inviting applications to
implement their own implementation of WebRTC. Doing it is one
thing, but doing it well is not so easy. I am a little
concerned about that, so I think it would be worth discussing a
little more what use cases this would enable and what
alternatives there might be to resolve those use cases. I have
some reservations about the details here. There are two issues
with WebTransport: one, as Florent mentioned, it is only client
to server. That’s not to say it couldn’t be, but that would be
additional work to get there. The second issue is that
congestion control currently implemented for WebTransport is
not real-time friendly, but that could be fixed with a “I want
real-time congestion” flag.
Peter: I’m a fan of WebTransport and I would like to see it
solve both P2P and CC. But it will never have the same interop
with endpoints. So even if WebTransport becomes everything we
want, I think there will still be a need for web apps to have
RTP controls. The reason I mentioned jitter buffer and
packetization is that these are the main thing for somebody to
implement. A high quality jitter buffer is not an easy task. I
have a proposal for that, actually. Part of the detailed design
I have for this addresses how the browser could stop the
application from screwing things up. Related to that, sequence
number reuse: I have a solution for that too. But the overall
question is: do we want to have these discussions?
Randell: All things being equal, if I had a choice, I would
prefer to solve this via WebTransport. Perhaps there are some
cases… but I’d rather solve it there if we can. I want
clarification on whether this can be solved via WebTransport
for the use cases we decide we care about.
Bernard: The IETF will not use WebTransport for peer-to-peer.
This working group decided not to work on QUIC. Anyway the
protocol question is for IETF and it is not clear that they
will solve this.
Jan-Ivar: This would only provide a data channel, which we
already have. So are there peer-to-peer use cases that are not
already solved through existing technology? You can already use
VideoTrackGenerator to send data this way. If there are not
enough use cases this sounds like a premature optimization.
Peter: Regarding use cases - we’ve been talking about them
forever. And we’ve never gotten around to making solid
proposal. It has been far too long, this is a shame. We’re just
talking about talking about talking about use cases. What I’m
proposing is a solution to all of these and more. We’re not
getting anywhere talking about use cases.
Youenn: There was talk about being able to replace WebRTC. It
would be great to collect exactly what people are trying to
solve. If we have enough RTP use cases we should solve them
instead of trying to solve them via other APIs. That should not
be the path forward.
Bernard: This would need some time. We might want to consider
this for TPAC or future meetings.
WebRTC ICE improvements
Sameer: Continuing our discussions on ICE improvements.
… Last time we had 3 proposals. Feedback: Can we split this
into increments? This is what Peter and I have been trying to
do. We harmonized into a single proposal with a lot of common
ground. It meets all the NV requirements.
… Order of incremental improvements…
… 1-3: candidate pair controls
… 4-5: low latency
… 6-9: ICE connectivity checks, controlling timing and
observing checks
… 10-11: candidate gathering
… 12-13: prevent premature pruning or custom pruning
… 14-15: ICE without peer connection and forking
… Peter will talk about the API shapes.
Peter: Cancellable events versus direct control: A and B on the
slides. For today just pick the option you like, we can decide
that later, judge the API as a whole based on the one you like.
… Lots of WebIDL. See slides!
Bernard: If you’re passing an ICE gatherer, I guess you’re
max-bundle?
Peter: If you’re willing to do ICE forking, you’re probably
willing to do max-bundle.
Jan-Ivar: I do think we want to go in this direction. When it
comes to cancellable events, I think their semantics may make
sense; there is a valuable pattern there. On the other hand, if
the application wants to remove a pair not in response to
something, there is value to that too. So ideally both would be
supported.
… It’s hard to imagine how the JavaScript would look when using
all of these, but it does seem to me like a lot of defining
custom events. I imagine we only need to create custom events
in certain cases.
Peter: It might be possible to instead have attributes or
something.
Jan-Ivar: But overall it’s a good direction.
Youenn: It’s a long list of interfaces, so I cannot sign on all
interfaces. But it is good to have a path forward that seems
pretty clear. I am hoping that the first API bits are the ones
that developers will use first, because then we can start
prototyping and shipping more. If the first three adds value we
have more motivation to implement more. So let’s dig into that
and start being nit-picky about the design and so on.
Peter: We made the list in the order that we think the
developers are asking for the most. One option is to really
nail down those three and then work from there.
Sameer: Regarding how the JS looks: for the new proposal I
don’t have a full example yet, but on my GitHub I do have an
example of how the old API looks when used. It is slightly
different of course, but it should give some idea of how this
might look in general.
… Regarding cancellable events versus the other approach: we
would still have explicit methods for specific actions. The
difference is only between how to affect the default behavior:
do you decide at one time or do you decide when the event is
firing?
Jan-Ivar: Reacting to an event versus calling a method seem
like different uses.
Peter: The method to remove a candidate pair, for example,
exists in both proposals. The difference is just with how to
prevent the default action.
… Examples. Get down and dirty.
RESOLUTION: The working group supports the direction this is
taking. Sameer/Peter to create PRs to solve things piece by
piece, starting with the most important use cases (first few
bullet points). As we progress we expect to see reasons to
continue down the list and spec + implement more.
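The difference between the two API shapes under discussion can be sketched with a plain EventTarget. Every name here (candidatepairremove, the prune logic) is a hypothetical placeholder, not the proposed WebIDL from the slides.

```javascript
// Option A, "cancellable event": the default action (pruning a candidate
// pair) happens unless a listener calls preventDefault(). The explicit
// method of option B would exist in both shapes; only the way to affect
// the default behavior differs, as Sameer notes.
function pruneCandidatePair(controller, pair) {
  const event = new Event('candidatepairremove', { cancelable: true });
  event.candidatePair = pair; // illustrative payload on a plain Event
  // dispatchEvent returns false if a listener called preventDefault().
  if (controller.dispatchEvent(event)) {
    controller.pruned.push(pair);
    return true;
  }
  return false; // a listener vetoed the removal
}
```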
[39]playoutDelay (Jan-Ivar)
[39] https://github.com/w3c/webrtc-extensions/issues/156
[40][Slide 81]
[40]
https://lists.w3.org/Archives/Public/www-archive/2023Apr/att-0001/WEBRTCWG-2023-04-18.pdf#page=81
Jan-Ivar: Chrome has implemented this as “playoutDelayHint”.
Multiple issues have been raised. Firefox is now trying to
implement and address issues. Questions:
… * Jitter buffer delay OR jitter buffer plus playout delay?
… * Milliseconds versus seconds?
… * How to test?
… * The positive goal should be jitter-free media.
… Delay is a measure of a negative side-effect; it is vague. It
makes it hard to test and confusing for implementers. Chrome is
inconsistent.
[41][Slide 82]
[41]
https://lists.w3.org/Archives/Public/www-archive/2023Apr/att-0001/WEBRTCWG-2023-04-18.pdf#page=82
Jan-Ivar: Proposal: jitterBufferDelay = value in milliseconds.
Let the application compare this to getStats.
Henrik: I like everything you are presenting and think this is
how we should have done it in the first place. But I just want
to point out that I think the jitter buffer issue in Chrome is
a bug in getStats() rather than a bug in playoutDelay, so if
this is essentially already implemented under a different name,
is it really worth the migration effort to change name versus
changing the spec?
Jan-Ivar: It also says to throw an exception which Chrome
doesn’t do. If you set this to 20 seconds that would change
what this API can be used for. WebRTC is used for real-time and
20 seconds is not real time.
Henrik: My understanding is that Chrome clamps. So if you set
to 20 it might not throw but you still only get 4 seconds, or
whatever the max value is. I think. I’m not saying it’s better
but overall it seems like the difference between the spec and
the implementation is rather nit picky and I’m wondering if it
is worth having to migrate.
Jan-Ivar: But hint sounds like this is optional. We want to
have control surfaces that we can test.
Henrik: I agree it’s just if it’s worth the extra effort. I
mean, I’m not going to object if we go down this route but. I
have a feeling this will come back on my table. Heh.
Jan-Ivar: Anyone else? So is that it?
RESOLUTION: Let’s go in Jan-Ivar’s direction. Jan-Ivar to make
PRs.
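The application-side comparison Jan-Ivar proposes might look like this, using the existing jitterBufferDelay / jitterBufferEmittedCount fields from webrtc-stats. The settable attribute itself is still being named, so it is not shown.

```javascript
// Sketch: compute the average jitter buffer delay actually experienced,
// from the cumulative inbound-rtp counters in getStats(). An app would
// compare this against the target it requested.
function averageJitterBufferDelayMs(inboundRtpStats) {
  const { jitterBufferDelay = 0, jitterBufferEmittedCount = 0 } = inboundRtpStats;
  // jitterBufferDelay is total seconds summed over all emitted samples.
  return jitterBufferEmittedCount
    ? (jitterBufferDelay / jitterBufferEmittedCount) * 1000
    : 0;
}
```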
[42]Issue 39: Solve user agent camera/microphone double-mute
[42] https://github.com/w3c/mediacapture-extensions/issues/39
[43][Slide 85]
[43]
https://lists.w3.org/Archives/Public/www-archive/2023Apr/att-0001/WEBRTCWG-2023-04-18.pdf#page=85
Youenn: The user agent can pause the camera in Safari. This
affects the muted state. But applications tend to implement
their own mute function, so it would be good if there was a way
for the website and UA to sync.
[44][Slide 86]
[44]
https://lists.w3.org/Archives/Public/www-archive/2023Apr/att-0001/WEBRTCWG-2023-04-18.pdf#page=86
Youenn: OS side indicator is widely deployed in the UX. So when
websites really want to mute they tend to stop() the track
rather than mute.
… OS level microphone: application tends to not stop() the
track in this case for speech detection (“are you speaking?”
hint)
… Proposal: allow application to request to mute/unmute
capture.
[45][Slide 87]
[45]
https://lists.w3.org/Archives/Public/www-archive/2023Apr/att-0001/WEBRTCWG-2023-04-18.pdf#page=87
Youenn: It could be placed on the MediaStreamTrack,
InputDeviceInfo or Navigator. I would personally prefer on the
MediaStreamTrack or InputDeviceInfo.
… Thoughts?
Jan-Ivar: I’m supportive but there are some privacy concerns. I
think the privacy concerns need to be met which I think means
requiring user activation. My preference would be to put it on
the track. Bikeshed preference would be to simply call it
mute() instead of requestMute().
Youenn: Maybe the muting happened a day ago; the user may need
to accept a prompt again. This is why it returns a promise.
Jan-Ivar: What is the use case of mute?
Youenn: People tend to clone tracks; what you actually want to
do is mute the source. It is error prone to hunt down all the
tracks.
Jan-Ivar: If I’m transferring a track and there is a clone, I
could also request to mute?
Youenn: Muting one could mute all clones. If a website wants to
mute, it is good if we could do that roughly at the source
level. Hence the InputDeviceInfo proposal.
Jan-Ivar: I want it on the track, and to focus on mute. Not as
clear about unmute.
RESOLUTION: No objections.
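The clone problem Youenn describes can be sketched as follows: today an app must hunt down every clone itself, which is what a source-level requestMute() (a name from the slides, not a shipped API) would avoid. The sourceId property here is an illustrative stand-in; real code might compare getSettings().deviceId.

```javascript
// Sketch of today's workaround: app-level mute must touch every clone of
// the same capture source. "sourceId" is a hypothetical stand-in for
// identifying tracks that share a source.
function muteAllClones(tracks, sourceId) {
  const muted = [];
  for (const track of tracks) {
    if (track.sourceId === sourceId) {
      // App-level "mute": frames are blanked but the track stays live,
      // unlike stop(), which would also clear the OS capture indicator.
      track.enabled = false;
      muted.push(track.id);
    }
  }
  return muted;
}
```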
Summary of resolutions
1. [46]No objection, let’s include audio.
2. [47]No objection, Fippo to make PRs: merge on main spec if
implementation is provided or provisional spec if no
implementation is provided.
3. [48]Waiting for Media Working Group.
4. [49]Moving them to MediaStreamTrack makes sense, Henrik to
write a PR.
5. [50]The working group supports the direction this is
taking. Sameer/Peter to create PRs to solve things piece by
piece, starting with the most important use cases (first few
bullet points). As we progress we expect to see reasons to
continue down the list and spec + implement more.
6. [51]Let’s go in Jan-Ivar’s direction. Jan-Ivar to make PRs.
7. [52]No objections.
Received on Tuesday, 25 April 2023 15:13:45 UTC