- From: Dominique Hazael-Massieux <dom@w3.org>
- Date: Thu, 18 Sep 2025 14:13:48 +0200
- To: "public-webrtc@w3.org" <public-webrtc@w3.org>
Hi,

The minutes of our call held on Tuesday are available at:
https://www.w3.org/2025/09/16-webrtc-minutes.html
and copied as text below.

Dom

WebRTC September 2025 meeting

16 September 2025

   [2]Agenda. [3]IRC log.

      [2] https://www.w3.org/2011/04/webrtc/wiki/September_16_2025
      [3] https://www.w3.org/2025/09/16-webrtc-irc

Attendees

   Present
      Carine, DiegoPerezBotero, dom, Fippo, Guido, harald, Jan-Ivar, JasperHugo, KonradHofbauer, NishitaDey, PeterT, SergeySilkin, SteveBecker, SunShin, TimP, TonyHerre, Youenn

   Regrets
      -

   Chair
      Guido, Jan-Ivar, Youenn

   Scribe
      dom

Contents

   1. [4]Audio Output Devices API
      1. [5]Ask for user gesture to call setSinkId #84
      2. [6]Should media capture output define an explicit default speaker device? #151
      3. [7]Expose the type of device in MediaDeviceInfo #1
      4. [8]Speaker devices may not always work with all microphones #149
   2. [9]Decoder exposure and software fallback
   3. [10]generateKeyFrame() API consolidation (Jan-Ivar)
      1. [11]Issue #273 / PR #274: Remove sender.generateKeyFrame()
      2. [12]Issue #147: expose rid as metadata on outgoing frames
      3. [13]PR #276: Default the generate key frame algorithm to all layers
      4. [14]Issue 143: should transform.generateKeyFrame() take an array of rids?
   4. [15]RTCDataChannel (SDP and stats)
      1. [16]Always negotiate datachannels
      2. [17]What is the lifetime of stats?
      3. [18]data channel ids set before SCTP init #3071
   5. [19]Bring Your Own Degradation Adaptation
   6. [20]SFrameEncrypterStream rename
   7. [21]Summary of resolutions

Meeting minutes

   Slideset: [22]https://docs.google.com/presentation/d/11rr8X4aOao1AmvyoDLX8o9CPCmnDHkWGRM3nB4Q_104/ and [23]archived PDF copy

      [22] https://docs.google.com/presentation/d/11rr8X4aOao1AmvyoDLX8o9CPCmnDHkWGRM3nB4Q_104/

[24]Audio Output Devices API

      [24] https://github.com/w3c/mediacapture-output/

Ask for user gesture to call setSinkId [25]#84

      [25] https://github.com/w3c/mediacapture-output/issues/84

   [26][Slide 10]

      [26] https://docs.google.com/presentation/d/11rr8X4aOao1AmvyoDLX8o9CPCmnDHkWGRM3nB4Q_104/#10

   Jan-Ivar: a bit concerned that this adds variance among implementations and might lead to compat issues
   … what's the motivation for this change?

   Youenn: similar to gUM or gDM: when a web site starts playing audio, we put a user activation check - which I think the media spec requires
   … setSinkId is starting to play audio on a given speaker - it seems logical to have a user gesture check there as well

   Harald: the most common use case where a device is no longer available is when people unplug their earbuds
   … when they do and the app notices, will there be a user activation event available or not?

   Youenn: in that situation in Safari, setSinkId would be available given the devicechange event

   Harald: so a devicechange event counts as user activation in this case?

   Youenn: that's how we implemented it in Safari
   … and we propagate to the async enumerateDevices call as well since it's likely to be called afterwards

   Harald: so harmless if it includes the devicechange event

   Youenn: that's why the spec should allow it; we think we've solved web compat concerns, we'll see if more emerge

   Fippo: a common use case is to play the ring tone on the speaker and the call on the headset - would that still be supported?
   Youenn: in a call, you call gUM, you get the devices and then call setSinkId - microphone starting also counts as activation in our heuristics
   … When we call gUM, the enumerateDevices list is changing with a devicechange event, which is used as the trigger to do the device setup

   Jan-Ivar: so user activation or devicechange event?

   Youenn: that's what we've implemented

   Jan-Ivar: how does this deal with multiple gUM?

   Youenn: this hasn't been a concern so far

   Jan-Ivar: I'm not sure Firefox would be able to implement this; I'm also not a big fan of SHOULD

   TimP: I'm supportive of this change, with a bit more work on details
   … as it can protect against misuses

   Dom: what would it take to turn this into a MUST?

   Youenn: I thought of making this a first step and getting implementation experience before requiring it

   Guido: I'm ok with adding it as a MAY, but making it a MUST will require a more extensive list of circumstances

   Youenn: I can start with a MAY and open a separate issue to make it stronger

   RESOLUTION: Proceed with a MAY require user activation

Should media capture output define an explicit default speaker device? [27]#151

      [27] https://github.com/w3c/mediacapture-output/issues/151

   [28][Slide 11]

      [28] https://docs.google.com/presentation/d/11rr8X4aOao1AmvyoDLX8o9CPCmnDHkWGRM3nB4Q_104/#11

   [29][Slide 12]

      [29] https://docs.google.com/presentation/d/11rr8X4aOao1AmvyoDLX8o9CPCmnDHkWGRM3nB4Q_104/#12

   Jan-Ivar: I agree the situation is unfortunate for Web compat; is this only for speakers, or also for microphone?
   Youenn: the problem seems less prominent for microphones

   Jan-Ivar: the problem with that approach is that it clashes with the rest of the spec in terms of the devicechange event

   Youenn: if you want the default, you call setSinkId("")

   Jan-Ivar: but this doesn't let you detect when the default OS device changes

   Youenn: we've done that change for webcompat and it seems to be working well

   Guido: the default for the output device is a good idea; I think Firefox does it in some circumstances with selectAudioOutput

   Jan-Ivar: yes, we have it in specific conditions for selectAudioOutput

   Youenn: default speaker is also a widely used concept across OS

   Jan-Ivar: Firefox needs to improve its compat on devicechange event in any case; so I'm in favor if we can clarify the situation with devicechange event

   Youenn: ready for PR then?

   Jan-Ivar: we can continue on the issue but not opposed to a PR

   RESOLUTION: proceed with proposed change with additional discussion expected

Expose the type of device in MediaDeviceInfo [30]#1

      [30] https://github.com/w3c/mediacapture-output/issues/1

   [31][Slide 13]

      [31] https://docs.google.com/presentation/d/11rr8X4aOao1AmvyoDLX8o9CPCmnDHkWGRM3nB4Q_104/#13

   Jan-Ivar: LGTM, doesn't seem to bring privacy issues since they're only exposed when the device is exposed
   … should there be a headset category?

   Youenn: we can try this

   Harald: I'm a bit worried about the specifics of the enumeration
   … e.g. many mics are USB even if they're built-in

   Youenn: I can bring more info on what Windows / MacOS expose

   RESOLUTION: Proceed with a pull request with additional discussion on enumerated values

Speaker devices may not always work with all microphones [32]#149

      [32] https://github.com/w3c/mediacapture-output/issues/149

   [33][Slide 14]

      [33] https://docs.google.com/presentation/d/11rr8X4aOao1AmvyoDLX8o9CPCmnDHkWGRM3nB4Q_104/#14

   Guido: what would be the alternative to failing gUM/setSinkId?
   Youenn: I don't have a specific proposal; I was first trying to get a sense of whether that's a problem worth fixing (e.g. if it affects other OS)

   Guido: maybe let's focus first on identifying how widespread an issue it is

   Jan-Ivar: this reminds me of the issue where you can't open multiple mics on phones; don't have a good solution off the top of my head either

[34]Decoder exposure and software fallback

      [34] https://github.com/w3c/webrtc-extensions/issues/146

   [35][Slide 17]

      [35] https://docs.google.com/presentation/d/11rr8X4aOao1AmvyoDLX8o9CPCmnDHkWGRM3nB4Q_104/#17

   [36][Slide 18]

      [36] https://docs.google.com/presentation/d/11rr8X4aOao1AmvyoDLX8o9CPCmnDHkWGRM3nB4Q_104/#18

   [37][Slide 19]

      [37] https://docs.google.com/presentation/d/11rr8X4aOao1AmvyoDLX8o9CPCmnDHkWGRM3nB4Q_104/#19

   [38][Slide 20]

      [38] https://docs.google.com/presentation/d/11rr8X4aOao1AmvyoDLX8o9CPCmnDHkWGRM3nB4Q_104/#20

   [39][Slide 21]

      [39] https://docs.google.com/presentation/d/11rr8X4aOao1AmvyoDLX8o9CPCmnDHkWGRM3nB4Q_104/#21

   Youenn: the Privacy Working Group had raised concerns - we should ask them if we're looking at this again
   … if media capabilities already provide that info through polling - is that good enough?
   … polling instead of an event might be a feature here

   Nishitha: media capabilities doesn't expose what is actually happening during streaming

   Diego: MC signals the potential to have the hardware decode enabled, but it doesn't say if it is happening
   … e.g. we can't get info on situations where MC says there is hardware decode but we're not seeing it used
   … there are software-fallback situations where errors occur that can't be monitored via telemetry

   Youenn: MC solves hardware support, but not temporary lack of availability. maybe it's a shortcoming of MC?
   Diego: streams get negotiated during SDP O/A with the codec profile, format and characteristics, and these can lead to a decoder giving up in the middle of the stream; MC requires predicting all possible cases to detect these situations, whereas this event could give much more specific direction

   Youenn: if the Privacy Working Group is fine with this, I'm fine too; but these issues might arise in WebCodecs as well, so having a single solution would be nice

   Jan-Ivar: I understand the event proposal comes from feedback from the Privacy WG (vs stats)
   … It's not really clear what fallback means
   … e.g. would that event be fired in a system without hardware decode support?
   … there are also situations (e.g. small frames) where software decode wouldn't be a sign of a problem
   … in terms of API shape, I prefer B rather than A that makes it harder to distinguish fatal errors
   … supportive of direction but with more clarification on situation of failures

   TimP: supportive of this, but less supportive of the "fallback" concept and the hardware/software dichotomy
   … I think the event we want is "decoder implementation changed"
   … I think what we really care about is latency, not whether it's hardware or software
   … it would be nice to have stats on average frame decode time if we don't have one

   Fippo: we do

   TimP: then trigger on implementation change + stats would work

   Diego: detecting device type is really hard for (good) privacy protection reasons, so we can't really figure the characteristics of the devices on which the stream is running, in particular to detect regressions

   TimP: but knowing the decoder has changed under your feet, would that help?

   Diego: CPU decode is not only about latency: it also has battery and thermal impact
   … the event would be useful, but less useful

   Dom: if we want to do this, we should do this and get feedback from Privacy WG
   … events can add privacy attacks by surfacing on two different origins

   Harald: why on Transceiver vs Receiver?
   Fippo: +1 to Receiver

   RESOLUTION: discuss proposal in more depth and prepare for Privacy review

[40]generateKeyFrame() API consolidation (Jan-Ivar)

      [40] https://github.com/w3c/webrtc-encoded-transform/

Issue [41]#273 / PR [42]#274: Remove sender.generateKeyFrame()

      [41] https://github.com/w3c/webrtc-encoded-transform/issues/273
      [42] https://github.com/w3c/webrtc-encoded-transform/issues/274

   [43][Slide 25]

      [43] https://docs.google.com/presentation/d/11rr8X4aOao1AmvyoDLX8o9CPCmnDHkWGRM3nB4Q_104/#25

   TimP: I don't like the first API - encoding parameters should be less dynamic than that, this is not an encoding parameter; the second API makes much more sense

   Jan-Ivar: the argument why we went for this API is that it allows combining a change of all parameters with sending a keyframe at the same time

   RESOLUTION: Proceed with removing unimplemented API

Issue [44]#147: expose rid as metadata on outgoing frames

      [44] https://github.com/w3c/webrtc-encoded-transform/issues/147

   [45][Slide 26]

      [45] https://docs.google.com/presentation/d/11rr8X4aOao1AmvyoDLX8o9CPCmnDHkWGRM3nB4Q_104/#26

   Fippo: we would like the encoding index in addition to the rid; we would like the mid since it isn't available in workers

   Jan-Ivar: the mid can be passed as an option

   Youenn: or in the Transformer itself
   … there will be one per mid
   … not exposing it in frames makes it more lightweight
   … we should file an issue on this

   Jan-Ivar: adding an encodingIndex can also be filed as an issue

   Fippo: having it in addition to rid has ergonomic value

   RESOLUTION: Proceed with pull request

PR [46]#276: Default the generate key frame algorithm to all layers

      [46] https://github.com/w3c/webrtc-encoded-transform/issues/276

   [47][Slide 27]

      [47] https://docs.google.com/presentation/d/11rr8X4aOao1AmvyoDLX8o9CPCmnDHkWGRM3nB4Q_104/#27

   Youenn: the main use case is changing the encryption key in which case you want to generate keyframes for all layers

   RESOLUTION: proceed with merging PR

Issue 143: should transform.generateKeyFrame() take an array of rids?

   [48][Slide 28]

      [48] https://docs.google.com/presentation/d/11rr8X4aOao1AmvyoDLX8o9CPCmnDHkWGRM3nB4Q_104/#28

   [49][Slide 29]

      [49] https://docs.google.com/presentation/d/11rr8X4aOao1AmvyoDLX8o9CPCmnDHkWGRM3nB4Q_104/#29

   [50][Slide 30]

      [50] https://docs.google.com/presentation/d/11rr8X4aOao1AmvyoDLX8o9CPCmnDHkWGRM3nB4Q_104/#30

   Youenn: no strong preference, but a slight preference to keep it as is since it matches the design requirements (encryption, per-rid keyframe); not sure what the use cases would be for different subsets; it adds complexity (e.g. what happens if one is invalid)

   Fippo: there are use cases which only require 2 layers, e.g. on this call

   Youenn: but this is a use case for Transformer - we have setParameters otherwise

   Fippo: what would be the return value? Originally, it returned a timestamp which wouldn't work for an array

   Jan-Ivar: already changed to undefined

   Dom: let's leave it as is; we can change it to DOMString or Array if there is an important use case

   RESOLUTION: Leave current API with single DOMString argument

RTCDataChannel (SDP and stats)

[51]Always negotiate datachannels

      [51] https://github.com/w3c/webrtc-extensions/issues/241

   [52][Slide 33]

      [52] https://docs.google.com/presentation/d/11rr8X4aOao1AmvyoDLX8o9CPCmnDHkWGRM3nB4Q_104/#33

   Jan-Ivar: the problem is that BUNDLE attaches to the first m-line by default?

   Fippo: right; since datachannels can't be rejected, they're the right target for BUNDLE

   Jan-Ivar: thanks, makes sense to me

   Youenn: +1
   … would be good to look into a JSEP revision given this is a second item on the revision list

   Fippo: I have a bunch of issues against JSEP, I can talk with Justin on a third revision

   Jan-Ivar: that'd be great
   … are there any concerns on compat issues?

   Fippo: sounds unlikely

   Dom: let's file a JSEP issue at the same time

   RESOLUTION: Proceed with PR and JSEP issue

[53]What is the lifetime of stats?

      [53] https://github.com/w3c/webrtc-stats/issues/805

   [54][Slide 34]

      [54] https://docs.google.com/presentation/d/11rr8X4aOao1AmvyoDLX8o9CPCmnDHkWGRM3nB4Q_104/#34

   [55][Slide 35]

      [55] https://docs.google.com/presentation/d/11rr8X4aOao1AmvyoDLX8o9CPCmnDHkWGRM3nB4Q_104/#35

   Youenn: will changes there create web compat issues? hopefully there is still room for making the right decisions

   Fippo: I have web compat concerns for inboundrtp, we'll see

   Jan-Ivar: thanks a lot for the analysis, showing diversity across stats, implementations (not clear what the spec asks for)
   … when implementations have a shared behavior, that's hopefully a good direction to go
   … +1 to documenting and cleaning as much as web compat enables

   Fippo: the problem is creating stat objects before the relevant object is indeed created
   … Documenting rules and their motivation would be good, before seeing what we can change

   Jan-Ivar: another parameter to take into account is rollback

   Harald: there was a specific situation with candidate pair that some pairs contain ip addresses that are considered sensitive
   … I don't know if that impacts when they're exposed

   Fippo: they're hidden, so it shouldn't matter

   Harald: the number of outgoing datachannels you create should be equal to the number reflected in stats; I would be in favor of having them show up early

   Fippo: let's document the behavior and then disagree on the right one :)

   TimP: I'm happy with it being early; it's unpleasant but necessary

   Jan-Ivar: having it late means having it be more useful; more generally, for early stats, we should be clear on what data they expose

   Fippo: will report on this at the next meeting

[56]data channel ids set before SCTP init #3071

      [56] https://github.com/w3c/webrtc-pc/issues/3071

   [57][Slide 36]

      [57] https://docs.google.com/presentation/d/11rr8X4aOao1AmvyoDLX8o9CPCmnDHkWGRM3nB4Q_104/#36

   TimP: in theory, you could request more datachannels than the other party would accept

   Fippo: good point, we need to look into that

   Jan-Ivar: another aspect is workers
   … if all browsers do the same thing, documenting it sounds good to me

   RESOLUTION: Proceed with a PR

[58]Bring Your Own Degradation Adaptation

      [58] https://github.com/w3c/mst-content-hint/issues/62

   [59][Slide 39]

      [59] https://docs.google.com/presentation/d/11rr8X4aOao1AmvyoDLX8o9CPCmnDHkWGRM3nB4Q_104/#39

   [60][Slide 40]

      [60] https://docs.google.com/presentation/d/11rr8X4aOao1AmvyoDLX8o9CPCmnDHkWGRM3nB4Q_104/#40

   [61][Slide 41]

      [61] https://docs.google.com/presentation/d/11rr8X4aOao1AmvyoDLX8o9CPCmnDHkWGRM3nB4Q_104/#41

   Jan-Ivar: SGTM

   Fippo: +1
   … would it make sense to expose the QP on the insertable stream as well? (rather than using getStats)

   Sergey: exposing QP per frame sounds good

   Guido: please file an issue

   TimP: also in favor; as Fippo said, there may be more data needed outside of stats

   Fippo: maintain-framerate / maintain-resolution - this adds the 3rd point of the triangle (with balance in the center)

   Guido: there is an existing PR where the conversation can continue

   RESOLUTION: proceed with PR

[62]SFrameEncrypterStream rename

      [62] https://github.com/w3c/webrtc-encoded-transform/issues/262

   [63][Slide 44]

      [63] https://docs.google.com/presentation/d/11rr8X4aOao1AmvyoDLX8o9CPCmnDHkWGRM3nB4Q_104/#44

   Youenn: the behavior is not undefined
   … that isn't to say having different objects would be useful - e.g. for decrypting/encrypting
   … initially, one object for everything was sufficient

   Harald: initially, we thought SFrameTransform would be added as a first or last step in a chain of transforms
   … if we're abandoning that model (which I think we should since nobody has implemented it), we should look at how to apply it to a sender/receiver

   [64][Slide 45]

      [64] https://docs.google.com/presentation/d/11rr8X4aOao1AmvyoDLX8o9CPCmnDHkWGRM3nB4Q_104/#45

   Youenn: +1
   … we will be able to add key management APIs dedicated to decryption and encryption
   … we might be able to duplicate these APIs in SFrameTransform as well
   … letting the UA do the encryption sounds like a good thing in general

   Harald: I would like to see an example of ScriptTransform and SFrameTransform together

   Jan-Ivar: see slide

   Harald: that makes sense

   Jan-Ivar: that wouldn't work for SPacket though

   Harald: SGTM

   RESOLUTION: Proceed with PR

Summary of resolutions

   1. [65]Proceed with a MAY require user activation
   2. [66]proceed with proposed change with additional discussion expected
   3. [67]Proceed with a pull request with additional discussion on enumerated values
   4. [68]discuss proposal in more depth and prepare for Privacy review
   5. [69]Proceed with removing unimplemented API
   6. [70]Proceed with pull request
   7. [71]proceed with merging PR
   8. [72]Leave current API with single DOMString argument
   9. [73]Proceed with PR and JSEP issue
   10. [74]Proceed with a PR
   11. [75]proceed with PR
   12. [76]Proceed with PR
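
As a rough illustration of how resolutions 7 and 8 fit together (key frames default to all layers; the argument stays a single DOMString), the layer selection could be sketched as below. This is only a sketch under assumptions: selectKeyFrameLayers and its activeRids parameter are hypothetical names that appear in no spec, and the error thrown for an unknown rid is a placeholder rather than the behavior the spec defines.

```typescript
// Hypothetical helper (not part of any spec): decides which simulcast layers
// should produce a key frame for an optional single rid argument.
function selectKeyFrameLayers(
  activeRids: readonly string[],
  rid?: string
): string[] {
  // No rid given: target every layer, matching the default agreed for PR #276
  // (e.g. after changing the encryption key).
  if (rid === undefined) {
    return activeRids.slice();
  }
  // Assumption: an unknown rid is treated as an error; the concrete error
  // type used here is a placeholder, not taken from the minutes.
  if (activeRids.indexOf(rid) === -1) {
    throw new Error("unknown rid: " + rid);
  }
  // A single rid selects exactly one layer (the argument is one DOMString,
  // not an array, per the Issue 143 resolution).
  return [rid];
}

// Usage with three illustrative simulcast layers.
const exampleRids = ["q", "h", "f"];
const allLayers = selectKeyFrameLayers(exampleRids); // ["q", "h", "f"]
const oneLayer = selectKeyFrameLayers(exampleRids, "h"); // ["h"]
```

An application that needs key frames for a subset of layers would, under this shape, call the API once per rid rather than pass an array.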
Received on Thursday, 18 September 2025 12:13:50 UTC