- From: Dominique Hazael-Massieux <dom@w3.org>
- Date: Wed, 11 Dec 2024 08:52:52 +0100
- To: "public-webrtc@w3.org" <public-webrtc@w3.org>
Hi,

The minutes of our meeting held yesterday are available at:
https://www.w3.org/2024/12/10-webrtc-minutes.html
and copied as text below.

Dom

WebRTC December 2024 meeting

10 December 2024

[2]Agenda. [3]IRC log.

[2] https://www.w3.org/2011/04/webrtc/wiki/December_10_2024
[3] https://www.w3.org/2024/12/10-webrtc-irc

Attendees

Present
Bernard, Carine, Guido, Harald, Jan-Ivar, Peter, SunShin, TimP, Tove, Youenn

Regrets
-

Chair
Bernard, Guido, Jan-Ivar

Scribe
dom

Contents

1. [4]Captured Surface Switching
   1. [5]Auto-pause
   2. [6][screen-share] Issue 308: Should screen capture tracks expose deviceId?
   3. [7]Back to Auto-pause
   4. [8]Cross-type surface switching
2. [9]Timing info for Encoded Frames
3. [10][webrtc] Issue 3014: Spec says to send black frames for ended tracks
4. [11][mediacapture-extensions] Reduce scope of MediaStreamTrack transfer to DedicatedWorker for now
5. [12]Issue 115: What is the expected timing of MSTP video frame enqueuing with other track events
6. [13][mediacapture-output] Issue 147: Implicit consent via getUserMedia should allow access to non-miked speakers
7. [14]Summary of resolutions

Meeting minutes

Slideset: [15]https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0 ([16]archived PDF copy)

[15] https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0

Captured Surface Switching

[17][Slide 10]
[17] https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#10

[18]Auto-pause
[18] https://github.com/w3c/mediacapture-screen-share-extensions/issues/15

[19][Slide 12]
[19] https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#12
[20][Slide 13]
[20] https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#13
[21][Slide 14]
[21] https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#14
[22][Slide 15]
[22] https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#15
[23][Slide 16]
[23] https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#16
[24][Slide 17]
[24] https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#17
[25][Slide 18]
[25] https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#18
[26][Slide 19]
[26] https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#19

Youenn: re slide 17, let's take the example of a Web app wanting to pause to ask the user to confirm its intent to share the content
… you want to pause sending to remote parties, but want to keep the local view of the shared content
… you need different behaviors for the two

Tove: with the callback, you can do the same thing as with the event

Youenn: but that shows that freezing the video would be hack-ish in both situations
… in a PC, you can use setParameters() or replaceTrack() to handle this
… if you're in a call with a local preview, with a track in the worker sent to the remote party
… you want to stop sending the track in the worker
… the guarantee you get from the event makes it a better API; you may not even need to pause the track
… I prefer the event; I would be OK with adding the callback, although I'm not sure about having it async - maybe the cropTarget use case makes it compelling
… but I would want to understand it better
… otherwise I would prefer a sync callback
… in terms of ordering, it would be better to first fire the event (ensuring the settings are updated) so that the callback can then access accurate information

Jan-Ivar: the UI use case can be solved with either API; the only need for async would be for replaceTrack or cropTo
… configurationchange exists today, mostly to adapt the consumer of a track to its properties, with throttling to avoid correlating across origins
… proposal 1 feels simpler
… also, deviceId is not currently specified, so it wouldn't be exposed via configurationchange at the moment
… also, the event may not be clear about its causes, which might create unexpected bugs for developers who don't handle all situations

Tove: the configurationchange event is only for a few cases today

Youenn: the event should be fired whenever the settings are changed without involvement from the application
… e.g. when the user agent or the OS enables blur - the app should be able to keep track of that
… this matches what happens when a surface changes (outside of the Web app's control)
… re fuzzing, I don't think it's useful in this context - we should remove the related note
… when you're processing in the worker, having the callback in `window` makes it painful since you need to `postMessage` back and forth
… if we agree on `deviceId` and ordering, it seems cheap to support the `configurationchange`

Tove: so you're supporting both approaches?

Youenn: I'm OK with it - but still unsure about the callback being async (need to discuss the cropTarget use case more)
… I'm reluctant to add a new state where frames aren't emitted outside of `track.enabled` - but would consider it if necessary

Jan-Ivar: at the moment, screen capture tracks don't expose deviceIds
… if we decide later today to expose it, this would require firing `configurationchange`
… we're set to discuss it later on this agenda

Youenn: `deviceId` would be a small addition that would help with detecting surface changes - we will indeed discuss it later

Tove: so if we conclude on adding it, we would go with both?

Youenn: if we do need an async callback, having just the callback would be fine

Jan-Ivar: I'm not fond of having both as they feel redundant

Youenn: having both (if we go in the right order) might queue two or three tasks when dealing with multiple tracks, but the resulting delay shouldn't be meaningful
… (this doesn't apply in the worker case since no synchronization is needed in that case)
… is there agreement about firing the event before the callback, or is that too early to decide?
… today, you already get an event when switching from screen to window

Tove: the spec isn't very clear on the circumstances under which the configurationchange event should fire

Jan-Ivar: I think the algorithm refers explicitly to capabilities and settings
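[A minimal sketch of the configurationchange approach discussed above, assuming `track` is the screen-capture MediaStreamTrack and `sender` the RTCRtpSender carrying it; `confirmShareWithUser()` is a hypothetical app-level prompt:]

    track.addEventListener("configurationchange", async () => {
      // Stop sending to the remote party while the app re-confirms intent;
      // the local preview keeps rendering `track`.
      await sender.replaceTrack(null);
      console.log("updated settings:", track.getSettings());
      const ok = await confirmShareWithUser(); // hypothetical prompt
      if (ok) {
        await sender.replaceTrack(track); // resume sending
      } else {
        track.stop();
      }
    });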
[27][screen-share] Issue 308: Should screen capture tracks expose deviceId?
[27] https://github.com/w3c/mediacapture-screen-share/issues/308

[28][Slide 33]
[28] https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#33

Youenn: not a huge use case, but low complexity and no specific drawback that I could identify

Jan-Ivar: my main concern is that `deviceId` wouldn't match how it is used in other specs

Youenn: `deviceId` is identifying the source

Jan-Ivar: the chapterizing use case would be better done with labels from my perspective; it makes assumptions about how one would go about this
… I think that leaves configurationchange as the main use case

Guido: the fact that `deviceId` doesn't change for a camera doesn't invalidate the idea of using it for a possibly changing source
… I like the idea of exposing it and signaling its change with configurationchange

Dom: exposing `deviceId` sounds reasonable to me too

Jan-Ivar: ok, since I don't have a strong reason to object to it, I think we probably have consensus on adding it

RESOLUTION: Consensus to add `deviceId` to settings of a track

Youenn: I'll check how this impacts canvas-sourced tracks (and how they behave today)

[29]Back to Auto-pause
[29] https://github.com/w3c/mediacapture-screen-share-extensions/issues/15

Tove: so Youenn, you asked about the async callback use case with regard to cropTarget?

Youenn: is the use case that you're cropTargetting in a VC, you switch surface and then you want to pause both local/remote tracks before deciding whether to re-crop?

Tove: imagine switching to a tab with slides and speaker notes; you'll want cropTo to resolve before sending frames with the additional content

Youenn: in that use case, there is no local preview? If so, why not set `track.enabled` to false, or stop it on the PC?

Tove: there may be use cases where this would also apply to a local track - keeping the two tracks in sync as part of the callback feels cleaner

Jan-Ivar: I'm a fan of simplicity - if we have the configurationchange event, I would prefer to only have the event API, not also the callback

Guido: I think we can start with `deviceId`, and if additional use cases show that having both would be beneficial, we can revisit it

RESOLUTION: proceed with configurationchange event as the way to manage surface switching handling

[30]Cross-type surface switching
[30] https://github.com/w3c/mediacapture-screen-share-extensions/issues/16

[31][Slide 20]
[31] https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#20

Tove: this would be a hint more than a hardset requirement

Jan-Ivar: what is the use case for not allowing surface switching, beyond backwards compat?

Tove: it's only for backwards compat

Jan-Ivar: if so, I would rather not add an API just for that

Tove: if an app expects only a tab, the app may not expect that they need to update their UI (e.g. cropTarget is no longer possible)

Jan-Ivar: cropTo would fail, which is probably what you want

Youenn: I would look at what OSes do; on macOS, it's the OS controlling the UI to pick surfaces (not the UA)
… I haven't checked if there is an option to do that on macOS - if it's not doable at the OS level, then it won't be implementable there, in which case I would be reluctant

Tove: this is exposed in the macOS API

Youenn: OK - if it's implementable, and if native apps find it useful, that suggests this is a useful API to consider
… it would definitely need to be a hint, and make it possible for the UA/user to override
… I would use include as the default

Tove: I'm hearing that stronger examples are needed for "exclude"

Jan-Ivar: if it's a hint, maybe backwards compat doesn't need to be standardized

Harald: we had a similar situation with the Plan B transition

Harald: so the conclusion is that cross-type surface switching is always on?

Youenn: I'll look into the use cases that supported the macOS API and whether they justify a Web API

[32]Timing info for Encoded Frames
[32] https://github.com/w3c/webrtc-encoded-transform/issues/235

[33][Slide 23]
[33] https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#23

[34][Slide 24]
[34] https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#24

[`captureTimestamp` in the slide should be `captureTime`]

[35][Slide 25]
[35] https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#25

Youenn: maybe we could use a single dictionary to have a single definition

Guido: any opposition to adding receiveTime to encoded frames?
… receiveTime would be defined as the timestamp of the last packet necessary to build the video frame, once received

Jan-Ivar: are there cases where said frame wouldn't be transmitted?
… e.g. with WebCodecs + RTCEncodedSource?
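[A minimal sketch of where the per-frame timing discussed above would surface, assuming a worker-side RTCRtpScriptTransform on a receiver; `receiveTime` (and `captureTime`) are names under discussion, not final:]

    // worker script attached via receiver.transform = new RTCRtpScriptTransform(worker)
    onrtctransform = (event) => {
      const { readable, writable } = event.transformer;
      readable
        .pipeThrough(new TransformStream({
          transform(frame, controller) {
            const meta = frame.getMetadata();
            // meta.captureTime — when the frame was captured at the sender (per the slides)
            // meta.receiveTime — proposed: when the last packet needed to build the frame arrived
            controller.enqueue(frame);
          },
        }))
        .pipeTo(writable);
    };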
Guido: the time it reaches the receiver

Harald: things that aren't received shouldn't have a receiveTime

Bernard: this is a good proposal, and I like having it both in WebCodecs and in EncodedFrame

Youenn: not sure I'm convinced about WebCodecs, but will discuss on GitHub

RESOLUTION: proceed with adding receiveTime to Encoded* Metadata

[36][webrtc] Issue 3014: Spec says to send black frames for ended tracks
[36] https://github.com/w3c/webrtc-pc/issues/3014

[37][Slide 29]
[37] https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#29

Youenn: I would align the spec with what implementations are doing; it may not be the best, but it's the current status and apps seem fine with it
… it would be good to create consistency for the last point - maybe raise a separate issue

RESOLUTION: Proceed with aligning with current implementations

[38][mediacapture-extensions] Reduce scope of MediaStreamTrack transfer to DedicatedWorker for now
[38] https://github.com/w3c/mediacapture-extensions/issues/158

[39][Slide 30]
[39] https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#30

Youenn: the VideoFrame object is only exposed in Window and DedicatedWorker, likewise for the MSTP
… it makes sense to reduce this to DedicatedWorker
… Safari is only implementing transfer to a DedicatedWorker in the same cluster
… (same for VideoFrame)
… not sure if that needs to be spelt out

Jan-Ivar: we could look into that, and whether there are precedents for restricting to a cluster

RESOLUTION: Proceed with reducing scope of MediaStreamTrack transfer to DedicatedWorker

[40]Issue 115: What is the expected timing of MSTP video frame enqueuing with other track events
[40] https://github.com/w3c/mediacapture-transform/issues/115

[41][Slide 31]
[41] https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#31

[42][Slide 32]
[42] https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#32

Bernard: regarding the last one, I wouldn't try synchronizing getSettings and videoframe properties

Youenn: that would require exposing all the settings in the video frame

Bernard: maybe not all - e.g. facingMode wouldn't be useful; but some we're already discussing exposing (e.g. blur, width and height)

Jan-Ivar: 2 questions here: timing, and what to expose in videoframe metadata
… focusing on timing for now
… it's good to specify things, but I'm not sure how many synchronous guarantees we can provide given the buffering
… coupling things too tightly might make it challenging to add transforms
… the buffering may lead us to handle e.g. the mute/unmute events differently

Youenn: I'm talking specifically about enqueuing frames, not reading frames from the stream
… because we're enqueuing a task, this provides an opportunity to synchronize things
… if we specify unmute, it seems logical to specify the mute situation as well, symmetrically
… I agree with Bernard's point about getSettings; I'm not sure how to handle applyConstraints or configurationChange

Jan-Ivar: how would we handle buffers, e.g. maxBufferSize = 10?
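[A minimal sketch of the setup the two items above assume: a MediaStreamTrack transferred to a DedicatedWorker and consumed there with MediaStreamTrackProcessor; the `maxBufferSize` value and the mute handler are illustrative only:]

    // main thread (module script)
    const worker = new Worker("processor.js");
    const stream = await navigator.mediaDevices.getUserMedia({ video: true });
    const [track] = stream.getVideoTracks();
    worker.postMessage({ track }, [track]); // the track itself is transferred

    // processor.js (DedicatedWorker)
    onmessage = async ({ data: { track } }) => {
      track.onmute = () => {
        // ordering relative to already-enqueued frames is what issue 115 discusses
      };
      const processor = new MediaStreamTrackProcessor({ track, maxBufferSize: 10 });
      const reader = processor.readable.getReader();
      for (;;) {
        const { value: frame, done } = await reader.read();
        if (done) break;
        frame.close(); // release each VideoFrame promptly
      }
    };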
Youenn: this shouldn't have any impact

Guido: this should be dealt with at the sink level (where buffering happens)
… not sure it needs to have something specific for MSTP - it mostly deals with things outside the track

Youenn: one model we could consider: mute/unmute events are always sent in the same thread; we enqueue tasks for the unmute event
… for the rest, we consider sources single-threaded

Harald: VideoFrames are observable on an HTML video element
… if we get a mute event, and we get a frame that was sent after the mute event, that's clearly an error - we should prevent it
… we can say something like the events have to act as if they were ordered, as if they came from the same event source

Jan-Ivar: I would like to understand better why this is a problem
… focusing on the events might be the wrong approach
… e.g. muting happens in the background, and is only surfaced in JS later
… you could probably write a WPT with VideoTrackGenerator
… my only worry is making things too synchronous
… I'm not sure there is an overall solution; we should look at each case

Youenn: agree about not dealing with getSettings and videoframes
… I can try and detail more situations on GitHub

Jan-Ivar: shouldn't it apply to track.enabled as well?

Youenn: I should look into it

Bernard: re configurationchange and applyConstraints - are we clear on the effect of these events on videoframe properties?
… otherwise, I agree with Jan-Ivar on the risk of being too prescriptive
… we shouldn't make this based on the timing of the events, but instead base it on videoframe properties

Youenn: ok, so more discussion is needed on GitHub

[43][mediacapture-output] Issue 147: Implicit consent via getUserMedia should allow access to non-miked speakers
[43] https://github.com/w3c/mediacapture-output/issues/147

[44][Slide 35]
[44] https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#35

[45][Slide 36]
[45] https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#36

Jan-Ivar: my concern is that this might be blocking on [46]w3c/mediacapture-main#1019
… not sure we should proceed with this before we get clarity on that one
… if we maintain the existing spec requirement, this might make sense

[46] https://github.com/w3c/mediacapture-main/issues/1019

Youenn: we could disentangle the two by noting that you need microphone access for exposing speakers (but we should make progress on the other one)

Guido: [47]w3c/mediacapture-main#1019 is orthogonal to this one since we're already exposing speakers - the discussion is which speakers we expose
… when enumeration is allowed is orthogonal

[47] https://github.com/w3c/mediacapture-main/issues/1019

Youenn: does Chrome expose speakers without capture?

Guido: it gates enumeration on permission

Youenn: would you be able to align with the spec for speakers?
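[A minimal sketch of the pattern issue 147 is about: after microphone access via getUserMedia, enumerateDevices() exposes audiooutput devices, which the page can then select with setSinkId(); which speakers get exposed is the open question:]

    const micStream = await navigator.mediaDevices.getUserMedia({ audio: true });
    const devices = await navigator.mediaDevices.enumerateDevices();
    const speakers = devices.filter((d) => d.kind === "audiooutput");
    const audioEl = document.querySelector("audio");
    if (audioEl && speakers.length > 0) {
      await audioEl.setSinkId(speakers[0].deviceId); // route playback to the chosen speaker
    }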
Guido: the problem is Web compatibility
… implementing the spec was not web compatible, so we had to roll back, hence [48]w3c/mediacapture-main#1019

[48] https://github.com/w3c/mediacapture-main/issues/1019

Jan-Ivar: the 2 situations are connected, since the decision on this may lead to different implementations across browsers

Bernard: I agree with Guido that they're orthogonal
… would like to support exposing all speakers

Jan-Ivar: I would object to proceeding before [49]w3c/mediacapture-main#1019 is resolved

[49] https://github.com/w3c/mediacapture-main/issues/1019

Youenn: let's try to discuss it at the next meeting then

Summary of resolutions

1. [50]Consensus to add `deviceId` to settings of a track
2. [51]proceed with configurationchange event as the way to manage surface switching handling
3. [52]proceed with adding receiveTime to Encoded* Metadata
4. [53]Proceed with aligning with current implementations
5. [54]Proceed with reducing scope of MediaStreamTrack transfer to DedicatedWorker
Received on Wednesday, 11 December 2024 07:52:54 UTC