- From: Dominique Hazael-Massieux <dom@w3.org>
- Date: Mon, 2 May 2022 15:19:24 +0200
- To: "public-webrtc@w3.org" <public-webrtc@w3.org>
Hi,

The minutes of our call on April 26 are available at:
https://www.w3.org/2022/04/26-webrtc-minutes.html (with youtube recording)

They're also copied as text below.

Dom

WebRTC April 2022 VI

26 April 2022

[2]Agenda. [3]IRC log.

  [2] https://www.w3.org/2011/04/webrtc/wiki/April_26_2022
  [3] https://www.w3.org/2022/04/26-webrtc-irc

Attendees

Present
  Anssi, Carine, ChrisCunningham, Dom, Elad, Florent, Harald, Jan-Ivar, MichaelSeydl, Ningxin, PatrickRockhill, PhilippHancke, Sergio, TimP, Youenn

Regrets
  -

Chair
  Bernard, Harald, Jan-Ivar

Scribe
  dom, youenn

Contents

1. [4]WebNN Integration with real-time video processing
2. [5]WebRTC Extensions
   1. [6]Issue [7]#95
   2. [8]Issue [9]#100
3. [10]https://github.com/w3c/mediacapture-extensions/issues/47 Voice Isolation Constraint
4. [11]support for contentHint in Capture Handle
5. [12]WebRTC Extensions
6. [13]Avoid user-confusion by avoiding offering undesired audio sources
7. [14]Region Capture
8. [15]Summary of resolutions

  [7] https://github.com/w3c/mediacapture-region/issues/95
  [9] https://github.com/w3c/mediacapture-region/issues/100

Meeting minutes

Recording: [16]https://www.youtube.com/watch?v=qSlXLqouxCs

  [16] https://www.youtube.com/watch?v=qSlXLqouxCs

Slideset: [18]https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf

  [18] https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf

[19]WebNN Integration with real-time video processing [20]🎞︎

  [19] https://github.com/webmachinelearning/webnn/issues/226
  [20] https://www.youtube.com/watch?v=qSlXLqouxCs#t=167

[21][Slide 10]
  [21] https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=10
[22][Slide 11]
  [22] https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=11
[23][Slide 12]
  [23] https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=12
[24][Slide 12]
  [24] https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=12
[25][Slide 13]
  [25] https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=13
[26][Slide 15]
  [26] https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=15

ningxin: slide 15 is the high-level pipeline for a background blur video pipeline. Two implementations: WebGL and WebGPU/WebNN. Textures are uploaded to the GPU in both cases. The last shader takes 3 inputs: original image, blurred image, and computed segmentation map.
Description of the perf issues, in particular CPU usage and GC.
Bernard: is there a copy on the output at OffscreenCanvas level?
Ningxin: not sure.
Tim: is the perf acceptable? or do we need to make massive improvements?
ningxin: we need to measure battery impact
dom: we are doing this prototype to evaluate what HW acceleration can bring us, and to identify potential roadblocks when doing video processing on media capture, for instance color conversion or pixel format.
youenn: looking at 20% CPU on GC - can that be fixed by implementations, or is it an architectural issue with having lots of objects created per frame?
… on native, there is usually a buffer pool to help with that
… does that need to be surfaced to the JS, or can that be dealt with solely by the UA?
ningxin: GPUBuffers are created beforehand. Some objects are created for every frame, like textures. There are ways to avoid many object allocations at JS level. UA optimisations might help too.
dom: what are the next steps for this project?
ningxin:
1. enable WebGPU backend.
2. try new APIs that allow importing frames as GPU textures and see whether that will improve efficiency.
3. Improve VideoFrame GC PR: we will try it out when it is merged in Chrome.
youenn: re CPU efficiency - this is moving between main thread and worker thread, that may have a small perf impact
… doing everything in the worker might be helpful once that's possible

[27]WebRTC Extensions [28]🎞︎

  [27] https://github.com/w3c/webrtc-extensions
  [28] https://www.youtube.com/watch?v=qSlXLqouxCs#t=1760

[29][Slide 19]
  [29] https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=19

Issue [30]#95 [31]🎞︎

  [30] https://github.com/w3c/webrtc-extensions/issues/95
  [31] https://www.youtube.com/watch?v=qSlXLqouxCs#t=1780

[32]Media Capabilities issue 185
  [32] https://github.com/w3c/media-capabilities/issues/185

[33][Slide 20]
  [33] https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=20
[34][Slide 21]
  [34] https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=21
[35][Slide 22]
  [35] https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=22
[36][Slide 23]
  [36] https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=23
[37][Slide 24]
  [37] https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=24

Bernard: question to WG is: is it a goal for MC to deprecate getCapabilities?
youenn: my understanding is that media capabilities is really about audio/video capabilities
… so it doesn't make sense to expose e.g. CN there
… they should stay in WebRTC getCapabilities
… getCapabilities() being sync is problematic; that's less of an issue for software capabilities such as CN
… so deprecating getCapabilities fully is not a goal, but partially, yes
Florent: +1 on the approach; usability of the resulting split is a concern
chris: seems fine to use that split; do we want to return rtc codec info from media capabilities?
… if so, please take a look at [38]https://github.com/w3c/media-capabilities/issues/185

  [38] https://github.com/w3c/media-capabilities/issues/185

youenn: +1 on disambiguating the outcome of this situation
… listing all codecs in just one call is a non goal
… an SFU is typically only interested in a few codecs
… for P2P, setCodecPreferences is probably not needed in the first place - you can deal with a generic codec negotiation
jib: would be good to clarify if we want to deprecate "real" codecs from getCapabilities? this sounds like a good long term goal for me
harald: I worry that RTX/RED/FEC info needs to be available somewhere
… getCapabilities has known problems and would be the only way to get it
… changing getCapabilities is actually harder than deprecating it
… in the long run, it's best to deprecate getCapabilities and replace it with a better dedicated API
Florent: two different scenarios for setCodecPreferences: talking with an SFU, in which case you can make specific codec queries; in a P2P scenario, if you can't enumerate all the codecs, you won't be able to call setCodecPreferences
… this would require hardcoding a list of codecs
… is there a way to make getCapabilities evolve in a shape that would satisfy everyone?
… getCapabilities+setCodecPreferences has a lot of current usage, will be hard to deprecate

Issue [39]#100 [40]🎞︎

  [39] https://github.com/w3c/webrtc-extensions/issues/100
  [40] https://www.youtube.com/watch?v=qSlXLqouxCs#t=2747

[41][Slide 25]
  [41] https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=25
[42][Slide 27]
  [42] https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=27

youenn: might be fine, but I worry about the defaults? would they be the same across browsers?
… there are current codecs that are defaults, but that may need to evolve over time
… this could create Web compat issues
Sergio: some of the codecs are receive-only
… the list would be based on common sense, but I don't have a strong opinion
youenn: my worry is about P2P - if the defaults aren't the same across UAs, the negotiation will fail
sergio: my suggestion was to use defaults in the offer, and adapt the answer based on the offer
harald: two interfaces needed: the list of codecs currently willing to offer, the set of codecs you can offer
… the 1st one might be getCapabilities, the proposal on the slide for the 2nd
… in terms of interop, MTI codecs should be the safety net, and they should be in the mandatory-to-offer
florent: the proposal seems to have a lot of overlap with setCodecPreferences / getParameters - could we improve these instead of coming up with a new API?
[Philipp supports this on the chat]
Sergio: would be fine; I started from the RTP header extensions, maybe that should apply there?
florent: the difference is that there is already an API to set codec preferences
sergio: but header extensions could be added there too?
Bernard: let's continue the discussion in the issue
… or work on a matching PR

[43]https://github.com/w3c/mediacapture-extensions/issues/47 Voice Isolation Constraint [44]🎞︎

  [43] https://github.com/w3c/mediacapture-extensions/issues/47
  [44] https://www.youtube.com/watch?v=qSlXLqouxCs#t=3316

[45][Slide 41]
  [45] https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=41
[46][Slide 42]
  [46] https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=42

Resolution for issue 95: mark issue as ready for PR

[47][Slide 43]
  [47] https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=43
[48][Slide 44]
  [48] https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=44
[49][Slide 45]
  [49] https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=45

youenn: it makes sense; reasonable to ignore `noiseSuppression`
… there is also `echoCancellation` in the audio pipeline
… does it make sense to do `echoCancellation` when this is set?
harald: I think it's mostly orthogonal
youenn: so `echoCancellation: false` is compatible with `voiceIsolation: true`
… it may be challenging for some implementations to support these combinations
jan-ivar: I like this too; what should the default be? that may bring concerns
harald: we can discuss this in the PR
… conservatively, the default should be the current behavior (false)
dom: instead of boolean, we could use strings for extra flexibility.
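
The constraint combinations raised in this discussion can be sketched as follows. This is only an illustration under stated assumptions: `voiceIsolation` is the constraint proposed in issue #47 and was not yet specified at the time of the call, while `AudioProcessingOptions` and `buildAudioConstraints` are hypothetical names introduced here. The sketch assumes the conservative default (false) and drops `noiseSuppression` when voice isolation is requested, as suggested on the call.

```typescript
// Hypothetical helper; not part of any spec or browser API.
interface AudioProcessingOptions {
  voiceIsolation?: boolean;
  noiseSuppression?: boolean;
  echoCancellation?: boolean;
}

// Builds an audio constraints object following the points raised on the call.
function buildAudioConstraints(
  opts: AudioProcessingOptions
): Record<string, boolean> {
  // Assumption: default is the current behavior (no voice isolation).
  const voiceIsolation = opts.voiceIsolation ?? false;
  const constraints: Record<string, boolean> = { voiceIsolation };

  // echoCancellation is treated as orthogonal: passed through unchanged.
  if (opts.echoCancellation !== undefined) {
    constraints.echoCancellation = opts.echoCancellation;
  }

  // Assumption: noiseSuppression is ignored when voiceIsolation is on,
  // so it is only forwarded when voice isolation is off.
  if (!voiceIsolation && opts.noiseSuppression !== undefined) {
    constraints.noiseSuppression = opts.noiseSuppression;
  }
  return constraints;
}
```

In a browser that implemented the proposal, the returned object could presumably be passed as the `audio` member of the constraints given to `getUserMedia`; whether a UA honors every combination is exactly the open question youenn raised.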
RESOLUTION: mark voiceIsolation [50]issue #47 as ready for PR

  [50] https://github.com/w3c/mediacapture-extensions/issues/47

[51]support for contentHint in Capture Handle [52]🎞︎

  [51] https://github.com/w3c/mediacapture-handle/issues/35
  [52] https://www.youtube.com/watch?v=qSlXLqouxCs#t=3867

[53][Slide 48]
  [53] https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=48
[54][Slide 49]
  [54] https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=49
[55][Slide 50]
  [55] https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=50
[56][Slide 51]
  [56] https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=51

youenn: setting the track hint is unnecessary - if the capturer is setting the hint on its side, the UA knows that the track being captured is text - there is no need to transmit it to the capturer
… except maybe if WebCodecs is in the picture
… having the UA take care of this seems preferable
elad: the suggestion would be that the captured content self-declares its type and the UA uses it?
… but that removes the liberty of the capturer to decide whether to use the hint or not
… which could be based on e.g. an allowlist
… autodetection by the UA would have its own limitations
bernard: re the WebCodecs case - contentHint is not automatically consumed by WebCodecs, it's up to the app to apply it as a codec setting
jib: I agree with youenn that the UA is in a good place to short-circuit the capturer part
… the proposal could be useful for the capturee side
… exposing further metadata to the controller might be an interesting addition to my CaptureController proposal
youenn: it could be exposed at the VideoFrame level
jib: I see agreement on the need, not yet on the API shape

[57]WebRTC Extensions [58]🎞︎

  [57] https://github.com/w3c/webrtc-extensions
  [58] https://www.youtube.com/watch?v=qSlXLqouxCs#t=4571

[59]Avoid user-confusion by avoiding offering undesired audio sources [60]🎞︎

  [59] https://github.com/w3c/mediacapture-screen-share/issues/219
  [60] https://www.youtube.com/watch?v=qSlXLqouxCs#t=4583

[61][Slide 54]
  [61] https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=54
[62][Slide 55]
  [62] https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=55
[63][Slide 56]
  [63] https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=56

Tim: is this only applicable for echo management?
elad: it could be that an application is interested in recording a specific tab, no more than that.
Tim: this use case does not seem addressed: identifying the desired tab would be needed.
Elad: some VC applications usually do not want to capture system audio.
Jan Ivar: supportive, how about reusing the displaySurface constraint here?
Elad: Might work for me.
Jan Ivar: I would like to remove monitor from here.
dom: if we do not include monitor here, audio: true might capture system audio. But applications would not be able to explicitly ask for system audio.
dom: displaySurface would be a strange name for audio.
youenn: let's enumerate the different approaches: avoidSystemAudio, displaySurface, sources
youenn: scope is unclear, we need to clarify this before going to PR.
youenn: different properties allow feature detection of what kind of recording the UA can do
elad: my focus is only limiting access to system audio, but I also think flexibility is helpful
timp: back to my echoCancellation point - the constraint could be linked to whether the source can be echo-cancelled
Harald: source being echo-cancellable is a second concern. Biggest point is avoiding system audio.
Tim: as well as window audio.
harald: echoCancellation is a secondary concern - capturing system audio could disclose info from a 3rd party

[64]Region Capture [65]🎞︎

  [64] https://github.com/w3c/mediacapture-region
  [65] https://www.youtube.com/watch?v=qSlXLqouxCs#t=5905

RESOLUTION: continue discussions on GitHub.

[66][Slide 59]
  [66] https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=59

youenn: [67]#11 is an issue on the shape of the CropTarget API
… given current Chrome implementation work, feels it's useful to converge on the API shape

  [67] https://github.com/w3c/mediacapture-region/issues/11

[68][Slide 60]
  [68] https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=60

youenn: do we want to attach the API to element or to MediaDevices?
… element feels like a better path
jib: +1
elad: I prefer mediaDevices given its linkage to screen capture
youenn: cropTarget is linked to MediaStreamTrack, not mediaDevices
… and it's really tied to an element
elad: it can be used through an object you get from getDisplayMedia
youenn: but with a detached mediaDevices, you can't reject the promise
dom: prefer element option.
youenn: next question is attribute vs method
… slight pref for attribute, but no strong feeling
elad: there is a cost to minting a crop target - we mark the element in the rendering pipeline in specific ways that we shouldn't abuse
youenn: I thought you were going to use a lazy approach to reduce that cost
elad: lazy tagging might help, but this needs more thinking
jib: +1 to attribute
… developer value trumps implementer value
elad: I don't think it matters much to developers in the first place
harald: disagree with messing with the element interface, and on hiding the fact that the operation has a cost
… also async (promises) may be needed for some implementations
… let's not hide the reality of the situation
jib: the cost seems to be Chrome-specific
… the real goal of this API is a transferable reference
youenn: +1
… other APIs in the past have re-used the element interface, have made similar decisions on methods / attributes, async vs sync
… we should follow existing implemented patterns
dom: is there any other API that may use this transferable reference?
youenn: that's something I bring up in the issue
elad: this may create unsafe usage for this well-defined target
jan-ivar: this could be evaluated
hta: but this shouldn't block progress on the specific narrow goal we have
youenn: my focus is aligning with current API patterns for this API
elad: the TAG will chime in; but if they don't give a clear specific suggestion
… we could move with the current design that can be polyfilled
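
The two API shapes debated above (a lazily minted attribute on the element versus an asynchronous method on a MediaDevices-like object) can be contrasted with a small model. Every name in it (`FakeCropTarget`, `FakeElement`, `FakeMediaDevices`, `mintCropTarget`) is a hypothetical stand-in introduced for illustration; none of this is the API under discussion, and the sketch only models where the minting cost would land under each shape.

```typescript
// Hypothetical stand-ins only; this does not reproduce any real browser API.
let nextTargetId = 0;

class FakeCropTarget {
  constructor(readonly id: number) {}
}

class FakeElement {
  private target?: FakeCropTarget;
  mintCount = 0; // counts how often the (assumed costly) minting step ran

  // Shape A: attribute on the element, minted lazily on first access.
  get cropTarget(): FakeCropTarget {
    if (this.target === undefined) {
      this.mintCount += 1;
      this.target = new FakeCropTarget(nextTargetId++);
    }
    return this.target;
  }
}

class FakeMediaDevices {
  // Shape B: explicit async method, which makes the cost visible to callers
  // and leaves room for implementations that need a promise.
  async mintCropTarget(element: FakeElement): Promise<FakeCropTarget> {
    return element.cropTarget;
  }
}
```

With shape A an element that is never cropped costs nothing and repeated reads return the same reference; with shape B the caller has to await the result, which surfaces the cost harald did not want hidden.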
Received on Monday, 2 May 2022 13:19:28 UTC