Re: [minutes] April 26 VI from Harald Alvestrand on 2022-05-02 (public-webrtc@w3.org from May 2022)

From: Harald Alvestrand <harald@alvestrand.no>
Date: Mon, 2 May 2022 16:55:39 +0200
To: public-webrtc@w3.org
Message-ID: <0b6e3ccc-de68-9b49-6aaf-70a0b6736fba@alvestrand.no>
And the video recording is up at

https://www.youtube.com/watch?v=qSlXLqouxCs

On 5/2/22 15:19, Dominique Hazael-Massieux wrote:
> Hi,
> 
> The minutes of our call on April 26 are available at:
>    https://www.w3.org/2022/04/26-webrtc-minutes.html (with youtube 
> recording)
> 
> They're also copied as text below.
> 
> Dom
> 
>                            WebRTC April 2022 VI
> 
> 26 April 2022
> 
>     [2]Agenda. [3]IRC log.
> 
>        [2] https://www.w3.org/2011/04/webrtc/wiki/April_26_2022
>        [3] https://www.w3.org/2022/04/26-webrtc-irc
> 
> Attendees
> 
>     Present
>            Anssi, Carine, ChrisCunningham, Dom, Elad, Florent,
>            Harald, Jan-Ivar, MichaelSeydl, Ningxin,
>            PatrickRockhill, PhilippHancke, Sergio, TimP, Youenn
> 
>     Regrets
>            -
> 
>     Chair
>            Bernard, Harald, Jan-Ivar
> 
>     Scribe
>            dom, youenn
> 
> Contents
> 
>      1. [4]WebNN Integration with real-time video processing
>      2. [5]WebRTC Extensions
>           1. [6]Issue [7]#95
>           2. [8]Issue [9]#100
>      3. [10]https://github.com/w3c/mediacapture-extensions/issues/47
>         Voice Isolation Constraint
>      4. [11]support for contentHint in Capture Handle
>      5. [12]WebRTC Extensions
>      6. [13]Avoid user-confusion by avoiding offering undesired
>         audio sources
>      7. [14]Region Capture
>      8. [15]Summary of resolutions
> 
>        [7] https://github.com/w3c/webrtc-extensions/issues/95
>        [9] https://github.com/w3c/webrtc-extensions/issues/100
> 
> Meeting minutes
> 
>     Recording: [16]https://www.youtube.com/watch?v=qSlXLqouxCs
> 
>       [16] https://www.youtube.com/watch?v=qSlXLqouxCs
> 
> 
>     Slideset: [18]https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf
> 
>       [18] 
> https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf 
> 
> 
>    [19]WebNN Integration with real-time video processing [20]🎞︎
> 
>       [19] https://github.com/webmachinelearning/webnn/issues/226
>       [20] https://www.youtube.com/watch?v=qSlXLqouxCs#t=167
> 
>     [21][Slide 10]
> 
>       [21] 
> https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=10 
> 
> 
>     [22][Slide 11]
> 
>       [22] 
> https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=11 
> 
> 
>     [23][Slide 12]
> 
>       [23] 
> https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=12 
> 
> 
>     [25][Slide 13]
> 
>       [25] 
> https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=13 
> 
> 
>     [26][Slide 15]
> 
>       [26] 
> https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=15 
> 
> 
>     ningxin: slide 15 shows the high-level pipeline for background
>     blur video processing
> 
>     Two implementations: WebGL and WebGPU/WebNN.
> 
>     texture uploads to GPU in both cases.
> 
>     Last shader takes 3 inputs: original image, blurred image,
>     and computed segmentation map.
> 
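The role of that last shader can be sketched on the CPU as a per-pixel blend of its three inputs. This is only an illustration, not the prototype's shader code: `compositeBlur` is a hypothetical name, plain typed arrays stand in for GPU textures, and the mask convention (1 = person, 0 = background) is an assumption.

```typescript
// Per-pixel blend for background blur, mirroring the three shader inputs:
// original image, blurred image, and segmentation map.
// Assumption: mask values are in [0, 1], where 1 means foreground (person).
function compositeBlur(
  original: Float32Array,
  blurred: Float32Array,
  mask: Float32Array,
  channels: number = 4,
): Float32Array {
  const pixelCount = mask.length;
  if (
    original.length !== pixelCount * channels ||
    blurred.length !== original.length
  ) {
    throw new RangeError("input sizes do not match");
  }
  const out = new Float32Array(original.length);
  for (let p = 0; p < pixelCount; p++) {
    // Clamp defensively in case the model output strays outside [0, 1].
    const m = Math.min(1, Math.max(0, mask[p]));
    for (let c = 0; c < channels; c++) {
      const i = p * channels + c;
      // Foreground keeps the sharp pixel, background takes the blurred one.
      out[i] = m * original[i] + (1 - m) * blurred[i];
    }
  }
  return out;
}
```

On the GPU the same arithmetic would run once per fragment, with the three arrays bound as textures.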
>     Description of the perf issues, in particular CPU usage and GC.
> 
>     Bernard: is there a copy on the output at offscreencanvas
>     level?
> 
>     Ningxin: not sure.
> 
>     Tim: is the perf acceptable? or do we need to make massive
>     improvements?
> 
>     ningxin: we need to measure battery impact
> 
>     dom: we are doing this prototype to evaluate what HW
>     acceleration can bring us. And identify potential roadblocks
>     when trying to do video processing on media capture
> 
>     for instance color conversion or pixel format.
> 
>     youenn: looking at 20% CPU on GC - can that be fixed by
>     implementations, or is it an architectural issue with having
>     lots of objects created per frame?
>     … on native, there is usually a buffer pool to help with that
>     … does that need to be surfaced to the JS, or can that be dealt
>     with solely by the UA?
> 
>     ningxin: GPUBuffers are created beforehand. Some objects are
>     created for every frame, like textures.
> 
>     There are ways to avoid many object allocations at JS level.
>     UA optimisations might also help.
> 
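The buffer pool idea mentioned for native code can be sketched at JS level as follows. This is a minimal illustration, not something the prototype is known to do: `FrameBufferPool` and its methods are hypothetical names, and `Uint8Array` stands in for whatever per-frame object would be recycled.

```typescript
// A minimal fixed-capacity buffer pool: buffers are allocated lazily up to
// `capacity` and then recycled, so steady-state processing creates no garbage.
class FrameBufferPool {
  private readonly byteLength: number;
  private readonly capacity: number;
  private readonly free: Uint8Array[] = [];
  private allocated = 0;

  constructor(byteLength: number, capacity: number) {
    this.byteLength = byteLength;
    this.capacity = capacity;
  }

  // Returns a recycled buffer if one is free, allocates while under capacity,
  // and returns undefined when exhausted (the caller may drop the frame).
  acquire(): Uint8Array | undefined {
    const recycled = this.free.pop();
    if (recycled !== undefined) return recycled;
    if (this.allocated < this.capacity) {
      this.allocated++;
      return new Uint8Array(this.byteLength);
    }
    return undefined;
  }

  // Hands a buffer back for reuse. Contents are not cleared.
  release(buffer: Uint8Array): void {
    if (buffer.byteLength !== this.byteLength) {
      throw new RangeError("buffer does not belong to this pool");
    }
    this.free.push(buffer);
  }

  get totalAllocated(): number {
    return this.allocated;
  }
}
```

Whether such pooling belongs in application code or inside the UA is exactly the open question raised above.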
>     dom: what are the next steps for this project?
> 
>     ningxin: 1. enable WebGPU backend.
> 
>     2. try new APIs that allow importing frames as GPU textures and
>     see whether that will improve efficiency.
> 
>     3. Improve VideoFrame GC PR: we will try out when it is merged
>     in Chrome.
> 
>     youenn: re CPU efficiency - this is moving between main thread
>     and worker thread, that may have a small perf impact
>     … doing everything in the worker might be helpful once that's
>     possible
> 
>    [27]WebRTC Extensions [28]🎞︎
> 
>       [27] https://github.com/w3c/webrtc-extensions
>       [28] https://www.youtube.com/watch?v=qSlXLqouxCs#t=1760
> 
>     [29][Slide 19]
> 
>       [29] 
> https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=19 
> 
> 
>      Issue [30]#95 [31]🎞︎
> 
>       [30] https://github.com/w3c/webrtc-extensions/issues/95
>       [31] https://www.youtube.com/watch?v=qSlXLqouxCs#t=1780
> 
>     [32]Media Capabilities issue 185
> 
>       [32] https://github.com/w3c/media-capabilities/issues/185
> 
>     [33][Slide 20]
> 
>       [33] 
> https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=20 
> 
> 
>     [34][Slide 21]
> 
>       [34] 
> https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=21 
> 
> 
>     [35][Slide 22]
> 
>       [35] 
> https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=22 
> 
> 
>     [36][Slide 23]
> 
>       [36] 
> https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=23 
> 
> 
>     [37][Slide 24]
> 
>       [37] 
> https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=24 
> 
> 
>     Bernard: question to WG is: is it a goal for MC to deprecate
>     getCapabilities?
> 
>     youenn: my understanding is that media capabilities is really
>     about audio/video capabilities
>     … so it doesn't make sense to expose e.g. CN there
>     … they should stay in WebRTC getCapabilities
>     … getCapabilities() being sync is problematic; that's less of
>     an issue for software capabilities such as CN
>     … so deprecating getCapabilities fully is not a goal, but
>     partially, yes
> 
>     Florent: +1 on the approach; usability of the resulting split is
>     a concern
> 
>     chris: seems fine to use that split; do we want to return rtc
>     codec info from media capabilities?
>     … if so, please take a look at
>     [38]https://github.com/w3c/media-capabilities/issues/185
> 
>       [38] https://github.com/w3c/media-capabilities/issues/185
> 
>     youenn: +1 on disambiguating the outcome of this situation
>     … listing all codecs in just one call is a non goal
>     … an SFU is typically only interested in a few codecs
>     … for P2P, setCodecPreferences is probably not needed in the
>     first place - you can deal with a generic codec negotiation
> 
>     jib: would be good to clarify if we want to deprecate "real"
>     codecs from getCapabilities? this sounds like a good long term
>     goal for me
> 
>     harald: I worry that RTX/RED/FEC info needs to be available
>     somewhere
>     … getCapabilities has known problems and would be the only way
>     to get it
>     … changing getCapabilities is actually harder than deprecating it
>     … in the long run, it's best to deprecate getCapabilities and
>     replace it with a better dedicated API
> 
>     Florent: two different scenarios for setCodecPreferences:
>     talking with an SFU in which case you can make specific codec
>     queries; in a P2P scenario, if you can't enumerate all the
>     codecs, you won't be able to call setCodecPreferences
>     … this would require hardcoding a list of codecs
>     … is there a way to make getCapabilities evolve in a shape that
>     would satisfy everyone?
>     … getCapabilities+setCodecPreferences has a lot of current
>     usage, will be hard to deprecate
> 
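The getCapabilities + setCodecPreferences pattern referred to above usually amounts to reordering an enumerated codec list. A minimal sketch, assuming a simplified `CodecEntry` shape and a hypothetical `preferCodecs` helper (neither is part of any spec):

```typescript
// Simplified stand-in for the codec entries returned by getCapabilities().
interface CodecEntry {
  mimeType: string;
  clockRate: number;
  sdpFmtpLine?: string;
}

// Stable reorder: entries matching preferredMimeTypes come first, in the
// order given there; all other entries (RTX, RED, FEC and so on) keep their
// original relative order afterwards.
function preferCodecs(
  codecs: readonly CodecEntry[],
  preferredMimeTypes: readonly string[],
): CodecEntry[] {
  const wanted = preferredMimeTypes.map((m) => m.toLowerCase());
  const rank = (c: CodecEntry): number => {
    const i = wanted.indexOf(c.mimeType.toLowerCase());
    return i === -1 ? wanted.length : i;
  };
  return codecs
    .map((codec, index) => ({ codec, index, r: rank(codec) }))
    .sort((a, b) => a.r - b.r || a.index - b.index)
    .map((e) => e.codec);
}
```

In a browser, the input would typically come from `RTCRtpReceiver.getCapabilities("video")` and the result would be passed to `RTCRtpTransceiver.setCodecPreferences()`. Without a way to enumerate the list, the application would have to hardcode it, which is the concern raised for the P2P case.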
>      Issue [39]#100 [40]🎞︎
> 
>       [39] https://github.com/w3c/webrtc-extensions/issues/100
>       [40] https://www.youtube.com/watch?v=qSlXLqouxCs#t=2747
> 
>     [41][Slide 25]
> 
>       [41] 
> https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=25 
> 
> 
>     [42][Slide 27]
> 
>       [42] 
> https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=27 
> 
> 
>     youenn: might be fine, but I worry about the defaults? would
>     they be the same across browsers?
>     … there are current codecs that are defaults, but that may need
>     to evolve over time
>     … this could create Web compat issues
> 
>     Sergio: some of the codecs are receive-only
>     … the list would be based on common sense, but I don't have a
>     strong opinion
> 
>     youenn: my worry is about P2P - if the defaults aren't the same
>     across UAs, the negotiation will fail
> 
>     sergio: my suggestion was to use defaults in the offer, and
>     adapt the answer based on the offer
> 
>     harald: two interfaces needed: the list of codecs you are
>     currently willing to offer, and the set of codecs you can offer
>     … the 1st one might be getCapabilities, the proposal on the
>     slide for the 2nd
>     … in terms of interop, MTI codecs should be the safety net, and
>     they should be in the mandatory-to-offer
> 
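The interop worry can be illustrated with a toy model of offer/answer codec selection. This is a sketch under simplifying assumptions: codecs are reduced to mimeType strings, `answerCodecs` and `canNegotiate` are hypothetical helpers, and real SDP negotiation also involves payload types and fmtp parameters.

```typescript
// The answer keeps only the offered codecs that the answerer also supports,
// preserving the offerer's order.
function answerCodecs(
  offered: readonly string[],
  supportedByAnswerer: readonly string[],
): string[] {
  const supported = supportedByAnswerer.map((c) => c.toLowerCase());
  return offered.filter((c) => supported.indexOf(c.toLowerCase()) !== -1);
}

// Negotiation fails when no codec survives the intersection.
function canNegotiate(
  offered: readonly string[],
  supportedByAnswerer: readonly string[],
): boolean {
  return answerCodecs(offered, supportedByAnswerer).length > 0;
}
```

If one UA's default offer contains only codecs the other UA lacks, `canNegotiate` returns false. Always including a mandatory-to-implement codec such as VP8 in the offer guarantees a non-empty intersection between compliant endpoints.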
>     florent: the proposal seems to have a lot of overlap with
>     setCodecPreferences / getParameters - could we improve these
>     instead of coming up with a new API?
> 
>     [Philipp supports this on the chat]
> 
>     Sergio: would be fine; I started from the rtp header
>     extensions, maybe that should apply there?
> 
>     florent: the difference is that there is already an API to set
>     codec preferences
> 
>     sergio: but header extensions could be added there too?
> 
>     Bernard: let's continue the discussion in the issue
>     … or work on a matching PR
> 
>    [43]https://github.com/w3c/mediacapture-extensions/issues/47 Voice
>    Isolation Constraint [44]🎞︎
> 
>       [43] https://github.com/w3c/mediacapture-extensions/issues/47
>       [44] https://www.youtube.com/watch?v=qSlXLqouxCs#t=3316
> 
>     [45][Slide 41]
> 
>       [45] 
> https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=41 
> 
> 
>     [46][Slide 42]
> 
>       [46] 
> https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=42 
> 
> 
>     Resolution for issue 95: mark issue as ready for PR
> 
>     [47][Slide 43]
> 
>       [47] 
> https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=43 
> 
> 
>     [48][Slide 44]
> 
>       [48] 
> https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=44 
> 
> 
>     [49][Slide 45]
> 
>       [49] 
> https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=45 
> 
> 
>     youenn: it makes sense; reasonable to ignore `noiseSuppression`
>     … there is also `echoCancellation` in the audio pipeline
>     … does it make sense to do `echoCancellation` when this is set?
> 
>     harald: I think it's mostly orthogonal
> 
>     youenn: so `echoCancellation: false` is compatible with
>     `voiceIsolation: true`
>     … it may be challenging for some implementations to support
>     these combinations
> 
>     jan-ivar: I like this too; what should the default be? that may
>     bring concerns
> 
>     harald: we can discuss this in the PR
>     … conservatively, the default should be the current behavior
>     (false)
> 
>     dom: instead of boolean, we could use strings for extra
>     flexibility.
> 
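How an application might combine these constraints can be sketched as below. This assumes the boolean `voiceIsolation` shape discussed here (the final API shape was left to the PR). `buildAudioConstraints` is a hypothetical helper, and falling back to `noiseSuppression` when voice isolation is unsupported is an assumption of this sketch, not something the group decided.

```typescript
interface AudioProcessingConstraints {
  echoCancellation: boolean;
  noiseSuppression?: boolean;
  voiceIsolation?: boolean;
}

// Builds audio constraints from a map of supported constraint names.
// echoCancellation is treated as orthogonal and passed through unchanged.
function buildAudioConstraints(
  supported: { [constraintName: string]: boolean | undefined },
  wantVoiceIsolation: boolean,
  echoCancellation: boolean,
): AudioProcessingConstraints {
  const result: AudioProcessingConstraints = { echoCancellation: echoCancellation };
  if (wantVoiceIsolation && supported["voiceIsolation"] === true) {
    // noiseSuppression is left unset: it would be ignored when
    // voiceIsolation is in effect.
    result.voiceIsolation = true;
  } else if (wantVoiceIsolation) {
    // Assumed fallback for UAs without voiceIsolation support.
    result.noiseSuppression = true;
  }
  return result;
}
```

In a browser, `supported` would typically be the result of `navigator.mediaDevices.getSupportedConstraints()`, and the returned object would be passed as the `audio` member of the `getUserMedia()` constraints.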
>     RESOLUTION: mark voiceIsolation [50]issue #47 as ready for PR
> 
>       [50] https://github.com/w3c/mediacapture-extensions/issues/47
> 
>    [51]support for contentHint in Capture Handle [52]🎞︎
> 
>       [51] https://github.com/w3c/mediacapture-handle/issues/35
>       [52] https://www.youtube.com/watch?v=qSlXLqouxCs#t=3867
> 
>     [53][Slide 48]
> 
>       [53] 
> https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=48 
> 
> 
>     [54][Slide 49]
> 
>       [54] 
> https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=49 
> 
> 
>     [55][Slide 50]
> 
>       [55] 
> https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=50 
> 
> 
>     [56][Slide 51]
> 
>       [56] 
> https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=51 
> 
> 
>     youenn: setting the track hint is unnecessary - if the capturer
>     is setting the hint on its side, the UA knows that the track
>     being captured is text - there is no need to transmit it to the
>     capturer
>     … except maybe if WebCodecs is in the picture
>     … having the UA taking care of this seems preferable
> 
>     elad: the suggestion would be that the captured content
>     self-declare its type and the UA uses it?
>     … but that removes the liberty of the capturer to decide
>     whether to use the hint or not
>     … which could be based on e.g. an allowlist
>     … autodetection by the UA would have its own limitations
> 
>     bernard: re the WebCodecs case - contentHint is not
>     automatically consumed by WebCodecs, it's up to the app to
>     apply it as codec setting
> 
>     jib: I agree with youenn that the UA is in a good place to
>     short-circuit the capturer part
>     … the proposal could be useful for the capturee side
>     … exposing further metadata to the controller might be an
>     interesting addition to my capturecontroller proposal
> 
>     youenn: it could be exposed at the videoframe level
> 
>     jib: I see agreement on the need, not yet on the API shape
> 
>    [57]WebRTC Extensions [58]🎞︎
> 
>       [57] https://github.com/w3c/webrtc-extensions
>       [58] https://www.youtube.com/watch?v=qSlXLqouxCs#t=4571
> 
>    [59]Avoid user-confusion by avoiding offering undesired audio sources
>    [60]🎞︎
> 
>       [59] https://github.com/w3c/mediacapture-screen-share/issues/219
>       [60] https://www.youtube.com/watch?v=qSlXLqouxCs#t=4583
> 
>     [61][Slide 54]
> 
>       [61] 
> https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=54 
> 
> 
>     [62][Slide 55]
> 
>       [62] 
> https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=55 
> 
> 
>     [63][Slide 56]
> 
>       [63] 
> https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=56 
> 
> 
>     Tim: is this only applicable for echo management?
> 
>     elad: it could be that an application is interested in
>     recording a specific tab, no more than that.
> 
>     Tim: this use case does not seem to be addressed: identifying
>     the desired tab would be needed.
> 
>     Elad: some VC applications usually do not want to capture
>     system audio.
> 
>     Jan Ivar: supportive, how about reusing displaySurface
>     constraint here?
> 
>     Elad: Might work for me.
> 
>     Jan Ivar: I would like to remove monitor from here.
> 
>     dom: if we do not include monitor here, audio: true might
>     capture system audio. But applications would not be able to
>     explicitly ask for system audio.
> 
>     dom: displaySurface would be a strange name for audio.
> 
>     youenn: let's enumerate the different approaches:
>     avoidSystemAudio, displaySurface, sources
> 
>     youenn: scope is unclear, we need to clarify this before going
>     to PR.
> 
>     youenn: different properties allow feature detection of
>     what kind of recording the UA can do
> 
>     elad: my focus is only limiting access to system audio, but I
>     also think flexibility is helpful
> 
>     timp: back to my echocancellation point - the constraint could
>     be linked to whether the source can be echocancelled
> 
>     Harald: source being echo cancellable is a second concern.
>     Biggest point is avoiding system audio.
> 
>     Tim: as well as window audio.
> 
>     harald: echoCancellation is a secondary concern - capturing
>     system audio could disclose info from a 3rd party
> 
>    [64]Region Capture [65]🎞︎
> 
>       [64] https://github.com/w3c/mediacapture-region
>       [65] https://www.youtube.com/watch?v=qSlXLqouxCs#t=5905
> 
>     RESOLUTION: continue discussions on GitHub.
> 
>     [66][Slide 59]
> 
>       [66] 
> https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=59 
> 
> 
>     youenn: [67]#11 is an issue on the shape of the CropTarget API
>     … given current chrome implementation work, feels it's useful
>     to converge on the API shape
> 
>       [67] https://github.com/w3c/mediacapture-region/issues/11
> 
>     [68][Slide 60]
> 
>       [68] 
> https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=60 
> 
> 
>     youenn: do we want to attach the API to element or to
>     MediaDevices?
>     … element feels like a better path
> 
>     jib: +1
> 
>     elad: I prefer mediaDevices given its linkage to screen capture
> 
>     youenn: cropTarget is linked to MediaStreamTrack, not
>     mediaDevices
>     … and it's really tied to an element
> 
>     elad: it can be used through an object you get from
>     getDisplayMedia
> 
>     youenn: but with a detached mediaDevices, you can't reject the
>     promise
> 
>     dom: prefer element option.
> 
>     youenn: next question is attribute vs method
>     … slight pref for attribute, but no strong feeling
> 
>     elad: there is a cost to minting a crop target - we mark the
>     element in the rendering pipeline in specific ways that we
>     shouldn't abuse
> 
>     youenn: I thought you were going to use a lazy approach to
>     reduce that cost
> 
>     elad: lazy tagging might help, but this needs more thinking
> 
>     jib: +1 to attribute
>     … developer value trumps implementer value
> 
>     elad: I don't think it matters much to developers in the first
>     place
> 
>     harald: disagree with messing with the element interface, and
>     on hiding the fact that the operation has a cost
>     … also async (promises) may be needed for some implementations
>     … let's not hide the reality of the situation
> 
>     jib: the cost seems to be Chrome-specific
>     … the real goal of this API is a transferable reference
> 
>     youenn: +1
>     … other APIs in the past have re-used the element interface,
>     have made similar decisions on methods / attributes, async vs
>     sync
>     … we should follow existing implemented patterns
> 
>     dom: is there any other API that may use this transferable
>     reference?
> 
>     youenn: that's something I bring up in the issue
> 
>     elad: this may create unsafe usage for this well-defined target
> 
>     jan-ivar: this could be evaluated
> 
>     hta: but this shouldn't block progress on the specific narrow
>     goal we have
> 
>     youenn: my focus is aligning with current API patterns for this
>     API
> 
>     elad: the TAG will chime in; but if they don't give a clear
>     specific suggestion
>     … we could move with the current design that can be polyfilled
> 
>
Received on Monday, 2 May 2022 14:55:55 UTC