[minutes] April 26 VI from Dominique Hazael-Massieux on 2022-05-02 (public-webrtc@w3.org from May 2022)

From: Dominique Hazael-Massieux <dom@w3.org>
Date: Mon, 2 May 2022 15:19:24 +0200
To: "public-webrtc@w3.org" <public-webrtc@w3.org>
Message-ID: <1a92e503-02af-6e2f-5f0a-0802a32adf96@w3.org>
Hi,

The minutes of our call on April 26 are available at:
   https://www.w3.org/2022/04/26-webrtc-minutes.html (with youtube 
recording)

They're also copied as text below.

Dom

                           WebRTC April 2022 VI

26 April 2022

    [2]Agenda. [3]IRC log.

       [2] https://www.w3.org/2011/04/webrtc/wiki/April_26_2022
       [3] https://www.w3.org/2022/04/26-webrtc-irc

Attendees

    Present
           Anssi, Carine, ChrisCunningham, Dom, Elad, Florent,
           Harald, Jan-Ivar, MichaelSeydl, Ningxin,
           PatrickRockhill, PhilippHancke, Sergio, TimP, Youenn

    Regrets
           -

    Chair
           Bernard, Harald, Jan-Ivar

    Scribe
           dom, youenn

Contents

     1. [4]WebNN Integration with real-time video processing
     2. [5]WebRTC Extensions
          1. [6]Issue [7]#95
          2. [8]Issue [9]#100
     3. [10]https://github.com/w3c/mediacapture-extensions/issues/
        47 Voice Isolation Constraint
     4. [11]support for contentHint in Capture Handle
     5. [12]WebRTC Extensions
     6. [13]Avoid user-confusion by avoiding offering undesired
        audio sources
     7. [14]Region Capture
     8. [15]Summary of resolutions

       [7] https://github.com/w3c/webrtc-extensions/issues/95
       [9] https://github.com/w3c/webrtc-extensions/issues/100

Meeting minutes

    Recording: [16]https://www.youtube.com/watch?v=qSlXLqouxCs

      [16] https://www.youtube.com/watch?v=qSlXLqouxCs

    Slideset: [18]https://lists.w3.org/Archives/Public/www-archive/
    2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf

      [18] 
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf

   [19]WebNN Integration with real-time video processing [20]🎞︎

      [19] https://github.com/webmachinelearning/webnn/issues/226
      [20] https://www.youtube.com/watch?v=qSlXLqouxCs#t=167

    [21][Slide 10]

      [21] 
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=10

    [22][Slide 11]

      [22] 
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=11

    [23][Slide 12]

      [23] 
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=12

    [25][Slide 13]

      [25] 
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=13

    [26][Slide 15]

      [26] 
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=15

    ningxin: slide 15 shows the high-level pipeline for building
    background blur on a video stream

    Two implementations: WebGL and WebGPU/WebNN.

    Textures are uploaded to the GPU in both cases.

    The last shader takes 3 inputs: the original image, the blurred
    image, and the computed segmentation map.
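
    A minimal CPU sketch of what that last shader computes, assuming
    the segmentation map holds one value in [0, 1] per pixel (1 for
    the person, 0 for the background). The function name and buffer
    layout are hypothetical and are not taken from the prototype,
    which runs this step on the GPU.

```typescript
// Hypothetical CPU reference for the final compositing step.
// original and blurred are RGBA byte arrays of the same size;
// mask has one weight per pixel.
function compositeBlur(
  original: Uint8ClampedArray,
  blurred: Uint8ClampedArray,
  mask: Float32Array,
): Uint8ClampedArray {
  if (original.length !== blurred.length || original.length !== mask.length * 4) {
    throw new Error("input sizes do not match");
  }
  const out = new Uint8ClampedArray(original.length);
  for (let p = 0; p < mask.length; p++) {
    // Clamp the weight so a noisy map cannot push values out of range.
    const m = Math.min(1, Math.max(0, mask[p]));
    for (let c = 0; c < 4; c++) {
      const i = p * 4 + c;
      // Foreground keeps the original pixel, background takes the blurred one.
      out[i] = m * original[i] + (1 - m) * blurred[i];
    }
  }
  return out;
}

// Two pixels: the first is foreground, the second is background.
const demoOriginal = new Uint8ClampedArray([200, 100, 50, 255, 200, 100, 50, 255]);
const demoBlurred = new Uint8ClampedArray([20, 20, 20, 255, 20, 20, 20, 255]);
const demoMask = new Float32Array([1, 0]);
const demoComposited = compositeBlur(demoOriginal, demoBlurred, demoMask);
```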

    Description of the perf issues, in particular CPU usage and GC.

    Bernard: is there a copy on the output at offscreencanvas
    level?

    Ningxin: not sure.

    Tim: is the perf acceptable? or do we need to make massive
    improvements?

    ningxin: we need to measure battery impact

    dom: we are doing this prototype to evaluate what HW
    acceleration can bring us, and to identify potential roadblocks
    when doing video processing on media capture, for instance
    color conversion or pixel format.

    youenn: looking at 20% CPU on GC - can that be fixed by
    implementations, or is it an architectural issue with having
    lots of objects created per frame?
    … on native, there is usually a buffer pool to help with that
    … does that need to be surfaced to the JS, or can that be dealt
    with solely by the UA?

    ningxin: GPUBuffers are created beforehand. Some objects are
    created for every frame, like textures.

    There are ways to avoid many object allocations at the JS
    level. UA optimisations might also help.
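
    One way to picture the buffer-pool idea youenn mentions from
    native code is the sketch below. It is an illustration under the
    assumption that frame data can be copied into caller-owned
    buffers; the class and its members are hypothetical and do not
    describe what the prototype or any UA does.

```typescript
// Hypothetical fixed-size pool that reuses pixel buffers across frames,
// so steady-state processing does not allocate a new buffer per frame.
class FrameBufferPool {
  private readonly free: Uint8Array[] = [];
  private readonly byteLength: number;
  allocations = 0; // number of buffers actually allocated, for inspection

  constructor(byteLength: number) {
    this.byteLength = byteLength;
  }

  // Hand out a recycled buffer when one is available, else allocate.
  acquire(): Uint8Array {
    const reused = this.free.pop();
    if (reused !== undefined) return reused;
    this.allocations++;
    return new Uint8Array(this.byteLength);
  }

  // Return a buffer to the pool once the frame has been processed.
  release(buffer: Uint8Array): void {
    if (buffer.byteLength === this.byteLength) this.free.push(buffer);
  }
}

const demoPool = new FrameBufferPool(640 * 480 * 4);
for (let frameIndex = 0; frameIndex < 100; frameIndex++) {
  const pixels = demoPool.acquire();
  pixels[0] = frameIndex; // stand-in for per-frame processing
  demoPool.release(pixels);
}
```

    With one frame in flight at a time, the loop above allocates a
    single buffer for all 100 frames.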

    dom: what are the next steps for this project?

    ningxin: 1. enable WebGPU backend.

    2. new APIs that allow importing frames as GPU textures, to see
    whether that improves efficiency.

    3. Improve VideoFrame GC PR: we will try out when it is merged
    in Chrome.

    youenn: re CPU efficiency - this is moving between main thread
    and worker thread, that may have a small perf impact
    … doing everything in the worker might be helpful once that's
    possible

   [27]WebRTC Extensions [28]🎞︎

      [27] https://github.com/w3c/webrtc-extensions
      [28] https://www.youtube.com/watch?v=qSlXLqouxCs#t=1760

    [29][Slide 19]

      [29] 
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=19

     Issue [30]#95 [31]🎞︎

      [30] https://github.com/w3c/webrtc-extensions/issues/95
      [31] https://www.youtube.com/watch?v=qSlXLqouxCs#t=1780

    [32]Media Capabilities issue 185

      [32] https://github.com/w3c/media-capabilities/issues/185

    [33][Slide 20]

      [33] 
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=20

    [34][Slide 21]

      [34] 
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=21

    [35][Slide 22]

      [35] 
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=22

    [36][Slide 23]

      [36] 
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=23

    [37][Slide 24]

      [37] 
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=24

    Bernard: question to WG is: is it a goal for MC to deprecate
    getCapabilities?

    youenn: my understanding is that media capabilities is really
    about audio/video capabilities
    … so it doesn't make sense to expose e.g. CN there
    … they should stay in WebRTC getCapabilities
    … getCapabilities() being sync is problematic; that's less of
    an issue for software capabilities such as CN
    … so deprecating getCapabilities fully is not a goal, but
    partially, yes

    Florent: +1 on the approach; usability of the resulting split
    is a concern

    chris: seems fine to use that split; do we want to return rtc
    codec info from media capabilities?
    … if so, please take a look at [38]https://github.com/w3c/
    media-capabilities/issues/185

      [38] https://github.com/w3c/media-capabilities/issues/185

    youenn: +1 on disambiguating the outcome of this situation
    … listing all codecs in just one call is a non goal
    … an SFU is typically only interested in a few codecs
    … for P2P, setCodecPreferences is probably not needed in the
    first place - you can deal with a generic codec negotiation

    jib: would be good to clarify if we want to deprecate "real"
    codecs from getCapabilities? this sounds like a good long term
    goal for me

    harald: I worry that RTX/RED/FEC info needs to be available
    somewhere
    … getCapabilities has known problems and would be the only way
    to get it
    … changing getCapabilities is actually harder than deprecating it
    … in the long run, it's best to deprecate getCapabilities and
    replace it with a better dedicated API
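
    The split under discussion can be sketched as a filter over the
    codec list. This assumes entries shaped like those returned by
    RTCRtpSender.getCapabilities(kind).codecs, reduced here to their
    mimeType field; the function name and the exact set of RTP-level
    mechanisms are assumptions made for illustration.

```typescript
// Minimal shape of a codec entry (only the field this sketch needs).
interface CodecEntry {
  mimeType: string;
}

// RTP-level mechanisms mentioned in the discussion (CN, RTX, RED, FEC).
// The exact subtype list is an assumption.
const RTP_MECHANISM_SUBTYPES = new Set(["cn", "rtx", "red", "ulpfec", "flexfec-03"]);

// Separate "real" media codecs from RTP-level mechanisms.
function splitCodecs<T extends CodecEntry>(codecs: T[]): { media: T[]; rtpOnly: T[] } {
  const media: T[] = [];
  const rtpOnly: T[] = [];
  for (const codec of codecs) {
    // "video/rtx" -> "rtx"
    const subtype = codec.mimeType.slice(codec.mimeType.indexOf("/") + 1).toLowerCase();
    (RTP_MECHANISM_SUBTYPES.has(subtype) ? rtpOnly : media).push(codec);
  }
  return { media, rtpOnly };
}

const sampleCodecs: CodecEntry[] = [
  { mimeType: "video/VP8" },
  { mimeType: "video/rtx" },
  { mimeType: "video/H264" },
  { mimeType: "video/red" },
  { mimeType: "video/ulpfec" },
];
const splitResult = splitCodecs(sampleCodecs);
```

    Under the approach discussed, the media entries are the ones
    that could move to Media Capabilities queries, while the rtpOnly
    entries would still need a home in a WebRTC-specific API.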

    Florent: two different scenarios for setCodecPreferences:
    talking with an SFU in which case you can make specific codec
    queries; in a P2P scenario, if you can't enumerate all the
    codecs, you won't be able to call setCodecPreferences
    … this would require hardcoding a list of codecs
    … is there a way to make getCapabilities evolve in a shape that
    would satisfy everyone?
    … getCapabilities+setCodecPreferences has a lot of current
    usage, will be hard to deprecate
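
    The usage pattern Florent refers to typically looks like the
    sketch below: take the full list from getCapabilities, move the
    preferred codec to the front, and hand the reordered list to
    setCodecPreferences. The helper name is hypothetical, and the
    browser calls appear only in comments so that the example runs
    outside a browser.

```typescript
// Minimal shape of a codec capability entry.
interface CodecCapabilityLike {
  mimeType: string;
  sdpFmtpLine?: string;
}

// Move every entry matching the preferred MIME type to the front,
// keeping the relative order of all other entries.
function preferCodec<T extends CodecCapabilityLike>(codecs: T[], preferredMimeType: string): T[] {
  const wanted = preferredMimeType.toLowerCase();
  const preferred = codecs.filter((c) => c.mimeType.toLowerCase() === wanted);
  const others = codecs.filter((c) => c.mimeType.toLowerCase() !== wanted);
  return [...preferred, ...others];
}

// In a browser, the input would come from
//   RTCRtpSender.getCapabilities("video")?.codecs
// and the output would be passed to
//   transceiver.setCodecPreferences(reordered)
const enumerated: CodecCapabilityLike[] = [
  { mimeType: "video/VP8" },
  { mimeType: "video/VP9" },
  { mimeType: "video/H264" },
];
const reordered = preferCodec(enumerated, "video/VP9");
```

    Without a way to enumerate the list, the application would have
    to hardcode it, which is the concern raised for the P2P case.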

     Issue [39]#100 [40]🎞︎

      [39] https://github.com/w3c/webrtc-extensions/issues/100
      [40] https://www.youtube.com/watch?v=qSlXLqouxCs#t=2747

    [41][Slide 25]

      [41] 
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=25

    [42][Slide 27]

      [42] 
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=27

    youenn: might be fine, but I worry about the defaults? would
    they be the same across browsers?
    … there are current codecs that are defaults, but that may need
    to evolve over time
    … this could create Web compat issues

    Sergio: some of the codecs are receive-only
    … the list would be based on common sense, but I don't have a
    strong opinion

    youenn: my worry is about P2P - if the defaults aren't the same
    across UAs, the negotiation will fail

    sergio: my suggestion was to use defaults in the offer, and
    adapt the answer based on the offer

    harald: two interfaces are needed: the list of codecs you are
    currently willing to offer, and the set of codecs you can offer
    … the 1st one might be getCapabilities, the proposal on the
    slide for the 2nd
    … in terms of interop, MTI codecs should be the safety net, and
    they should be in the mandatory-to-offer
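
    The interop concern and the safety net can be illustrated with a
    simplified negotiation over codec names. The default lists below
    are invented for the example, and the helper names are
    hypothetical; a real negotiation also compares codec parameters.
    VP8 and H.264 are the mandatory-to-implement video codecs for
    WebRTC browsers (RFC 7742).

```typescript
// The answer keeps the offered codecs that the answerer supports,
// in the offerer's order.
function answerCodecs(offered: string[], answererSupported: string[]): string[] {
  const supported = new Set(answererSupported.map((c) => c.toLowerCase()));
  return offered.filter((c) => supported.has(c.toLowerCase()));
}

// Append any mandatory-to-implement codec missing from the default offer.
function withMandatoryToOffer(defaults: string[], mti: string[]): string[] {
  const present = new Set(defaults.map((c) => c.toLowerCase()));
  return [...defaults, ...mti.filter((c) => !present.has(c.toLowerCase()))];
}

const MTI_VIDEO = ["VP8", "H264"];
const offerDefaultsA = ["AV1"]; // invented default list for UA A
const supportedByB = ["VP9", "VP8", "H264"]; // invented support list for UA B

// Disjoint defaults: nothing in common, so negotiation fails.
const withoutSafetyNet = answerCodecs(offerDefaultsA, supportedByB); // []

// With MTI codecs always offered, a common codec is found.
const withSafetyNet = answerCodecs(
  withMandatoryToOffer(offerDefaultsA, MTI_VIDEO),
  supportedByB,
); // ["VP8", "H264"]
```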

    florent: the proposal seems to have a lot of overlap with
    setCodecPreferences / getParameters - could we improve these
    instead of coming up with a new API?

    [Philipp supports this on the chat]

    Sergio: would be fine; I started from the rtp header
    extensions, maybe that should apply there?

    florent: the difference is that there is already an API to set
    codec preferences

    sergio: but header extensions could be added there too?

    Bernard: let's continue the discussion in the issue
    … or work on a matching PR

   [43]https://github.com/w3c/mediacapture-extensions/issues/47 Voice
   Isolation Constraint [44]🎞︎

      [43] https://github.com/w3c/mediacapture-extensions/issues/47
      [44] https://www.youtube.com/watch?v=qSlXLqouxCs#t=3316

    [45][Slide 41]

      [45] 
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=41

    [46][Slide 42]

      [46] 
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=42

    Resolution for issue 95: mark issue as ready for PR

    [47][Slide 43]

      [47] 
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=43

    [48][Slide 44]

      [48] 
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=44

    [49][Slide 45]

      [49] 
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=45

    youenn: it makes sense; reasonable to ignore `noiseSuppression`
    … there is also `echoCancellation` in the audio pipeline
    … does it make sense to do `echoCancellation` when this is set?

    harald: I think it's mostly orthogonal

    youenn: so `echoCancellation: false` is compatible with
    `voiceIsolation: true`
    … it may be challenging for some implementations to support
    these combinations

    jan-ivar: I like this too; what should the default be? that may
    bring concerns

    harald: we can discuss this in the PR
    … conservatively, the default should be the current behavior
    (false)

    dom: instead of boolean, we could use strings for extra
    flexibility.
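
    A sketch of how an application might build its audio constraints
    if the proposal lands as a boolean. `voiceIsolation` is the
    constraint proposed in issue #47 and was not specified at the
    time of this call; the helper name is hypothetical, and falling
    back to `noiseSuppression` when it is unsupported is an
    assumption of this example. The first argument stands in for
    the result of navigator.mediaDevices.getSupportedConstraints().

```typescript
// Plain-record stand-in for the dictionary returned by getSupportedConstraints().
type SupportedConstraintsLike = Record<string, boolean | undefined>;

interface AudioConstraintSketch {
  echoCancellation: boolean;
  noiseSuppression?: boolean;
  voiceIsolation?: boolean; // proposed constraint, see issue #47
}

function buildAudioConstraints(
  supported: SupportedConstraintsLike,
  options: { voiceIsolation: boolean; echoCancellation: boolean },
): AudioConstraintSketch {
  // echoCancellation is treated as orthogonal: it is passed through unchanged.
  const constraints: AudioConstraintSketch = { echoCancellation: options.echoCancellation };
  if (options.voiceIsolation && supported["voiceIsolation"]) {
    // noiseSuppression is left out, since it may be ignored in this mode.
    constraints.voiceIsolation = true;
  } else {
    // Assumed fallback when voiceIsolation is not wanted or not supported.
    constraints.noiseSuppression = true;
  }
  return constraints;
}

const isolated = buildAudioConstraints(
  { voiceIsolation: true, echoCancellation: true },
  { voiceIsolation: true, echoCancellation: false },
);
const fallback = buildAudioConstraints(
  { echoCancellation: true },
  { voiceIsolation: true, echoCancellation: true },
);
```

    In a browser the result would be passed as the `audio` member of
    the getUserMedia constraints.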

    RESOLUTION: mark voiceIsolation [50]issue #47 as ready for PR

      [50] https://github.com/w3c/mediacapture-extensions/issues/47

   [51]support for contentHint in Capture Handle [52]🎞︎

      [51] https://github.com/w3c/mediacapture-handle/issues/35
      [52] https://www.youtube.com/watch?v=qSlXLqouxCs#t=3867

    [53][Slide 48]

      [53] 
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=48

    [54][Slide 49]

      [54] 
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=49

    [55][Slide 50]

      [55] 
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=50

    [56][Slide 51]

      [56] 
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=51

    youenn: setting the track hint is unnecessary - if the capturer
    is setting the hint on its side, the UA knows that the track
    being captured is text - there is no need to transmit it to the
    capturer
    … except maybe if WebCodecs is in the picture
    … having the UA taking care of this seems preferable

    elad: the suggestion would be that the captured content
    self-declares its type and the UA uses it?
    … but that removes the liberty of the capturer to decide
    whether to use the hint or not
    … which could be based on e.g. an allowlist
    … autodetection by the UA would have its own limitations

    bernard: re the WebCodecs case - contentHint is not
    automatically consumed by WebCodecs, it's up to the app to
    apply it as a codec setting
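
    Bernard's point can be pictured as an explicit mapping step in
    the application. The hint values are those defined for video
    tracks by MediaStreamTrack.contentHint; the settings type, its
    field names, and the chosen numbers are hypothetical and are not
    WebCodecs configuration members.

```typescript
// contentHint values for video tracks ("" means no hint was set).
type VideoContentHint = "" | "motion" | "detail" | "text";

// Hypothetical app-level tuning derived from the hint.
interface EncoderTuningSketch {
  framerate: number;
  preferSharpness: boolean;
}

function tuningForHint(hint: VideoContentHint): EncoderTuningSketch {
  switch (hint) {
    case "text":
    case "detail":
      // Static, detailed content: favour legibility over smooth motion.
      return { framerate: 5, preferSharpness: true };
    case "motion":
      // Moving content: favour frame rate over per-frame detail.
      return { framerate: 30, preferSharpness: false };
    default:
      // No hint: a middle-of-the-road choice.
      return { framerate: 15, preferSharpness: false };
  }
}

const textTuning = tuningForHint("text");
const motionTuning = tuningForHint("motion");
```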

    jib: I agree with youenn that the UA is in a good place to
    short-circuit the capturer part
    … the proposal could be useful for the capturee side
    … exposing further metadata to the controller might be an
    interesting addition to my capturecontroller proposal

    youenn: it could be exposed at the videoframe level

    jib: I see agreement on the need, not yet on the API shape

   [57]WebRTC Extensions [58]🎞︎

      [57] https://github.com/w3c/webrtc-extensions
      [58] https://www.youtube.com/watch?v=qSlXLqouxCs#t=4571

   [59]Avoid user-confusion by avoiding offering undesired audio sources
   [60]🎞︎

      [59] https://github.com/w3c/mediacapture-screen-share/issues/219
      [60] https://www.youtube.com/watch?v=qSlXLqouxCs#t=4583

    [61][Slide 54]

      [61] 
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=54

    [62][Slide 55]

      [62] 
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=55

    [63][Slide 56]

      [63] 
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=56

    Tim: is this only applicable for echo management?

    elad: it could be that an application is interested in
    recording a specific tab, no more than that.

    Tim: this use case does not seem to be addressed: identifying
    the desired tab would be needed.

    Elad: some VC applications usually do not want to capture
    system audio.

    Jan Ivar: supportive, how about reusing displaySurface
    constraint here?

    Elad: Might work for me.

    Jan Ivar: I would like to remove monitor from here.

    dom: if we do not include monitor here, audio: true might
    capture system audio. But applications would not be able to
    explicitly ask for system audio.

    dom: displaySurface would be a strange name for audio.

    youenn: let's enumerate the different approaches:
    avoidSystemAudio, displaySurface, sources

    youenn: scope is unclear, we need to clarify this before going
    to PR.

    youenn: different properties allow feature detection of what
    kind of recording the UA can do

    elad: my focus is only limiting access to system audio, but I
    also think flexibility is helpful
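
    As an illustration of the narrow goal elad describes, the sketch
    below filters what a picker would offer when the application
    asks to avoid system audio. It uses `avoidSystemAudio`, one of
    the candidate names listed above; the source kinds, types, and
    function are hypothetical, and no API shape had been agreed.

```typescript
// Hypothetical kinds of audio a screen-capture picker might offer.
type AudioSourceKind = "tab" | "window" | "system";

// Hypothetical application preference, using one of the candidate names.
interface AudioSourcePreferenceSketch {
  avoidSystemAudio?: boolean;
}

// Return the audio sources the picker would still offer to the user.
function offeredAudioSources(
  available: AudioSourceKind[],
  preference: AudioSourcePreferenceSketch,
): AudioSourceKind[] {
  return preference.avoidSystemAudio
    ? available.filter((kind) => kind !== "system")
    : available.slice();
}

const allKinds: AudioSourceKind[] = ["tab", "window", "system"];
const offeredToVcApp = offeredAudioSources(allKinds, { avoidSystemAudio: true });
const offeredByDefault = offeredAudioSources(allKinds, {});
```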

    timp: back to my echocancellation point - the constraint could
    be linked to whether the source can be echocancelled

    Harald: source being echo cancellable is a second concern.
    Biggest point is avoiding system audio.

    Tim: as well as window audio.

    harald: echoCancellation is a secondary concern - capturing
    system audio could disclose info from a 3rd party

   [64]Region Capture [65]🎞︎

      [64] https://github.com/w3c/mediacapture-region
      [65] https://www.youtube.com/watch?v=qSlXLqouxCs#t=5905

    RESOLUTION: continue discussions on GitHub.

    [66][Slide 59]

      [66] 
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=59

    youenn: [67]#11 is an issue on the shape of the CropTarget API
    … given the current Chrome implementation work, it feels useful
    to converge on the API shape

      [67] https://github.com/w3c/mediacapture-region/issues/11

    [68][Slide 60]

      [68] 
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=60

    youenn: do we want to attach the API to element or to
    MediaDevices?
    … element feels like a better path

    jib: +1

    elad: I prefer mediaDevices given its linkage to screen capture

    youenn: cropTarget is linked to MediaStreamTrack, not
    mediaDevices
    … and it's really tied to an element

    elad: it can be used through an object you get from
    getDisplayMedia

    youenn: but with a detached mediaDevices, you can't reject the
    promise

    dom: prefer element option.

    youenn: next question is attribute vs method
    … slight pref for attribute, but no strong feeling

    elad: there is a cost to minting a crop target - we mark the
    element in the rendering pipeline in specific ways that we
    shouldn't abuse

    youenn: I thought you were going to use a lazy approach to
    reduce that cost

    elad: lazy tagging might help, but this needs more thinking
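
    One possible reading of the lazy approach, sketched under the
    assumption that the expensive work can be deferred until a
    target is first requested for an element and then cached. All
    names are hypothetical, a plain object stands in for a DOM
    element, and this does not describe Chrome's implementation.

```typescript
// Counts how many times the (assumed expensive) minting step ran.
let mintCount = 0;

// Cache keyed by element; entries disappear when the element is collected.
const mintedTokens = new WeakMap<object, string>();

// Mint a token the first time it is requested for an element, then reuse it.
function cropTokenFor(element: object): string {
  let token = mintedTokens.get(element);
  if (token === undefined) {
    mintCount++;
    // A plain string, so it could be sent to another context via postMessage.
    token = `crop-target-${mintCount}`;
    mintedTokens.set(element, token);
  }
  return token;
}

const elementA = {};
const elementB = {};
const firstToken = cropTokenFor(elementA);
const repeatToken = cropTokenFor(elementA); // reuses the cached token
const otherToken = cropTokenFor(elementB);
```

    Elements that are never asked for a token cost nothing in this
    sketch, which is the property the lazy approach is after.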

    jib: +1 to attribute
    … developer value trumps implementer value

    elad: I don't think it matters much to developers in the first
    place

    harald: disagree with messing with the element interface, and
    on hiding the fact that the operation has a cost
    … also async (promises) may be needed for some implementations
    … let's not hide the reality of the situation

    jib: the cost seems to be Chrome-specific
    … the real goal of this API is a transferable reference

    youenn: +1
    … other APIs in the past have re-used the element interface,
    have made similar decisions on methods / attributes, async vs
    sync
    … we should follow existing implemented patterns

    dom: is there any other API that may use this transferable
    reference?

    youenn: that's something I bring up in the issue

    elad: this may create unsafe usage for this well-defined target

    jan-ivar: this could be evaluated

    hta: but this shouldn't block progress on the specific narrow
    goal we have

    youenn: my focus is aligning with current API patterns for this
    API

    elad: the TAG will chime in; but if they don't give a clear
    specific suggestion
    … we could move with the current design that can be polyfilled
Received on Monday, 2 May 2022 13:19:28 UTC