[minutes] WebRTC WG Meeting November 21st 2023


The minutes of our meeting yesterday (Nov 21st) are available at:
with the video recording at:

Text copy of the minutes below.


                       WebRTC November 2023 meeting

21 November 2023

    [2]Agenda. [3]IRC log.

       [2] https://www.w3.org/2011/04/webrtc/wiki/November_21_2023
       [3] https://www.w3.org/2023/11/21-webrtc-irc


           AlfredHeggestad, Bernard, Carine, Dom, EeroHäkkinen,
           Elad, Fippo, Florent, Guido, Harald, Henrik, Jan-Ivar,
           PatrickRockhull, PaulAdenot, PeterThatcher, Riju,
           Sameer, StefanHolmer, SunShin, TimPanton, TovePetersson,


           Bernard, HTA, Jan-Ivar



     1. [4]Mediacapture-extensions
          1. [5]Issue #121: Background Blur: Unprocessed video
             should be mandatory
          2. [6]Issue #129: [Audio-Stats] Disagreement about audio
             dropped counters
     2. [7]WebRTC Grab Bag
          1. [8]Issue 146 Exposing decode errors/SW fallback as an
             event
          2. [9]Issue 92: Align exposing scalabilityMode with
             WebRTC “hardware capabilities” check
     3. [10]SDP Issue 186: New API for SDP negotiation
     4. [11]RtpTransport
          1. [12]Bandwidth Estimation
          2. [13]Forwarding
          3. [14]Using existing m-lines
     5. [15]Multi-mute (Three Thumbs Up - Setup)

Meeting minutes

    Recording: [16]https://youtu.be/xJMXnf3Qwh8

      [16] https://youtu.be/xJMXnf3Qwh8



    Slideset: [18]https://lists.w3.org/Archives/Public/www-archive/


   [19]Mediacapture-extensions [20]🎞︎

      [19] https://github.com/w3c/mediacapture-extensions/
      [20] https://www.youtube.com/watch?v=xJMXnf3Qwh8#t=93

     Issue [21]#121: Background Blur: Unprocessed video should be
     mandatory [22]🎞︎

      [21] https://github.com/w3c/mediacapture-extensions/issues/121
      [22] https://www.youtube.com/watch?v=xJMXnf3Qwh8#t=93

    [23][Slide 11]


    Hta: related to "three thumbs up" that will be presented later
    in the agenda

    Elad: that one will focus on the mute state

    Hta: it would be much easier when there are multiple layers
    that can have an opinion on whether an effect will be applied
    if there was a way to disable the effect

    Riju: this sounds nice, but it is not clear that it can be
    supported on macOS at the platform level

    Youenn: it's correct that there are OSes where this would not
    be possible at the moment
    … this "MUST" would not be implementable on these OSes
    … making it a SHOULD dependent on it being feasible (e.g. some
    cameras may not allow it)

    HTA: we could have a MUST with a note that this won't be
    implemented everywhere

    Youenn: the risk would be to discourage implementation of
    background blur

    Youenn: SHOULD would be preferable

    HTA: I'll try to phrase something making it clear it would only
    be if there is a good reason not to

    Elad: background blur is a problem we want to solve, but one of
    several issues related to these effects and post-processings
    that can be applied by the UA or the OS
    … it's a complicated space, we shouldn't tie our hands too
    early; we could come back with a more generalized approach in

    HTA: next time I expect there will be a specific proposal to

    Bernard: for audio, we had a way to request specifically
    unprocessed audio
    … it helps in that it is forward compatible

    Jan-Ivar: capabilities should reflect what can and cannot be
    … not try to impose what the OS can do

    HTA: we've helped push VP8 support at the OS level, so this
    wouldn't be unprecedented

    TimP: if I had a system setting to enforce background blur, I
    wouldn't want the UA to override it
    … this has privacy impact
    … this also illustrates that there may be different rules
    across capabilities

    HTA: discussion to continue on the bug with considerations on
    privacy, OS-implementability, and harmonization across

     Issue [24]#129: [Audio-Stats] Disagreement about audio dropped
     counters [25]🎞︎

      [24] https://github.com/w3c/mediacapture-extensions/issues/129
      [25] https://www.youtube.com/watch?v=xJMXnf3Qwh8#t=644

    [26][Slide 12]


    Henrik: in previous meetings, we agreed on a set of audio stats
    to measure glitches

    [27][Slide 13]


    Henrik: look for a decision on whether to expose audio frame
    drops to JS

    [Fippo emoji-supports]

    TimP: +1 on exposing it; there are ways to react too (e.g.
    turning up FEC, changing p-time, ...)
    … it's useful for the app to see this

    youenn: it's fine as long as there are no fingerprinting
    issues - we should look at that
    … e.g. maybe audio drop patterns may identify a device
    … we should have the same discussion with video frame drops
    … on macOS there is no way for video frames to be dropped

    Paul: this is about local things - no network or packets
    … i.e. microphone to the page only
    … re fingerprinting, if there is a certain device showing
    issues, the UA is responsible for dealing with it
    … FF cannot drop audio between microphone and the Web page - it
    cannot happen
    … except if the machine is overloaded with real-time threads,
    but that's an edge case since at that point other threads
    wouldn't work either
    … so this feels like UA architectural bugs, and thus not
    something that should be exposed to the Web
    … for videos, dropping frames would be expected since the
    perceptual constraints are different

    Henrik: any time we do a perf related experiment, we see issues
    … I don't think we can mandate that these issues can't happen
    … even if it was only a UA bug, there would still be value in
    exposing this, e.g. for bug reports

    Harald: 3 parties in this game: the platform, the UA, the app
    … all 3 have opportunities to mess up audio
    … independently
    … many UAs are running on multiple OSes and multiple OS
    versions, with different features and different bugs
    … there will be many cases of these combinations
    … Paul mentioned instrumentation and telemetry
    … the use case of WebRTC is so small that you have to have
    dedicated telemetry to make it show up at all
    … having the ability to have the application report on what
    happens when *it* runs, and not in the general case when the UA
    runs is important
    … my conclusion is that we should expose this to JS

    TimP: is it really the case that changing the framerate and
    encoder settings would have no impact?

    Paul: this is before encoding

    Harald: this would have an impact in the PC

    Henrik: adding encoder load could have an impact

    TimP: surely the sample rate from the microphone is affected by
    … overall, it's not implausible you could influence it from the
    app; but in any case, would be good to have the information

    youenn: native apps have access to this info
    … the app could decide, e.g., to mute capture in some circumstances
    … unless there are security or privacy issues, I don't see a
    reason not to expose it to Web apps as well
    … in terms of telemetry, the app could have telemetry of its
    … I still support implementing it

    Paul: there is nothing you can do in a Web page that will
    change the complexity of the CPU usage on the real-time thread
    that is used for input, except disabling @@@

    Henrik: we have real-world data from Google with playout
    glitches due to lost frames; software/hardware encoder has an
    impact, also quality of device

    HTA: I'm hearing rough consensus except for Paul

    Jan-Ivar: I support Paul; FF would not implement this API since
    we don't think it's needed
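
    A minimal sketch of how an app might use such counters if they
    were exposed, e.g. to decide when to react as TimP suggests. The
    counter names (totalFrames, deliveredFrames) and the helper
    dropRatio are assumptions made for illustration; the minutes do
    not settle on an API shape.

```typescript
// Hypothetical snapshot of per-track audio capture counters.
// Field names are assumptions for illustration, not a confirmed API.
interface AudioFrameCounters {
  totalFrames: number;     // frames produced by the capture source so far
  deliveredFrames: number; // frames that actually reached the page so far
}

// Fraction of frames dropped between two snapshots, in the range [0, 1].
// Returns 0 when no new frames were produced during the interval.
function dropRatio(prev: AudioFrameCounters, curr: AudioFrameCounters): number {
  const produced = curr.totalFrames - prev.totalFrames;
  const delivered = curr.deliveredFrames - prev.deliveredFrames;
  if (produced <= 0) {
    return 0;
  }
  const dropped = Math.max(0, produced - delivered);
  return Math.min(1, dropped / produced);
}
```

    Working on deltas between snapshots, rather than on lifetime
    totals, keeps a glitch early in the call from masking the current
    state of the capture pipeline.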

   WebRTC Grab Bag [28]🎞︎

      [28] https://www.youtube.com/watch?v=xJMXnf3Qwh8#t=1722

     [29]Issue 146 Exposing decode errors/SW fallback as an event

      [29] https://github.com/w3c/webrtc-extensions/issues/146
      [30] https://www.youtube.com/watch?v=xJMXnf3Qwh8#t=1722

    [31][Slide 14]


    youenn: looking at the privacy comment, I'm not reading it as
    "this is fine"

    youenn: I don't see how we could isolate these events across
    … we could try to go with a PR, but this will need a closer
    privacy look before merging

     [32]Issue 92: Align exposing scalabilityMode with WebRTC “hardware
     capabilities” check [33]🎞︎

      [32] https://github.com/w3c/webrtc-svc/issues/92
      [33] https://www.youtube.com/watch?v=xJMXnf3Qwh8#t=1870

    [34][Slide 15]


    [35][Slide 16]


    [36][Slide 17]


    Bernard: some of the modes are supported, but not as "smooth"
    or "powerEfficient" despite the machine being high specs
    … the hardware acceleration is not exposed for SVC

    Henrik: in ChromeOS we can do some of the SVC modes in power
    efficient; L1T1 in Windows
    … there is what the device can do and what the UA has
    implemented, and how accurately this is represented in
    … on Windows lots of devices where L1T2 is available but not
    exposed in the UA yet

    Bernard: webrtc-pc doesn't expose whether something is
    powerEfficient, only if it is supported

    [37][Slide 18]


    Bernard: proposal is to bring something back to SVC if/when
    media capabilities limits this exposure

    Jan-Ivar: hardware support being limited today doesn't mean it
    will be tomorrow
    … but in general, +1 to bringing this to media capabilities
    … the "capture check" is not necessarily a good fit for all use
    cases (e.g. games over data channel)
    … it also risks driving to escalating permissions

    Henrik: I'm hearing agreement that media capabilities should
    solve this
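
    A rough sketch of the kind of Media Capabilities query under
    discussion: asking, per scalability mode, whether encoding is
    supported and whether it is power efficient. The helper names
    (svcQuery, classify) and the fixed resolution, bitrate and frame
    rate are illustrative assumptions.

```typescript
// Shape of a Media Capabilities answer (supported / smooth / powerEfficient).
interface CapabilityInfo {
  supported: boolean;
  smooth: boolean;
  powerEfficient: boolean;
}

// Shape of a "webrtc" encoding query for a single video configuration.
interface WebRtcVideoQuery {
  type: "webrtc";
  video: {
    contentType: string;
    width: number;
    height: number;
    bitrate: number;
    framerate: number;
    scalabilityMode: string;
  };
}

// Build a query for one codec and one SVC mode.
// 720p30 at 1.5 Mbps is an arbitrary choice made for this sketch.
function svcQuery(contentType: string, scalabilityMode: string): WebRtcVideoQuery {
  return {
    type: "webrtc",
    video: { contentType, width: 1280, height: 720, bitrate: 1500000, framerate: 30, scalabilityMode },
  };
}

// Collapse the answer into the distinction discussed in the meeting:
// a mode can be supported without being power efficient.
function classify(info: CapabilityInfo): "unsupported" | "supported" | "power-efficient" {
  if (!info.supported) {
    return "unsupported";
  }
  return info.powerEfficient ? "power-efficient" : "supported";
}
```

    In a browser, the query object would be passed to
    navigator.mediaCapabilities.encodingInfo(), and the resolved
    result handed to classify().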

   [38]SDP Issue 186: New API for SDP negotiation [39]🎞︎

      [38] https://github.com/w3c/webrtc-encoded-transform/issues/186
      [39] https://www.youtube.com/watch?v=xJMXnf3Qwh8#t=2385

    [40][Slide 24]


    HTA: same functionality, different API shape

    [41][Slide 25]


    [42][Slide 26]


    [43][Slide 27]


    HTA: with the API on the transform, the app would have to
    talk to the transform, and the transform would have to talk to
    the SDP, which seems bad

    Jan-Ivar: the Transform between the encoder and the packetizer
    is not the one we're talking about
    … we're talking about RTCScriptTransform which runs on the main
    thread and which would say "here is how this transform should
    be exposed in terms of codecs"
    … it's an inherent property of the Transform
    … I don't think we should organize APIs around SDP but around
    the functionalities as perceived by Web developers

    HTA: the purpose of isolating SDP is to not entangle SDP with
    stuff that we might want to use without SDP
    … keeping a distinction is a good thing in that sense

    Jan-Ivar: transceiver.sender/receiver are always created at the
    same time

    HTA: at the moment yes

    Youenn: at some point we'll have SFrame packetization that
    we'll likely want to use in connection with SFrameTransform
    … I wonder if looking how SFrame would work in this model would
    help establish the underlying infrastructure

    HTA: when installing an SFrameTransform, SDP has to be
    affected, but the sframe spec doesn't say how yet

    Bernard: the SFrame packetization spec covers RTP

    Peter: it's a work in progress

    HTA: not sure we can depend on that progress to drive our
    architectural decisions

    Henrik: in my mind, a transceiver and an m-section map
    … in Jan-Ivar's proposal where the Transform contains
    information about SDP - would the only difference be that the
    transceiver asks the Transform what its payload type is?
    … does it make a huge difference between the two? e.g. will it
    be the same number of methods

    Jan-Ivar: my proposed API is much simpler - it's one step done
    through the constructor of the transform
    … it's not true that Transceiver encompasses all the
    negotiation needs (e.g. addTrack / createDataChannel)
    … using two codec names is confusing - the packetization should
    be a subproperty of the codec

    HTA: I look forward to a more fleshed-out proposal in that

    [44][Slide 28]


    Fippo: I would say the PT
    … since the mime type is the result of the lookup of the
    payload type

    Jan-Ivar: I was going to say the opposite, but Fippo's point
    makes sense
    … how would you find the PT?

    Harald: go through the codecs from getParameters

    Henrik: if it's a one-to-one mapping, the ergonomics would go
    for mime type

    HTA: if we move a frame between PC, the PT may not have the
    same meaning, when the mime type does

    youenn: I would go with PT; PT and mime going out of sync may
    suggest there should be a different field for packetization

    HTA: you could specify that once you enqueue a frame, it sets
    one field based on the other and ignores it if it can't find
    the mapping

    [fippo: +1]

    HTA: I'm hearing a slight preference for mime type, but this
    needs more discussion - will summarize it on the github issue
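
    A small sketch of the lookup Harald describes: mapping between
    payload type and mime type by walking the codec list that
    getParameters() returns. The CodecEntry type and both helper
    names are illustrative; only the payloadType and mimeType fields
    are taken from the codec parameters.

```typescript
// The two fields of a negotiated codec entry that matter for this lookup.
interface CodecEntry {
  payloadType: number;
  mimeType: string;
}

// Mime type negotiated for a given payload type, if any.
function mimeTypeFor(codecs: CodecEntry[], payloadType: number): string | undefined {
  const match = codecs.find((c) => c.payloadType === payloadType);
  return match ? match.mimeType : undefined;
}

// First payload type negotiated for a given mime type, if any.
// Several payload types can share one mime type, so this direction
// is only unambiguous when the mapping is one-to-one.
function payloadTypeFor(codecs: CodecEntry[], mimeType: string): number | undefined {
  const wanted = mimeType.toLowerCase();
  const match = codecs.find((c) => c.mimeType.toLowerCase() === wanted);
  return match ? match.payloadType : undefined;
}
```

    In a page, the codecs argument would come from
    sender.getParameters().codecs. Because payload types are
    negotiated per connection, the same number can map to a different
    mime type on another PC, which is the concern HTA raises above.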

   [45]RtpTransport [46]🎞︎

      [45] https://github.com/w3c/webrtc-rtptransport
      [46] https://www.youtube.com/watch?v=xJMXnf3Qwh8#t=3615

    [47][Slide 31]


    [48][Slide 33]


     Bandwidth Estimation [49]🎞︎

      [49] https://www.youtube.com/watch?v=xJMXnf3Qwh8#t=3683

    [50][Slide 35]


    [51][Slide 36]


    Bernard: the RtpTransport is both for sending and receiving -

    Peter: right

    Stefan: have you thought about what it would look like for a
    user of this API that would want to implement its own bandwidth
    control in the Web app?
    … would this be done through this API or something else?

    Peter: I have thought about that, think we should discuss it,
    but not today :)

    Jan-Ivar: the other transports have back-pointers from the
    transport; shouldn't this be under the dtlsTransport

    Peter: the dtlsTransport should be under the rtpTransport, but
    we can't change this at this point

    Orphis: with Bundle semantics, I think you really need it on
    each media section

     Forwarding [52]🎞︎

      [52] https://www.youtube.com/watch?v=xJMXnf3Qwh8#t=3939

    [53][Slide 38]


     Using existing m-lines [54]🎞︎

      [54] https://www.youtube.com/watch?v=xJMXnf3Qwh8#t=3984

    [55][Slide 40]


    [56][Slide 41]


    [57][Slide 42]


    HTA: this crosses multiplexing
    … I would rather not make it too easy to go over these bumper

    Jan-Ivar: we're already at a much lower level than I'm
    comfortable with; this is a very low level API
    … if every sender has an RTPTransport, does it also show up if
    it has a track
    … I was imagining more of a new type of datachannel
    … this would be mutually exclusive with the track API
    … rather than exposing a JS API to programmatically send packets
    … the benefits of using a writable that we're seeing for
    WebTransport help let the UA manage the throughput and keep
    … I'm looking for an API that allows changing the source of
    RTP rather than controlling RTP

    Florent: how would the API work with the various ways of
    encrypting data in an RTP packet, e.g. cryptex

    Peter: we haven't talked yet about encryption of header
    extensions - a good topic to cover, like Stefan's; this would
    need more time

    Henrik: you could register to send for a specific payload

    Peter: I'd need to see examples of what you're thinking to help
    … I think a more in-depth design meeting would be useful

    Bernard: the point about cryptex makes it clear that there
    needs to be a facility to create a fully-formed packet for the
    … WhatWG streams won't do SRTP when you call write

    Dom: so the Chairs should figure a separate meeting for more
    in-depth design discussions

   [58]Multi-mute (Three Thumbs Up - Setup) [59]🎞︎

      [58] https://github.com/w3c/mediacapture-extensions/issues/39
      [59] https://www.youtube.com/watch?v=xJMXnf3Qwh8#t=4955

    [60][Slide 58]


    Elad: we'll focus on muting today

    [61][Slide 59]


    [62][Slide 60]


    [63][Slide 61]


    [64][Slide 62]


    Elad: this is a problem worth solving

    [65][Slide 63]


    [66][Slide 64]


    Elad: I think exposing upstream state is more important than
    changing it

    [67][Slide 65]


    Elad: the mute event doesn't suffice - it is defined as a
    temporary inability to provide media
    … muted refers to the input to the mediastreamtrack
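
    To illustrate the distinction Elad draws, here is a minimal
    sketch of what an app can infer today from the existing track
    attributes (enabled, muted, readyState). The TrackLike and
    CaptureState types and the captureState helper are illustrative
    names, not part of any spec.

```typescript
// The subset of MediaStreamTrack attributes used by this sketch.
interface TrackLike {
  enabled: boolean;              // controlled by the app
  muted: boolean;                // controlled by the source (UA, OS, device)
  readyState: "live" | "ended";
}

type CaptureState = "ended" | "disabled-by-app" | "muted-upstream" | "flowing";

// Summarize why a track is or is not delivering media.
// "muted-upstream" is as precise as the current attributes allow:
// they do not say who muted the source, or whether the user intended it.
function captureState(track: TrackLike): CaptureState {
  if (track.readyState === "ended") {
    return "ended";
  }
  if (!track.enabled) {
    return "disabled-by-app";
  }
  if (track.muted) {
    return "muted-upstream";
  }
  return "flowing";
}
```

    The single "muted-upstream" bucket is the gap the proposals in
    the following slides aim to narrow.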

    [68][Slide 66]


    [69][Slide 67]


    [70][Slide 68]


    [71][Slide 69]


    [72][Slide 70]


    jan-ivar: hear 3 proposals: muteReasons, potentiallyActionable,
    … I support requestUnmute
    … I think muteReasons should only have 2 states; things outside
    of the browser can be correlated across origins
    … regarding requestUnmute, we already have a toggleMicrophone
    in the media session API

    Elad: re muteReasons, having more granularity would be nice to
    deal with the diversity of OSes

    Jan-Ivar: the UA would be responsible for dealing with the
    upstream OS when it can't deal with requestUnmute on its own

    Youenn: I see convergence on requestUnmute
    … depending on how scary calling that method would be for the
    Web app, it may impact how much information to expose in
    … in terms of the boolean, some of the definitions wouldn't be
    implementable e.g. in iOS
    … hence why we should focus on requestUnmute first
    … requestMute would be nice for later

    Elad: requestMute would risk muting other apps - but it feels
    orthogonal anyway
    … re boolean, see [slide 66]
    … if we don't expose the distinction between UA and OS (e.g.
    call it "upstream"), would that work for you?

    Youenn: I would want to better understand requestUnmute
    … I believe that will help drive the discussion on the boolean
    value - I'm not opposed to it

    Elad: I would argue that the MuteSource is useful even if
    requestUnmute is never called

    Youenn: the case that is interesting is the one where the user

    Guido: right now, the spec doesn't define muted the way Youenn
    suggests; any interruption in the capture causes a muted event
    … it doesn't reflect user intent

    Jan-Ivar: the examples from the spec refer to user-intended
    … maybe we should fix the spec to allow the distinction between
    a "temporal" mute and a user-intended mute

    Elad: changing the spec and risking breaking existing
    implementations will be painful compared to just adding a

    Jan-Ivar: I would be happy to propose slides with toggleMic /
    … mutesource has value (but not with so many values)

    Harald: OS capabilities change over time, we shouldn't limit
    ourselves to these current capabilities

    Guido: re media session, would this be re-using the event or
    interacting with the media session API?

    Jan-Ivar: the event

    Bernard: given how much content we didn't cover, should we
    schedule another meeting in December?

Received on Wednesday, 22 November 2023 09:08:35 UTC