Re: [minutes] June 18 meeting

Video recording:

On Wed, Jun 19, 2024 at 8:38 AM Dominique Hazael-Massieux <>

> Hi,
> The minutes of our meeting held yesterday (June 18) are available at:
> and copied as text below.
> Dom
>                        WebRTC June 18 2024 meeting
> 18 June 2024
>     [2]Agenda. [3]IRC log.
> Attendees
>     Present
>            Bernard, Dom, FlorentCastelli, Harald, Jan-Ivar,
>            JianjunZhu, PatrickRockhill, PeterThatcher, Sameer,
>            SunShin, TimP, TonyHerre, Youenn
>     Regrets
>            -
>     Chair
>            Bernard, HTA, Jan-Ivar
>     Scribe
>            dom
> Contents
>      1. [4]Future meetings
>      2. [5]Proposal: merging more extensions in WebRTC Rec
>      3. [6]WebRTC Charter Renewal
>      4. [7]Media Capture Extensions
>           1. [8]Issue #145 Consider adding onVoiceActivity event on
>              MediaStreamTrack for audio
>           2. [9]Issue #149 How to select camera presets that have
>              better power efficiency at the expense of quality?
>      5. [10]ICE improvements: send and prevent ICE check over a
>         candidate pair
>      6. [11]WebRTC: PC.local_description and friends - snapshot
>         views or dynamic views?
>      7. [12]Should devicechange fire when the device info changes?
>      8. [13]Summary of resolutions
> Meeting minutes
>     Slideset: [14]
>     2024Jun/att-0002/WEBRTCWG-2024-06-18.pdf
>       [14]
>    Future meetings
>     Bernard: there has been discussion of canceling our scheduled
>     meeting on July 16 and meeting on August 27 instead
>     … any objection to that plan?
>     [none expressed]
>     RESOLUTION: cancel July 16 meeting and schedule one on Aug 27
>    [15]Proposal: merging more extensions in WebRTC Rec
>     [16][Slide 10]
>     Dom: any comment or objection to the proposal?
>     Jan-Ivar: agree on the extension spec being confusing
>     … this isn't a charter change, right?
>     Dom: correct
>     Jan-Ivar: no objection, wonder if this could be applied to
>     mediacapture-main pre-Rec
>     Dom: yes, the process would allow for that
>     Bernard: what does reasonable test coverage encompass?
>     Dom: Basically enough to have confidence the implementation
>     reasonably matches what we've specified
>     TimP: should we consider non-browser implementations in this
>     context?
>     Dom: That may be something to consider independently of this
>     policy; but my personal sense is that this WG should focus on
>     browser interop, even though other implementation contexts are
>     a great secondary impact of this work
>     Jan-Ivar: +1
>     Youenn: what constitutes a commitment? in some cases, it may be
>     that we're willing to implement but with an unclear or moving
>     implementation roadmap
>     Dom: commitment could be a bug tracking link, a standards
>     position from an implementor
>     RESOLUTION: proceed with adopting new merge-guide guidance
>    [17]WebRTC Charter Renewal
>     [18][Slide 11]
>     Dom: any objection to proceeding with that plan?
>     RESOLUTION: Proceed with editorial renewal of the WebRTC WG
>     charter
>    [19]Media Capture Extensions
>      Issue [20]#145 Consider adding onVoiceActivity event on
>      MediaStreamTrack for audio
>     [21][Slide 15]
>     Jianjun: is there support for adding such an event?
>     Youenn: the "unmute microphone" hint is a great use case; the
>     media session API will help to fully mute the microphone, so
>     helping to expose that state to the end user is a great idea
>     … there is OS support for it already; so this is a valid use
>     case
>     … I'm not sure we can limit processing to only when the user is
>     actively speaking since it may be too late to do the processing
>     once the event has been received
>     Jan-Ivar: the two use cases may lead to different solutions; I
>     very strongly support the 1st use case, the 2nd may have
>     privacy implications
>     … I'm not even sure you need a constraint for that - maybe if
>     it's onerous for implementations to do that
>     Youenn: there might be use cases via audio stats, e.g. there is
>     an rtp header extension to expose voice activity as a boolean,
>     typically set by the encoder
>     … providing that in real-time may make sense, but not sure if
>     the event is the right approach, this will need more
>     experimenting
>     Bernard: in terms of privacy implications, determining
>     start/end of speech occurrences may reveal more information to
>     the app than the user would expect
>     … we could fire only "start" and limit it to a given interval
>     … for the "unmute hint" use case, you don't need to know how
>     long the person is speaking, and it doesn't need to be super
>     frequent
>     Jianjun: this works for the 1st use case, but wouldn't help for
>     the 2nd use case where you need to know when to stop and start
>     processing
>     Tony: would a simple boolean suffice to give confidence on when
>     to show the hint? e.g. clearly speaking vs maybe speech in the
>     background
>     Jianjun: not sure - today looking for support for having such
>     an API or not
>     TimP: we need to be clearer on whether this is voice detection
>     or audio levels - they come with different requirements
>     … the 1st one is specifically about voice, and the more
>     specific the better
>     … the 2nd one would want to work with singing, groups of
>     people, etc
>     … this feels like 2 APIs
>     Jianjun: I'm hearing the 2nd use case needs more thinking
>     Jan-Ivar: +1 on these being separate use cases that require
>     separate solutions
>     … The 1st one is a single event that fires when the microphone
>     is muted and there is voice activity
>     … The 2nd one might be too noisy as an event on the main
>     thread; it feels like metadata to attach to the audio, outside
>     of the main thread
>     Jianjun: I'm happy to focus on the 1st use case atm
>     Youenn: +1 to Jan-Ivar; 1st use case focused on muted track,
>     2nd on unmuted; the 2nd could be in audio stats, although it
>     may be more usefully exposed in the audio worklet
>     … in any case, separating them, doing the 1st one quickly and
>     taking more time on the audio processing optimization
>     Harald: you could already skip processing audio frames with
>     only zeros, although that itself introduces some overhead
>     Jianjun: I'm hearing we should focus on the 1st use case, and
>     look into how to solve the 2nd use case separately
>     RESOLUTION: proceed with a pull request for the 1st use case,
>     and open a separate issue for optimizing audio processing
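
One way to read Bernard's suggestion for the 1st use case (fire only on "start", only while the microphone is muted, and no more often than a given interval) is as a small gate in front of the notification. The sketch below is illustrative only: `VoiceActivityHintGate`, `shouldFire`, and the 5-second interval used in the usage lines are hypothetical names and values, not part of any spec or pull request.

```typescript
// Hypothetical helper: none of these names come from a spec or a shipped API.
class VoiceActivityHintGate {
  private lastFiredMs: number | null = null;
  private minIntervalMs: number;

  constructor(minIntervalMs: number) {
    this.minIntervalMs = minIntervalMs;
  }

  // Returns true when a "voice activity started" hint should be surfaced:
  // only while the track is muted, and at most once per minIntervalMs.
  shouldFire(muted: boolean, voiceDetected: boolean, nowMs: number): boolean {
    if (!muted || !voiceDetected) {
      return false;
    }
    if (this.lastFiredMs !== null && nowMs - this.lastFiredMs < this.minIntervalMs) {
      return false; // still inside the quiet interval after the last hint
    }
    this.lastFiredMs = nowMs;
    return true;
  }
}

// Usage: an assumed 5-second minimum interval between hints.
const hintGate = new VoiceActivityHintGate(5000);
const firstHint = hintGate.shouldFire(true, true, 0);       // muted + voice
const repeatedHint = hintGate.shouldFire(true, true, 1000); // too soon
```

Because the gate never reports when speech ends or how long it lasted, it exposes less than a start/end pair would, which is the privacy trade-off raised in the discussion.
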
>      Issue [22]#149 How to select camera presets that have better power
>      efficiency at the expense of quality?
>     [23][Slide 16]
>     [24][Slide 17]
>     Dom: note that afaict there are no implementations of
>     powerEfficientPixelFormat, so we could replace it altogether at
>     this stage without backwards compat
>     Bernard: would powerEfficient always include the power
>     efficient pixel format? if not, it may be useful to retain the
>     granularity
>     Harald: this reminds me of content-hint - you don't want to
>     specify exactly what to do, but pushing the UA in a certain
>     direction
>     … pixelFormat feels too constrained, a more general one to
>     replace it sounds like a good idea
>     … e.g. a powerQualityTradeoff constraint
>     Bernard: the pixelFormat feels different than the general power
>     efficiency
>     TimP: any reason the default should not be "powerEfficient"?
>     Youenn: this may break existing expectations e.g. for bar code
>     scanners, high quality podcast recordings
>     … it may be ideal to have that constraint default to true
>     eventually, but that may be hard to do in a backward compatible
>     manner, at least initially
>     … I'm hearing support for removing powerEfficientPixelFormat
>     and moving toward a more general powerEfficient constraint
>     Jianlin: if two pages request a camera, one with powerEfficient
>     and the other not, what would happen?
>     Youenn: e.g. the UA could re-size the video, with a bit of a
>     quality loss
>     … My thinking is "short usage, quality; longer, power
>     efficiency"
>     Jan-Ivar: impact of power efficiency on barcode scanning?
>     Youenn: e.g. the autofocus might be more finicky in power
>     efficient setting
>     Guido: we could start by adding a new constraint and reconsider
>     removal of pixelFormat later - we don't have to do that at the
>     same time
>     RESOLUTION: Proceed with a pull request to add a powerEfficient
>     constraint
>    [25]ICE improvements: send and prevent ICE check over a candidate pair
>     [26][Slide 20]
>     [27][Slide 21]
>     Bernard: is consent freshness included in your concept of ICE
>     check here?
>     … if so, it may raise security issues
>     Sameer: I'll try to address that in the discussion of the API
>     … the API I want to propose addresses connectivity checks,
>     keepalive, and RTT determination, while conserving bandwidth &
>     power
>     [28][Slide 22]
>     Bernard: re consent freshness: is it considered a triggered
>     check? it's good that responses can't be prevented, but can the
>     app prevent consent freshness checks from going out?
>     Sameer: don't know off the top of my head, but +1 on making
>     sure the app can't prevent it
>     Jan-Ivar: can you describe the use cases to expose the API to
>     JS?
>     Sameer: this is more detailed in the issue description on
>     github
>     … STUN checks are sent for keepalive on the selected pair every
>     2s or so, but only every 15s or so on non-selected pairs
>     … this API allows monitoring the quality of different candidate
>     pairs and possibly switch e.g. if the RTT is lower
>     … we've tried this out on mobile and seen very different RTT
>     over the life of a session, across different network interfaces
>     for instance
>     Jan-Ivar: there is no way for the UA to be doing this on its
>     own?
>     Sameer: there is no obligation for the UA to be monitoring
>     quality over ICE candidates
>     Jan-Ivar: could this be an IceTransport option instead though?
>     instead of lots of new API surface
>     Sameer: picking an alternative may be hard to specify as a
>     configuration (e.g. if you want to avoid relays due to latency)
>     Harald: the UA cannot know what options the application might
>     have available that are not immediately visible
>     … the purpose is to allow the app to control the use of
>     connections and figure out what information it can get
>     … the UA algorithms have to evolve slowly and work well in the
>     general case
>     … this API will allow experimentation and considerations of
>     things the UA cannot know
>     … it's a movement towards ensuring the app can do what it wants
>     to do while making the default behavior work well
>     Peter: re consent checks - I think it would be fine to disable
>     consent checks, this would only mean that it would stop the ICE
>     transport
>     … (i.e. it wouldn't be unsafe, but a bit of a footgun)
>     … in terms of the why, it would be very difficult for the UA to
>     know what to do or to express it as a configuration
>     [29][Slide 23]
>     [30][Slide 24]
>     Sameer: the event-based API doesn't tell you when a check was
>     sent if initiated by the ICE agent - the app may thus get a
>     failure if it tries to send a check too soon after the ICE agent
>     [31][Slide 25]
>     TimP: there seems to be a big difference in scope, in terms of
>     the context of what the app will see
>     … the promise carries a lot of the context the event doesn't
>     … e.g. whether the response comes from its own request or not -
>     that feels like useful information to have - maybe this could
>     be added to the event as well
>     Sameer: the app could determine whether they sent it - if they
>     got a failure trying to request a check, they can know the one
>     in flight comes from the agent
>     … since there can only be one check in flight for a given pair
>     … we could also expose the transaction id in the return value
>     of the request (since it is exposed in the event)
>     Jan-Ivar: RTCIceCandidatePair is now an interface - the check
>     method could be moved there
>     … what is the transactionId array buffer?
>     Sameer: it's the transaction id of the ICE check itself
>     Peter: it's 20 bytes in the STUN header that is useful for
>     debugging, it's unique to each check
>     Jan-Ivar: so the proposal is a check method that has the UA do
>     an ICE check on behalf of the app; could you cause network
>     spam by running this in a loop?
>     Sameer: rate limiting is called out; since the main use case is
>     to determine RTT / inactive pairs, 1s interval should be
>     sufficient
>     Jan-Ivar: I would go with the promise approach - the event
>     approach seems to be carrying a lot of state, which goes
>     against general API advice
>     Peter: what happens if an ICE check gets lost? is there a
>     timeout to determine it is no longer in flight?
>     Sameer: when a check is sent out, it's marked as in progress;
>     if it times out, it's marked as failed
>     Peter: it doesn't allow the app to decide what the timeout
>     should be; and we would want more than one check in flight in
>     general, as during establishment
>     … as long as the browser is exposing rate limits, this should
>     be OK
>     Jan-Ivar: re event API shape, maybe the values could be exposed
>     on the icecandidatepair instead of the event
>     Sameer: this would be similar to what we can get with getStats
>     - I think it's more useful to expose immediate feedback with
>     the response
>     Sameer: overall, I'm hearing support for the promises approach,
>     and investigating multiple in-flight checks at the same time
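
The constraints Sameer described (a single check in flight per candidate pair, and rate limiting on the order of 1s) can be sketched as the bookkeeping a promise-based check method might sit on top of. Everything here is an assumption made for illustration: `PairCheckTracker`, `tryStart`, `finish`, the string pair ids, and the 1000 ms interval are hypothetical and do not reflect the proposed API shape.

```typescript
// Hypothetical bookkeeping sketch; not the API proposed in the meeting.
type CheckDecision = "send" | "in-flight" | "rate-limited";

class PairCheckTracker {
  private inFlight = new Set<string>();
  private lastSentMs = new Map<string, number>();
  private minIntervalMs: number;

  constructor(minIntervalMs: number) {
    this.minIntervalMs = minIntervalMs;
  }

  // Decides whether an app-requested check on this pair may be sent now.
  tryStart(pairId: string, nowMs: number): CheckDecision {
    if (this.inFlight.has(pairId)) {
      return "in-flight"; // only one outstanding check per pair
    }
    const last = this.lastSentMs.get(pairId);
    if (last !== undefined && nowMs - last < this.minIntervalMs) {
      return "rate-limited";
    }
    this.inFlight.add(pairId);
    this.lastSentMs.set(pairId, nowMs);
    return "send";
  }

  // Called when the check gets a response or times out.
  finish(pairId: string): void {
    this.inFlight.delete(pairId);
  }
}

// Usage: an assumed 1-second minimum interval per pair.
const tracker = new PairCheckTracker(1000);
const firstCheck = tracker.tryStart("pair-a", 0);
const overlappingCheck = tracker.tryStart("pair-a", 100);
```

Allowing several checks in flight at once, as Peter suggested, would mean replacing the per-pair set with a counter or a map keyed by transaction id.
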
>    [32]WebRTC: PC.local_description and friends - snapshot views or
>    dynamic views?
>     [33][Slide 29]
>     RESOLUTION: no objection to proposal to align spec with
>     implementations
>    [34]Should devicechange fire when the device info changes?
>     [35][Slide 30]
>     [36][Slide 31]
>     TimP: how about most recently inserted device?
>     Jan-Ivar: for it to be a strong signal, it has to be recent
>     … also there can be several devices (e.g. a camera that sports
>     a mic)
>     … but we can bikeshed the name on the issue
>     … or are you thinking of a different logic?
>     TimP: mostly suggesting a way to simplify the explanation of
>     the behavior
>     Jan-Ivar: inserted may have connotations that are no longer
>     accurate for all devices, but has a nice intent aspect to it
>     Harald: it's a new feature - how can you detect it?
>     Jan-Ivar: you get an empty array if no devices were inserted
>     (vs no property if not implemented)
>     RESOLUTION: Continue discussion on PR towards merging
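
Jan-Ivar's answer on detection (an empty array when nothing was inserted, versus no property at all when the feature is not implemented) can be sketched as a three-way check. The minutes leave the property name open for bikeshedding, so the sketch takes it as a parameter; `classifyDeviceChange`, the `InsertionInfo` labels, and the placeholder name `"someProperty"` are hypothetical.

```typescript
// Hypothetical sketch: the property name is undecided, so it is passed in.
type InsertionInfo = "not-implemented" | "no-insertion" | "insertion";

function classifyDeviceChange(event: object, propertyName: string): InsertionInfo {
  if (!(propertyName in event)) {
    return "not-implemented"; // property absent: the UA lacks the feature
  }
  const value = (event as Record<string, unknown>)[propertyName];
  // Present but empty: the feature exists and no device was inserted.
  return Array.isArray(value) && value.length > 0 ? "insertion" : "no-insertion";
}

// Usage with plain objects standing in for devicechange events.
const legacyEvent = classifyDeviceChange({}, "someProperty");
const quietEvent = classifyDeviceChange({ someProperty: [] }, "someProperty");
```
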
> Summary of resolutions
>      1. [37]cancel July 16 meeting and schedule one on Aug 27
>      2. [38]proceed with adopting new merge-guide guidance
>      3. [39]Proceed with editorial renewal of the WebRTC WG charter
>      4. [40]proceed with a pull request for the 1st use case, and
>         open a separate issue for optimizing audio processing
>      5. [41]Proceed with a pull request to add a powerEfficient
>         constraint
>      6. [42]no objection to proposal to align spec with
>         implementations
>      7. [43]Continue discussion on PR towards merging
>      Minutes manually created (not a transcript), formatted by
>      [44]scribe.perl version 221 (Fri Jul 21 14:01:30 2023 UTC).

Received on Wednesday, 19 June 2024 07:48:21 UTC