[minutes] December 10 2024 meeting

Hi,

The minutes of our meeting held yesterday are available at:
   https://www.w3.org/2024/12/10-webrtc-minutes.html

and copied as text below.

Dom

                       WebRTC December 2024 meeting

10 December 2024

    [2]Agenda. [3]IRC log.

       [2] https://www.w3.org/2011/04/webrtc/wiki/December_10_2024
       [3] https://www.w3.org/2024/12/10-webrtc-irc

Attendees

    Present
           Bernard, Carine, Guido, Harald, Jan-Ivar, Peter,
           SunShin, TimP, Tove, Youenn

    Regrets
           -

    Chair
           Bernard, Guido, Jan-Ivar

    Scribe
           dom

Contents

     1. [4]Captured Surface Switching
          1. [5]Auto-pause
          2. [6][screen-share] Issue 308: Should screen capture
             tracks expose deviceId?
          3. [7]Back to Auto-pause
          4. [8]Cross-type surface switching
     2. [9]Timing info for Encoded Frames
     3. [10][webrtc] Issue 3014: Spec says to send black frames for
        ended tracks
     4. [11][mediacapture-extensions] Reduce scope of
        MediaStreamTrack transfer to DedicatedWorker for now
     5. [12]Issue 115: What is the expected timing of MSTP video
        frame enqueuing with other track events
     6. [13][mediacapture-output] Issue 147: Implicit consent via
        getUserMedia should allow access to non-miked speakers
     7. [14]Summary of resolutions

Meeting minutes

    Slideset: [15]https://docs.google.com/presentation/d/
    1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/
    edit#slide=id.g2bb12bc23cb_0_0 ([16]archived PDF copy)

      [15] 
https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0

   Captured Surface Switching

    [17][Slide 10]

      [17] 
https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#10

     [18]Auto-pause

      [18] 
https://github.com/w3c/mediacapture-screen-share-extensions/issues/15

    [19][Slide 12]

      [19] 
https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#12

    [20][Slide 13]

      [20] 
https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#13

    [21][Slide 14]

      [21] 
https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#14

    [22][Slide 15]

      [22] 
https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#15

    [23][Slide 16]

      [23] 
https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#16

    [24][Slide 17]

      [24] 
https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#17

    [25][Slide 18]

      [25] 
https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#18

    [26][Slide 19]

      [26] 
https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#19

    Youenn: re slide 17, let's take the example of a Web app wanting
    to pause in order to ask the user to confirm their intent to
    share the content
    … you want to pause sending to remote parties, but want to keep
    the local view of the shared content
    … you need different behaviors for the two

    Tove: with the callback, you can do the same thing as with the
    event

    Youenn: but that shows that freezing the video would be
    hack-ish in both situations
    … in a PC, you can use setParameters or replaceTrack() to
    handle this
    … if you're in a call with a local preview with a track in the
    worker sent to the remote party
    … you want to stop sending the track in the worker
    … the guarantee you get from the event makes it a better API;
    you may not even need to pause the track
    … I prefer the event, but would be OK with adding the callback;
    I'm not sure about having it async, though maybe the cropTarget
    use case makes it compelling
    … but would want to understand it better
    … otherwise would prefer a sync callback
    … in terms of ordering, it would be better to first fire the
    event (ensuring the settings are updated) and then the callback
    can access accurate information
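    As an illustration of the pattern Youenn describes (a sketch,
    not part of the minutes' resolutions): pausing what is sent to
    remote parties while keeping the local preview live can be done
    by swapping the sender's track out with replaceTrack(). The
    `pauseRemoteSend` helper below is a hypothetical name; it only
    assumes an RTCRtpSender-like object.

```javascript
// Hypothetical helper sketching the point above: pause sending to the
// remote peer without touching the local preview. `sender` is assumed
// to be RTCRtpSender-like (it has `track` and `replaceTrack()`).
async function pauseRemoteSend(sender) {
  const original = sender.track;              // keep a reference to resume with
  await sender.replaceTrack(null);            // stop sending frames to the peer
  return () => sender.replaceTrack(original); // call to resume sending
}
```

    The local video element keeps rendering the track, since only
    the sender side is detached.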

    Jan-Ivar: the UI use case can be solved with either API; the
    only need for async would be for replaceTrack or cropTo
    … configurationchange exists today, mostly to adapt consumers of
    a track to its properties, with throttling to avoid correlating
    across origins
    … proposal 1 feels simpler
    … also deviceId is not currently specified, so it wouldn't be
    exposed via configurationchange at the moment
    … also the event may not be clear about its causes, which
    might create unexpected bugs for developers who wouldn't handle
    all situations

    Tove: the configurationchange event is only for a few cases
    today

    Youenn: the event should be fired whenever the settings are
    changed without involvement from the application
    … e.g. when the user agent or the OS enables blur - the app
    should be able to keep track of that
    … this matches what happens when a surface changes (outside of
    the Web app control)
    … re fuzzing, I don't think it's useful in this context - we
    should remove the related note
    … when you're processing in the worker, having the callback in
    `window` makes it painful since you need to `postMessage` back
    and forth
    … if we agree with `deviceId` and ordering, it seems cheap to
    support the `configurationchange`

    Tove: so you're supporting both approaches?

    Youenn: I'm ok with it - but still unsure about the callback
    being async (need to discuss more the cropTarget use case)
    … I'm reluctant to add a new state where frames aren't
    emitted outside of `track.enabled` - but would consider it if
    necessary

    Jan-Ivar: at the moment, screen capture tracks don't expose
    deviceIds
    … if we decide later today to expose it, this would require
    firing `configurationchange`
    … we're set to discuss it later on this agenda

    Youenn: `deviceId` would be a small addition that would help
    with detecting surface change - we will indeed discuss it later

    Tove: so if we conclude in favor of adding it, we would go with
    both?

    Youenn: if we do need an async callback, having just the
    callback would be fine

    Jan-Ivar: I'm not fond of having both as they feel redundant

    Youenn: having both (if we go in the right order) might queue
    two or three tasks when dealing with multiple tracks, but the
    resulting delay shouldn't be meaningful
    … (this doesn't apply in the worker case since no
    synchronization is needed in that case)
    … is there agreement about firing the event before the callback
    or is that too early to decide?
    … today, you already get an event when switching from screen to
    window

    Tove: the spec isn't very clear on the circumstances of when
    the configurationchange event should fire

    Jan-Ivar: I think the algorithm refers explicitly to
    capabilities and settings
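    For illustration, a hedged sketch of the ordering discussed
    above: it assumes the user agent updates the track's settings
    before firing `configurationchange`, and that screen-capture
    tracks expose a `deviceId` setting (the proposal under
    discussion, not shipped behavior). `watchSurfaceChanges` is a
    hypothetical helper name.

```javascript
// Hypothetical sketch, assuming settings are updated *before*
// `configurationchange` fires (the ordering discussed above) and that
// screen-capture tracks expose `deviceId` (proposal, not shipped).
// `track` only needs to be an EventTarget with getSettings().
function watchSurfaceChanges(track, onSurfaceChange) {
  let lastDeviceId = track.getSettings().deviceId;
  track.addEventListener("configurationchange", () => {
    const { deviceId } = track.getSettings();
    if (deviceId !== lastDeviceId) {
      lastDeviceId = deviceId;   // the captured surface switched
      onSurfaceChange(deviceId); // e.g. update the UI, re-evaluate cropTo()
    }
  });
}
```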

     [27][screen-share] Issue 308: Should screen capture tracks expose
     deviceId?

      [27] https://github.com/w3c/mediacapture-screen-share/issues/308

    [28][Slide 33]

      [28] 
https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#33

    Youenn: not a huge use case, but low complexity and no specific
    drawback that I could identify

    Jan-Ivar: my main concern is that `deviceId` doesn't match how
    it is used in other specs

    Youenn: `deviceId` is identifying the source

    Jan-Ivar: the chapterizing use case would be better done with
    labels from my perspective; it makes assumptions about how one
    would go about this
    … I think that leaves configurationchange as the main use case

    Guido: the fact that `deviceId` doesn't change for a camera
    doesn't invalidate the idea of using it for a possibly changing
    source
    … I like the idea of exposing it and signaling its change with
    configurationchange

    dom: exposing `deviceId` sounds reasonable to me too

    Jan-Ivar: ok, since I don't have a strong reason to object to
    it, I think we probably have consensus on adding it

    RESOLUTION: Consensus to add `deviceId` to settings of a track

    Youenn: I'll check how this impacts canvas-sourced tracks
    (and how they behave today)

     [29]Back to Auto-pause

      [29] 
https://github.com/w3c/mediacapture-screen-share-extensions/issues/15

    Tove: so Youenn you asked about the async callback use case
    with regard to cropTarget?

    Youenn: is the use case that you're cropTargetting in a VC, you
    switch surface and then you want to pause both local/remote
    tracks before deciding whether to re-crop?

    Tove: imagine switching to a tab with slides and speaker notes;
    you'll want cropTo to resolve before sending frames with the
    additional content

    Youenn: in that use case, there is no local preview? if so, why
    not set `track.enabled` to false or stop the track on the PC?

    Tove: there may be use cases where this would also apply to a
    local track - keeping the two tracks in sync as part of the
    callback feels cleaner

    Jan-Ivar: I'm a fan of simplicity - if we have the
    configurationchange event, I would prefer to only have the
    event API, not also the callback

    Guido: I think we can start with `deviceId` and if additional
    use cases show that having both would be beneficial, we can
    revisit it

    RESOLUTION: proceed with the configurationchange event as the
    way to handle surface switching

     [30]Cross-type surface switching

      [30] 
https://github.com/w3c/mediacapture-screen-share-extensions/issues/16

    [31][Slide 20]

      [31] 
https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#20

    Tove: this would be a hint more than a hard requirement

    Jan-Ivar: what is the use case for not allowing surface
    switching, beyond backwards compat?

    Tove: it's only for backwards compat

    Jan-Ivar: if so, I would rather not add an API just for that

    Tove: if an app expects only a tab, it may not be prepared to
    update its UI (e.g. cropTarget is no longer possible)

    Jan-Ivar: cropTo would fail, which is probably what you want

    Youenn: I would look at what OSes do; on macOS, it's the OS
    controlling the UI to pick surfaces (not the UA)
    … I haven't checked if there is an option to do that on macOS -
    if it's not doable at the OS level, then it won't be
    implementable there, in which case I would be reluctant

    Tove: this is exposed on the macOS API

    Youenn: OK - if it's implementable, and if native apps find it
    useful, that suggests this is a useful API to consider
    … it would definitely need to be a hint, and make it possible
    for the UA/user to override
    … I would use include as the default

    Tove: I'm hearing stronger examples needed for "exclude"

    Jan-Ivar: if it's a hint, maybe backwards compat doesn't need
    to be standardized

    Harald: we had a similar situation with the Plan B transition

    Harald: so the conclusion is that cross-type surface switching
    is always on?

    Youenn: I'll look into the use cases that supported the macOS
    API and whether they justify a Web API

   [32]Timing info for Encoded Frames

      [32] https://github.com/w3c/webrtc-encoded-transform/issues/235

    [33][Slide 23]

      [33] 
https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#23

    [34][Slide 24]

      [34] 
https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#24

    [`captureTimestamp` in the slide should be `captureTime`]

    [35][Slide 25]

      [35] 
https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#25

    Youenn: maybe we could use a single dictionary to have a single
    definition

    Guido: any opposition to adding receiveTime to encoded frames?
    … receiveTime would be defined as the timestamp at which the
    last packet necessary to build the video frame was received

    Jan-Ivar: are there cases where the said frame wouldn't be
    transmitted?
    … e.g. with WebCodecs + RTCEncodedSource?

    Guido: the time it reaches the receiver

    Harald: things that aren't received shouldn't have a
    receiveTime

    Bernard: this is a good proposal, and I like having it both in
    WebCodecs and in EncodedFrame

    Youenn: not sure I'm convinced with WebCodecs, but will discuss
    on github

    RESOLUTION: proceed with adding receiveTime to Encoded*
    Metadata
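    To make the resolution concrete, a sketch of where such a
    `receiveTime` field would surface on the receiver side
    (assumptions: `receiveTime` is added to the encoded frame
    metadata dictionary as resolved above, which is not yet shipped
    behavior, and `makeDelayMeasuringTransform` is a hypothetical
    name):

```javascript
// Hedged sketch: a receiver-side encoded transform measuring per-frame
// capture-to-receive delay. Assumes the proposed `receiveTime` field
// (not shipped) alongside the existing `captureTime` in getMetadata().
function makeDelayMeasuringTransform(onDelay) {
  return new TransformStream({
    transform(encodedFrame, controller) {
      const { receiveTime, captureTime } = encodedFrame.getMetadata();
      if (receiveTime !== undefined && captureTime !== undefined) {
        onDelay(receiveTime - captureTime); // network + jitter-buffer delay
      }
      controller.enqueue(encodedFrame);     // pass the frame through unmodified
    },
  });
}
```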

   [36][webrtc] Issue 3014: Spec says to send black frames for ended
   tracks

      [36] https://github.com/w3c/webrtc-pc/issues/3014

    [37][Slide 29]

      [37] 
https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#29

    Youenn: I would align the spec with what implementations are
    doing; it may not be the best, but it's the current status and
    apps seem fine with it
    … it would be good to create consistency for the last point -
    maybe raise a separate issue

    RESOLUTION: Proceed with aligning with current implementations

   [38][mediacapture-extensions] Reduce scope of MediaStreamTrack
   transfer to DedicatedWorker for now

      [38] https://github.com/w3c/mediacapture-extensions/issues/158

    [39][Slide 30]

      [39] 
https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#30

    Youenn: the VideoFrame object is only exposed in Window and
    DedicatedWorker, likewise for the MSTP
    … it makes sense to reduce this to DedicatedWorker
    … Safari is only implementing transfer to DedicatedWorker in
    the same cluster
    … (same for VideoFrame)
    … not sure if that needs to be spelt out

    Jan-Ivar: we could look into that and see if there are
    precedents for restricting to a cluster

    RESOLUTION: Proceed with reducing scope of MediaStreamTrack
    transfer to DedicatedWorker

   [40]Issue 115: What is the expected timing of MSTP video frame
   enqueuing with other track events

      [40] https://github.com/w3c/mediacapture-transform/issues/115

    [41][Slide 31]

      [41] 
https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#31

    [42][Slide 32]

      [42] 
https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#32

    Bernard: regarding the last one, I wouldn't try synchronizing
    getSettings and videoframe properties

    Youenn: that would require exposing all the settings in the
    video frame

    Bernard: maybe not all - e.g. facingMode wouldn't be useful;
    but some we're already discussing exposing (e.g. blur, width
    and height)

    Jan-Ivar: 2 questions here: timing, and what to expose in
    videoframe metadata
    … focusing on timing for now
    … it's good to specify things, but I'm not sure how many
    synchronous guarantees we can provide given the buffering
    … coupling things too tightly might make it challenging to add
    transform
    … the buffer may lead us to handle differently e.g. the mute /
    unmute events

    Youenn: I'm talking specifically about enqueuing frames, not
    reading frames from the stream
    … because we're enqueuing a task, this provides an opportunity
    to synchronize things
    … if we specify unmute, it seems logical to specify the mute
    situation as well, symmetrically
    … I agree with Bernard's point about getSettings; I'm not sure
    how to handle applyConstraints or configurationChange

    Jan-Ivar: how would you handle buffers, e.g. maxBufferSize = 10?

    Youenn: this shouldn't have any impact

    Guido: this should be dealt with at the sink level (where
    buffering happens)
    … not sure it needs to have something specific for MSTP - it
    mostly deals with things outside the track

    Youenn: one model we could consider: mute/unmute events are
    always sent in the same thread; we enqueue tasks for the unmute
    event
    … for the rest, we consider sources single-threaded

    Harald: VideoFrames are observable on an HTML video element
    … if we get a mute event, and we get a frame that was sent
    after the mute event, that's clearly an error - we should
    prevent it
    … we can say something like the events have to act as if they
    were ordered, as if they came from the same event source

    Jan-Ivar: I would like to understand better why this is a
    problem
    … focusing on the event might be the wrong approach
    … e.g. muting happens in the background, and is only surfaced
    in JS later
    … you could probably write a WPT with VideoTrackGenerator
    … my only worry is to make things too synchronous
    … I'm not sure there is an overall solution, we should look at
    each case

    Youenn: agree about not dealing with getSettings and
    videoframes
    … I can try and detail more situations on github

    Jan-Ivar: shouldn't it apply to track.enabled as well?

    Youenn: I should look into it

    Bernard: re configurationchange and applyConstraints - are we
    clear on the effect of these events on videoframe properties?
    … otherwise, I agree with Jan-Ivar on the risk of being too
    prescriptive
    … we shouldn't make this based on the timing of the events, but
    instead base it on videoframe properties

    Youenn: ok, so more discussion is needed on github
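    To ground the ordering question being deferred to GitHub, a
    minimal sketch under stated assumptions: it relies only on the
    generic mediacapture-transform shapes (a processor exposing
    `readable`, a stream of frames, and a track firing `mute`);
    `readUntilMuted` is a hypothetical helper, not a proposed
    resolution.

```javascript
// Illustrative sketch of the timing question above, not a proposed
// resolution. The open question is whether a frame can still be
// dequeued after `mute` has been observed on the same thread.
// `processor` is assumed MediaStreamTrackProcessor-like (`readable` is
// a ReadableStream of frames); `track` is any EventTarget firing `mute`.
async function readUntilMuted(processor, track) {
  const reader = processor.readable.getReader();
  let muted = false;
  track.addEventListener("mute", () => { muted = true; });
  const frames = [];
  while (!muted) {                        // stop once mute has been observed
    const { value, done } = await reader.read();
    if (done) break;                      // stream ended
    frames.push(value);
  }
  reader.releaseLock();
  return frames;
}
```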

   [43][mediacapture-output] Issue 147: Implicit consent via getUserMedia
   should allow access to non-miked speakers

      [43] https://github.com/w3c/mediacapture-output/issues/147

    [44][Slide 35]

      [44] 
https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#35

    [45][Slide 36]

      [45] 
https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#36

    Jan-Ivar: my concern is that this might be blocking on [46]w3c/
    mediacapture-main#1019
    … not sure we should proceed with this before we get clarity on
    that one
    … if we maintain the existing spec requirement, this might make
    sense

      [46] https://github.com/w3c/mediacapture-main/issues/1019

    Youenn: we could disentangle the two by noting you need
    microphone access for exposing speakers (but we should make
    progress on the other one)

    Guido: [47]w3c/mediacapture-main#1019 is orthogonal to this one
    since we're already exposing speakers - the discussion is which
    speakers we expose
    … when enumeration is allowed is orthogonal

      [47] https://github.com/w3c/mediacapture-main/issues/1019

    Youenn: does Chrome expose speakers without capture?

    Guido: it gates enumeration on permission

    Youenn: would you be able to align with the spec for speakers?

    Guido: the problem is Web compatibility
    … implementing the spec was not web compatible, so we had to
    roll back, hence [48]w3c/mediacapture-main#1019

      [48] https://github.com/w3c/mediacapture-main/issues/1019

    Jan-Ivar: the two situations are connected since the decision
    on this may lead to different implementations across browsers

    Bernard: I agree with Guido they're orthogonal
    … would like to support exposing all speakers

    Jan-Ivar: I would object to proceeding before [49]w3c/
    mediacapture-main#1019 is resolved

      [49] https://github.com/w3c/mediacapture-main/issues/1019

    Youenn: let's try to discuss it at the next meeting then

Summary of resolutions

     1. [50]Consensus to add `deviceId` to settings of a track
     2. [51]proceed with the configurationchange event as the way
        to handle surface switching
     3. [52]proceed with adding receiveTime to Encoded* Metadata
     4. [53]Proceed with aligning with current implementations
     5. [54]Proceed with reducing scope of MediaStreamTrack
        transfer to DedicatedWorker

Received on Wednesday, 11 December 2024 07:52:54 UTC