- From: Dominique Hazael-Massieux <dom@w3.org>
- Date: Wed, 11 Dec 2024 08:52:52 +0100
- To: "public-webrtc@w3.org" <public-webrtc@w3.org>
Hi,
The minutes of our meeting held yesterday are available at:
https://www.w3.org/2024/12/10-webrtc-minutes.html
and copied as text below.
Dom
WebRTC December 2024 meeting
10 December 2024
[2]Agenda. [3]IRC log.
[2] https://www.w3.org/2011/04/webrtc/wiki/December_10_2024
[3] https://www.w3.org/2024/12/10-webrtc-irc
Attendees
Present
Bernard, Carine, Guido, Harald, Jan-Ivar, Peter,
SunShin, TimP, Tove, Youenn
Regrets
-
Chair
Bernard, Guido, Jan-Ivar
Scribe
dom
Contents
1. [4]Captured Surface Switching
1. [5]Auto-pause
2. [6][screen-share] Issue 308: Should screen capture
tracks expose deviceId?
3. [7]Back to Auto-pause
4. [8]Cross-type surface switching
2. [9]Timing info for Encoded Frames
3. [10][webrtc] Issue 3014: Spec says to send black frames for
ended tracks
4. [11][mediacapture-extensions] Reduce scope of
MediaStreamTrack transfer to DedicatedWorker for now
5. [12]Issue 115: What is the expected timing of MSTP video
frame enqueuing with other track events
6. [13][mediacapture-output] Issue 147: Implicit consent via
getUserMedia should allow access to non-miked speakers
7. [14]Summary of resolutions
Meeting minutes
Slideset: [15]https://docs.google.com/presentation/d/
1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/
edit#slide=id.g2bb12bc23cb_0_0 ([16]archived PDF copy)
[15]
https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0
Captured Surface Switching
[17][Slide 10]
[17]
https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#10
[18]Auto-pause
[18]
https://github.com/w3c/mediacapture-screen-share-extensions/issues/15
[19][Slide 12]
[19]
https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#12
[20][Slide 13]
[20]
https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#13
[21][Slide 14]
[21]
https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#14
[22][Slide 15]
[22]
https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#15
[23][Slide 16]
[23]
https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#16
[24][Slide 17]
[24]
https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#17
[25][Slide 18]
[25]
https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#18
[26][Slide 19]
[26]
https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#19
Youenn: re slide 17, let's take the example of a Web app that
wants to pause in order to ask the user to confirm their intent
to share the content
… you want to pause sending to remote parties, but want to keep
the local view of the shared content
… you need different behaviors for the two
Tove: with the callback, you can do the same thing as with the
event
Youenn: but that shows that freezing the video would be
hack-ish in both situations
… in a PC, you can use setParameters() or replaceTrack() to
handle this
… if you're in a call with a local preview with a track in the
worker sent to the remote party
… you want to stop sending the track in the worker
… the guarantee you get from the event makes it a better API;
you may not even need to pause the track
… I prefer the event; I would be OK with adding the callback,
though I'm not sure about having it async - maybe the cropTarget
case makes it compelling
… but I would want to understand it better
… otherwise I would prefer a sync callback
… in terms of ordering, it would be better to first fire the
event (ensuring the settings are updated) and then the callback
can access accurate information
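[A minimal sketch of the pattern described above: pause delivery
to the remote party while the local preview keeps consuming the
same capture track. The function names are illustrative; `sender`
stands for the RTCRtpSender carrying the captured track.]
    async function pauseRemoteSending(sender) {
      // Either detach the track from the sender...
      await sender.replaceTrack(null);
      // ...or keep it attached but deactivate its encodings:
      // const params = sender.getParameters();
      // params.encodings.forEach((e) => { e.active = false; });
      // await sender.setParameters(params);
    }
    async function resumeRemoteSending(sender, track) {
      await sender.replaceTrack(track);
    }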
Jan-Ivar: the UI use case can be solved with either API; the
only need for async would be for replaceTrack or cropTo
… configurationchange exists today, mostly to adapt the consumer
of a track to its properties, with throttling to avoid
correlation across origins
… proposal 1 feels simpler
… also deviceId is not currently specified, so it wouldn't be
exposed via configurationchange at the moment
… also the event may not be clear about its causes, which might
create unexpected bugs for developers who don't handle all
situations
Tove: the configurationchange event is only for a few cases
today
Youenn: the event should be fired whenever the settings are
changed without involvement from the application
… e.g. when the user agent or the OS enables blur - the app
should be able to keep track of that
… this matches what happens when a surface changes (outside of
the Web app control)
… re fuzzing, I don't think it's useful in this context - we
should remove the related note
… when you're processing in the worker, having the callback in
`window` makes it painful since you need to `postMessage` back
and forth
… if we agree with `deviceId` and ordering, it seems cheap to
support the `configurationchange`
Tove: so you're supporting both approaches?
Youenn: I'm ok with it - but still unsure about the callback
being async (need to discuss more the cropTarget use case)
… I'm reluctant to add a new state where frames aren't emitted
outside of `track.enabled` - but would consider it if necessary
Jan-Ivar: at the moment, screen capture tracks don't expose
deviceIds
… if we decide later today to expose it, this would require
firing `configurationchange`
… we're set to discuss it later on this agenda
Youenn: `deviceId` would be a small addition that would help
with detecting surface change - we will indeed discuss it later
Tove: so if we conclude to add it, we would go with both?
Youenn: if we do need an async callback, having just the
callback would be fine
Jan-Ivar: I'm not fond of having both as they feel redundant
Youenn: having both (if we go in the right order) might queue
two or three tasks when dealing with multiple tracks, but the
resulting delay shouldn't be meaningful
… (this doesn't apply in the worker case since no
synchronization is needed in that case)
… is there agreement about firing the event before the callback
or is that too early to decide?
… today, you already get an event when switching from screen to
window
Tove: the spec isn't very clear on the circumstances of when
the configurationchange event should fire
Jan-Ivar: I think the algorithm refers explicitly to
capabilities and settings
[27][screen-share] Issue 308: Should screen capture tracks expose
deviceId?
[27] https://github.com/w3c/mediacapture-screen-share/issues/308
[28][Slide 33]
[28]
https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#33
Youenn: not a huge use case, but low complexity and no specific
drawback that I could identify
jan-ivar: my main concern is that `deviceId` doesn't match how
it is used in other specs
Youenn: `deviceId` is identifying the source
jan-ivar: the chapterizing use case would be better done with
labels from my perspective; it makes assumptions about how one
would go about this
… I think that leaves configurationchange as the main use case
Guido: the fact that `deviceId` doesn't change for a camera
doesn't invalidate the idea of using it for a possibly changing
source
… I like the idea of exposing it and signaling its change with
configurationchange
dom: exposing `deviceId` sounds reasonable to me too
Jan-Ivar: ok, since I don't have a strong reason to object to
it, I think we probably have consensus on adding it
RESOLUTION: Consensus to add `deviceId` to settings of a track
Youenn: I'll check how this impacts canvas-sourced tracks (and
how they behave today)
[29]Back to Auto-pause
[29]
https://github.com/w3c/mediacapture-screen-share-extensions/issues/15
Tove: so Youenn you asked about the async callback use case
with regard to cropTarget?
Youenn: is the use case that you're cropTargetting in a VC, you
switch surface and then you want to pause both local/remote
tracks before deciding whether to re-crop?
Tove: imagine switching to a tab with slides and speaker notes,
you'll want to cropTo to resolve before sending frames with the
additional content
Youenn: in that use case, there is no local preview? if so, why
not set `track.enabled` to false or stop the track on the PC?
Tove: there may be use cases where this would also apply to a
local track - keeping the two tracks in sync as part of the
callback feels cleaner
Jan-Ivar: I'm a fan of simplicity - if we have the
configurationchange event, I would prefer to only have the
event API, not also the callback
Guido: I think we can start with `deviceId` and if additional
use cases show that having both would be beneficial, we can
revisit it
RESOLUTION: proceed with configurationchange event as the way
to manage surface switching handling
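[A sketch of the event-based approach just resolved, assuming
both resolutions from this meeting: `deviceId` in the settings of
a screen-capture track and `configurationchange` firing on a
surface switch. Neither is in a published spec yet;
`adaptToNewSurface` is a hypothetical app-specific callback.]
    function watchSurfaceSwitch(track, sender) {
      let lastDeviceId = track.getSettings().deviceId;
      track.addEventListener("configurationchange", async () => {
        const settings = track.getSettings();
        if (settings.deviceId === lastDeviceId) return; // other setting changed
        lastDeviceId = settings.deviceId;
        await sender.replaceTrack(null);   // stop sending the new surface
        await adaptToNewSurface(settings); // update UI, re-crop, etc.
        await sender.replaceTrack(track);  // resume once ready
      });
    }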
[30]Cross-type surface switching
[30]
https://github.com/w3c/mediacapture-screen-share-extensions/issues/16
[31][Slide 20]
[31]
https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#20
Tove: this would be a hint more than a hard requirement
Jan-Ivar: what is the use case for not allowing surface
switching, beyond backwards compat?
Tove: it's only for backwards compat
Jan-Ivar: if so, I would rather not add an API just for that
Tove: if an app expects only a tab, it may not expect to need to
update its UI (e.g. cropTarget is no longer possible)
Jan-Ivar: cropTo would fail, which is probably what you want
Youenn: I would look at what OSes do; on macOS, it's the OS
controlling the UI to pick surfaces (not the UA)
… I haven't checked if there is an option to do that on macOS -
if it's not doable at the OS level, then it won't be
implementable there, in which case I would be reluctant
Tove: this is exposed on the macOS API
Youenn: OK - if it's implementable, and if native apps find it
useful, that suggests this is a useful API to consider
… it would definitely need to be a hint, and make it possible
for the UA/user to override
… I would use include as the default
Tove: I'm hearing that stronger examples are needed for "exclude"
Jan-Ivar: if it's a hint, maybe backwards compat doesn't need
to be standardized
Harald: we had a similar situation with the plan B transition
Harald: so the conclusion is that cross-type surface switching
is always on?
Youenn: I'll look into the use cases that supported the macOS
API and whether they justify a Web API
[32]Timing info for Encoded Frames
[32] https://github.com/w3c/webrtc-encoded-transform/issues/235
[33][Slide 23]
[33]
https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#23
[34][Slide 24]
[34]
https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#24
[`captureTimestamp` in the slide should be `captureTime`]
[35][Slide 25]
[35]
https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#25
Youenn: maybe we could use a single dictionary to have a single
definition
Guido: any opposition to adding receiveTime to encoded frames?
… receiveTime would be defined as the timestamp of the last
packet necessary to build the video frame once received
Jan-Ivar: are there cases where the said frame wouldn't be
transmitted?
… e.g. with WebCodecs + RTCEncodedSource?
Guido: the time it reaches the receiver
Harald: things that aren't received shouldn't have a
receiveTime
Bernard: this is a good proposal, and I like having it both in
WebCodecs and in EncodedFrame
Youenn: not sure I'm convinced with WebCodecs, but will discuss
on github
RESOLUTION: proceed with adding receiveTime to Encoded*
Metadata
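[A sketch of reading the new field from an encoded-transform
worker. `receiveTime` is the metadata member named in the
resolution above; it is not yet in the published
webrtc-encoded-transform spec, so treat it as potentially
undefined.]
    // worker script attached via new RTCRtpScriptTransform(worker, ...)
    onrtctransform = (event) => {
      const { readable, writable } = event.transformer;
      readable
        .pipeThrough(new TransformStream({
          transform(frame, controller) {
            const meta = frame.getMetadata();
            if (meta.receiveTime !== undefined) {
              // timestamp of the last packet needed to rebuild this frame
              console.log("receiveTime", meta.receiveTime);
            }
            controller.enqueue(frame);
          },
        }))
        .pipeTo(writable);
    };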
[36][webrtc] Issue 3014: Spec says to send black frames for ended
tracks
[36] https://github.com/w3c/webrtc-pc/issues/3014
[37][Slide 29]
[37]
https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#29
Youenn: I would align the spec with what implementations are
doing; it may not be the best, but it's the current status and
apps seem fine with it
… it would be good to create consistency for the last point -
maybe raise a separate issue
RESOLUTION: Proceed with aligning with current implementations
[38][mediacapture-extensions] Reduce scope of MediaStreamTrack
transfer to DedicatedWorker for now
[38] https://github.com/w3c/mediacapture-extensions/issues/158
[39][Slide 30]
[39]
https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#30
Youenn: the VideoFrame object is only exposed in Window and
DedicatedWorker, likewise for the MSTP
… it makes sense to reduce this to DedicatedWorker
… Safari is only implementing transfer to DedicatedWorker in
the same cluster
… (same for VideoFrame)
… not sure if that needs to be spelt out
Jan-Ivar: we could look into that and whether there are
precedents for restricting to a cluster
RESOLUTION: Proceed with reducing scope of MediaStreamTrack
transfer to DedicatedWorker
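[A sketch of the transfer being scoped here: hand a capture track
to a dedicated worker so frames can be processed there without
per-frame postMessage. File names are illustrative.]
    // main.js (module script)
    const worker = new Worker("video-worker.js");
    const stream = await navigator.mediaDevices.getUserMedia({ video: true });
    const [track] = stream.getVideoTracks();
    worker.postMessage({ track }, [track]); // transferred, not cloned

    // video-worker.js
    onmessage = ({ data: { track } }) => {
      const processor = new MediaStreamTrackProcessor({ track });
      processor.readable.pipeTo(new WritableStream({
        write(frame) { /* process the VideoFrame */ frame.close(); },
      }));
    };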
[40]Issue 115: What is the expected timing of MSTP video frame
enqueuing with other track events
[40] https://github.com/w3c/mediacapture-transform/issues/115
[41][Slide 31]
[41]
https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#31
[42][Slide 32]
[42]
https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#32
Bernard: regarding the last one, I wouldn't try synchronizing
getSettings and videoframe properties
Youenn: that would require exposing all the settings in the
video frame
Bernard: maybe not all - e.g. facingMode wouldn't be useful;
but some we're already discussing exposing (e.g. blur, width
and height)
Jan-Ivar: 2 questions here: timing, and what to expose in
videoframe metadata
… focusing on timing for now
… it's good to specify things, but I'm not sure how many
synchronous guarantees we can provide given the buffering
… coupling things too tightly might make it challenging to add
transform
… the buffer may lead us to handle e.g. the mute / unmute events
differently
Youenn: I'm talking specifically about enqueuing frames, not
reading frames from the stream
… because we're enqueuing a task, this provides an opportunity
to synchronize things
… if we specify unmute, it seems logical to specify the mute
situation as well, symmetrically
… I agree with Bernard's point about getSettings; I'm not sure
how to handle applyConstraints or configurationChange
Jan-Ivar: how would we handle buffers, e.g. maxBufferSize = 10?
Youenn: this shouldn't have any impact
Guido: this should be dealt with at the sink level (where
buffering happens)
… not sure it needs to have something specific for MSTP - it
mostly deals with things outside the track
Youenn: one model we could consider: mute/unmute events are
always sent in the same thread; we enqueue tasks for the unmute
event
… for the rest, we consider sources single-threaded
Harald: VideoFrames are observable on an HTML video element
… if we get a mute event, and we get a frame that was sent
after the mute event, that's clearly an error - we should
prevent it
… we can say something like the events have to act as if they
were ordered, as if they came from the same event source
Jan-Ivar: I would like to understand better why this is a
problem
… focusing on the event might be the wrong approach
… e.g. muting happens in the background, and is only surfaced
in JS later
… you could probably write a WPT with VideoTrackGenerator
… my only worry is to make things too synchronous
… I'm not sure there is an overall solution, we should look at
each case
Youenn: agree about not dealing with getSettings and
videoframes
… I can try and detail more situations on github
Jan-Ivar: shouldn't it apply to track.enabled as well?
Youenn: I should look into it
Bernard: re configurationchange and applyConstraints - are we
clear on the effect of these events on videoframe properties?
… otherwise, I agree with Jan-Ivar on the risk of being too
prescriptive
… we shouldn't make this based on the timing of the events, but
instead base it on videoframe properties
Youenn: ok, so more discussion is needed on github
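[A worker-side sketch that makes the ordering question concrete,
along the lines of the WPT idea mentioned above: under the
"single event source" ordering model Harald describes, the
warning below should never fire, but that guarantee is exactly
what remains to be specified.]
    function observeOrdering(track) {
      let muted = track.muted;
      track.onmute = () => { muted = true; };
      track.onunmute = () => { muted = false; };
      const processor = new MediaStreamTrackProcessor({ track });
      processor.readable.pipeTo(new WritableStream({
        write(frame) {
          if (muted) {
            console.warn("frame enqueued after mute was observed");
          }
          frame.close();
        },
      }));
    }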
[43][mediacapture-output] Issue 147: Implicit consent via getUserMedia
should allow access to non-miked speakers
[43] https://github.com/w3c/mediacapture-output/issues/147
[44][Slide 35]
[44]
https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#35
[45][Slide 36]
[45]
https://docs.google.com/presentation/d/1wk-0WiD1aTqcQUYDiNJcP-XQ421KCbY6A1xX_-MKy_A/edit#slide=id.g2bb12bc23cb_0_0#36
Jan-Ivar: my concern is that this might be blocking on [46]w3c/
mediacapture-main#1019
… not sure we should proceed with this before we get clarity on
that one
… if we maintain the existing spec requirement, this might make
sense
[46] https://github.com/w3c/mediacapture-main/issues/1019
Youenn: we could disentangle the two by noting that you need
microphone access for exposing speakers (but we should make
progress on the other one)
Guido: [47]w3c/mediacapture-main#1019 is orthogonal to this one
since we're already exposing speakers - the discussion is which
speakers we expose
… when enumeration is allowed is orthogonal
[47] https://github.com/w3c/mediacapture-main/issues/1019
Youenn: does Chrome expose speakers without capture?
Guido: it gates enumeration on permission
Youenn: would you be able to align with the spec for speakers?
Guido: the problem is Web compatibility
… implementing the spec was not web compatible, so we had to
roll back, hence [48]w3c/mediacapture-main#1019
[48] https://github.com/w3c/mediacapture-main/issues/1019
Jan-Ivar: the 2 situations are connected since the decision on
this may lead to different implementations across browsers
Bernard: I agree with Guido they're orthogonal
… would like to support exposing all speakers
Jan-Ivar: I would object to proceeding before [49]w3c/
mediacapture-main#1019 is resolved
[49] https://github.com/w3c/mediacapture-main/issues/1019
Youenn: let's try to discuss it at the next meeting then
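[A sketch of the flow under discussion: after microphone capture
is granted, enumerate audio outputs and route playback to one of
them. Which "audiooutput" devices appear here (only those paired
with a captured microphone, or all speakers) is exactly the open
question in issue 147, and current browser behaviour differs.]
    // (module script)
    const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
    const outputs = (await navigator.mediaDevices.enumerateDevices())
      .filter((d) => d.kind === "audiooutput");
    const audioEl = document.querySelector("audio");
    if (audioEl && outputs.length > 0) {
      await audioEl.setSinkId(outputs[0].deviceId); // route to chosen speaker
    }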
Summary of resolutions
1. [50]Consensus to add `deviceId` to settings of a track
2. [51]proceed with configurationchange event as the way to
manage surface switching handling
3. [52]proceed with adding receiveTime to Encoded* Metadata
4. [53]Proceed with aligning with current implementations
5. [54]Proceed with reducing scope of MediaStreamTrack transfer
to DedicatedWorker
Received on Wednesday, 11 December 2024 07:52:54 UTC