- From: Dominique Hazael-Massieux <dom@w3.org>
- Date: Fri, 18 Mar 2022 07:38:51 +0100
- To: "public-webrtc@w3.org" <public-webrtc@w3.org>
Hi,
The minutes of our meeting last Tuesday (March 15) are available at:
https://www.w3.org/2022/03/15-webrtc-minutes.html
including the YouTube recording at https://youtu.be/GM56xH-jF8Q
They're also copied as text below.
Dom
WebRTC WG March 2022 call
15 March 2022
[2]Agenda. [3]IRC log.
[2] https://www.w3.org/2011/04/webrtc/wiki/March_15_2022
[3] https://www.w3.org/2022/03/15-webrtc-irc
Attendees
Present
BenWagner, Bernard, Dom, Eero, Elad, Guido, Harald,
Jan-Ivar, JohannesKron, Riju, Tuukka, Varun, Youenn
Regrets
caribou
Chair
Bernard, Harald, Jan-Ivar
Scribe
dom
Contents
1. [4]TPAC 2022
2. [5]WebRTC-SVC
3. [6]WebRTC-Extensions
4. [7]Avoiding the “Hall of Mirrors”
5. [8]Display Surface Hints
6. [9]getViewportMedia update
7. [10]MediaCapture Extensions proposals
8. [11]Summary of resolutions
Meeting minutes
Recording: [12]https://youtu.be/GM56xH-jF8Q
[12] https://youtu.be/GM56xH-jF8Q
Slideset: [14]https://lists.w3.org/Archives/Public/www-archive/
2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf
[14]
https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf
[15][Slide 1]
[15]
https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf#page=1
[16][Slide 3]
[16]
https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf#page=3
TPAC 2022 [17]🎞︎
[17] https://www.youtube.com/watch?v=GM56xH-jF8Q#t=164
[18][Slide 8]
[18]
https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf#page=8
Dom: TPAC is being considered as a hybrid event this year -
please indicate whether you think you might join such an event
in person
[from online poll: 3 Yes, 4 No, 4 don't know]
[19]WebRTC-SVC [20]🎞︎
[19] https://github.com/w3c/webrtc-svc/
[20] https://www.youtube.com/watch?v=GM56xH-jF8Q#t=362
[21][Slide 11]
[21]
https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf#page=11
Bernard: [22]issue #68 relates to behavior of getParameters() -
unclear about re-negotiation (vs before/after negotiation)
… [23]PR #69 has proposed text that clarifies that we're
talking about **initial** negotiation (before/after)
… if you re-negotiate, you'll still get the currently
configured scalability mode
[22] https://github.com/w3c/webrtc-svc/issues/68
[23] https://github.com/w3c/webrtc-svc/pull/69
Harald: wfm
Jan-Ivar: is this correct? getParameters() algos are very
explicit about what you get based e.g. on localDescription
… some come from pending, others from current
Bernard: let's say you change preference order for codecs, and
you renegotiate (e.g. from VP8 with L1T2 to H264 that doesn't
support scalability) - what happens then?
… at what point do things change?
JIB: even without setCodecPreferences, getParameters() may
return different values depending on whether re-negotiation is
happening or not
… e.g. if you have a local offer, it might affect the results
Bernard: looking at the VP8→H264 case, what should happen?
HTA: as long as you're sending VP8, you should get L1T2 back
… when you switch to H264, you get L1T1 back
Bernard: that's what I would expect and what the text tries to
convey
… nothing changes until the new codec starts being used
… JIB, could you write up your concern in [24]#68 ?
[24] https://github.com/w3c/webrtc-svc/issues/68
RESOLUTION: Continue discussion in [25]issue #68
[25] https://github.com/w3c/webrtc-svc/issues/68
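Illustration (not from the call): a minimal sketch of the behavior HTA and Bernard describe for the VP8 to H264 case, assuming the reported scalability mode follows the codec actually being sent and not the one under negotiation. SvcSenderModel, reportedScalabilityMode and switchToPendingCodec are hypothetical names that only model the discussion; they are not the webrtc-svc API, and the open question in issue #68 may lead to different text.

```typescript
// Hypothetical model of a sender; not the real RTCRtpSender API.
type SvcCodecName = "VP8" | "H264";

interface SvcSenderModel {
  // Codec currently used on the wire.
  activeCodec: SvcCodecName;
  // Codec picked by a renegotiation that has not taken effect yet, if any.
  pendingCodec: SvcCodecName | null;
  // Scalability mode requested by the application.
  requestedMode: string;
}

// Assumption taken from the example in the discussion: VP8 supports L1T2,
// H264 does not support scalability and only offers L1T1.
const supportedModes: Record<SvcCodecName, string[]> = {
  VP8: ["L1T1", "L1T2"],
  H264: ["L1T1"],
};

// Mode a getParameters()-like call would report in this model: it depends
// only on the codec in use, never on the pending one.
function reportedScalabilityMode(sender: SvcSenderModel): string {
  const modes = supportedModes[sender.activeCodec];
  return modes.indexOf(sender.requestedMode) !== -1 ? sender.requestedMode : "L1T1";
}

// Models the moment the newly negotiated codec starts being used.
function switchToPendingCodec(sender: SvcSenderModel): SvcSenderModel {
  if (sender.pendingCodec === null) {
    return sender;
  }
  return { ...sender, activeCodec: sender.pendingCodec, pendingCodec: null };
}
```

In this model, a sender still on VP8 with H264 pending keeps reporting L1T2, and only reports L1T1 once switchToPendingCodec has been applied.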
[26]WebRTC-Extensions [27]🎞︎
[26] https://github.com/w3c/webrtc-extensions/
[27] https://www.youtube.com/watch?v=GM56xH-jF8Q#t=791
[28][Slide 16]
[28]
https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf#page=16
Bernard: Fippo gathered a list of hardware acceleration bugs
that have been encountered
… which raises the question of allowing developers to disable
hardware acceleration
… WebCodecs provides an enum to hint about whether or not to use
hardware acceleration
[29][Slide 17]
[29]
https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf#page=17
Bernard: I looked into 2 approaches: setParameters,
setCodecPreferences
… the first one doesn't really work since the envelope of
changes may not include hardware alternatives
… it also only makes sense if mid-stream switch is necessary
… the second approach goes through re-negotiation via
setCodecPreferences()
… How would you discover this?
… Media capabilities may need amendment [30]https://github.com/
w3c/media-capabilities/issues/185
[30] https://github.com/w3c/media-capabilities/issues/185
Dom: should this be managed by the browser rather than left for
developers to detect and manage?
Bernard: this would be useful *when* developers detect a
problem so that they don't need to wait for browsers to react
to it
Florent: there are also cases where a decoder interacts badly
with a specific encoder
JIB: for setParameters, there are read-only properties
… putting it in codeccapability (which is returned to
developers) means doubling the number of entries
Bernard: you may not have to return it from Capability
JIB: but then it doesn't fit very well with a notion of codec
preference
… we've also moved fingerprinting surface to media capabilities
… I wouldn't want to reintroduce concerns without good reasons
… it doesn't seem necessary to include that info if it is
tackled as a preference
Johannes: I understand this as developers wanting to disable
hardware encoding as a short-term patch until the browser gets
it fixed
… it sounds like a recovery mode, more than a capability
… also agree it's hard for developers to use it, but that it
would have its uses
Harald: routing around bugs is for specific implementations of
the codec, which requires they know the specific implementation
… does that point toward media capability as the right way to
go?
Bernard: that's where you'd find out if it's "smooth", "power
efficient", "supported"
Harald: if it's X's hardware encoder with software version Y,
that may be the information you need to know whether or not to
use it
… not sure that fits with the Media Capabilities model
Johannes: it would seem challenging
… Also, the bugs that have been identified seem to be
browser-specific
… there are block-lists for this or that hardware; it may be
worth investigating the possibility of moving towards dynamic
blocklists from browsers
Riju: we share the GPU blocklist defined in Chrome with our
driver team to get them to be fixed platform by platform
Harald: no clear resolution, but some suggested paths worth
exploring
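Illustration (not from the call): a rough sketch of the second approach Bernard mentions, where the application reorders codecs before handing them to a setCodecPreferences()-style call. The hint values mirror the WebCodecs hardware acceleration enum; CodecCandidate and orderCodecsByHint are hypothetical, and treating "power efficient" as a proxy for hardware encoding is an assumption of this sketch, not something the group agreed on.

```typescript
// Same three values as the WebCodecs hardware acceleration enum.
type HwAccelHint = "no-preference" | "prefer-hardware" | "prefer-software";

// Hypothetical record pairing a codec with Media Capabilities-style results.
interface CodecCandidate {
  mimeType: string;
  supported: boolean;
  smooth: boolean;
  // Assumed here to be a rough proxy for "hardware accelerated".
  powerEfficient: boolean;
}

// Returns the supported candidates in the order the app would prefer them.
// Within each group the original relative order is preserved.
function orderCodecsByHint(candidates: CodecCandidate[], hint: HwAccelHint): CodecCandidate[] {
  const usable = candidates.filter((c) => c.supported);
  if (hint === "no-preference") {
    return usable;
  }
  const wantHardware = hint === "prefer-hardware";
  const preferred = usable.filter((c) => c.powerEfficient === wantHardware);
  const rest = usable.filter((c) => c.powerEfficient !== wantHardware);
  return preferred.concat(rest);
}
```

An application that detects a broken hardware encoder could pass "prefer-software" and renegotiate with the reordered list. This does not address Harald's point that routing around a bug requires knowing the specific implementation.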
[31][Slide 18]
[31]
https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf#page=18
Harald: [32]issue #99 about RTP header extension
… if an implementation supports an extension, it doesn't show
up in Capabilities at the moment
… is this problematic? if not, no change needed; if it is, we
may need to surface that it exists but is disabled by default
… you can get the information by inspecting the offer, so this
may not be needed
[32] https://github.com/w3c/webrtc-extensions/issues/99
Bernard: it's a convenience in the use case; there will be
scenarios where you don't want to set it on by default
Dom: is anyone asking for it?
JIB: if this is for debugging, looking at the SDP is fine; if
it's to control running code, it should be an API
Harald: the most likely example would be if transport-cc is not
supported, I fallback to another congestion control
… I think it can be shimmed by creating an offer and dancing
with a throw-away peer connection
Dom: not hearing a lot of pushback, nor a lot of demand either;
maybe wait until we have more demand if it can be designed in a
way that is backwards compatible
Harald: yes, it can be done later in a backwards compatible way
RESOLUTION: close [33]#99 with no change
[33] https://github.com/w3c/webrtc-extensions/issues/99
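Illustration (not from the call): a small sketch of the shim Harald alludes to, which inspects the SDP of an offer to see which RTP header extensions an implementation lists. headerExtensionUris and offersExtension are hypothetical helper names. In a browser the SDP string would come from createOffer() on a throw-away peer connection; here it is simply a string parameter so the parsing can be shown on its own.

```typescript
// Extracts the URIs of RTP header extensions listed in an SDP blob.
// Lines look like: a=extmap:<id>[/<direction>] <uri> [<attributes>]
function headerExtensionUris(sdp: string): string[] {
  const uris: string[] = [];
  for (const rawLine of sdp.split(/\r?\n/)) {
    const match = /^a=extmap:\d+(?:\/\S+)?\s+(\S+)/.exec(rawLine.trim());
    const uri = match !== null ? match[1] : undefined;
    // Skip non-extmap lines and URIs already seen in another m-section.
    if (uri !== undefined && uris.indexOf(uri) === -1) {
      uris.push(uri);
    }
  }
  return uris;
}

// True if the offer lists the given extension URI.
function offersExtension(sdp: string, uri: string): boolean {
  return headerExtensionUris(sdp).indexOf(uri) !== -1;
}
```

An application could call offersExtension with the URI of the extension it cares about and fall back to another mechanism when the result is false. As noted in the discussion, this only reflects what the offer contains, which is why the group saw no need for an API change for now.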
[34]Avoiding the “Hall of Mirrors” [35]🎞︎
[34] https://github.com/w3c/mediacapture-screen-share/issues/209
[35] https://www.youtube.com/watch?v=GM56xH-jF8Q#t=1970
[36][Slide 21]
[36]
https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf#page=21
[37][Slide 22]
[37]
https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf#page=22
[38][Slide 23]
[38]
https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf#page=23
[39][Slide 24]
[39]
https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf#page=24
Elad: the proposal would be to add a new member to
DisplayMediaStreamConstraints, à la includeCurrentTab, to hint
to the UA whether or not to include the current tab
[40][Slide 25]
[40]
https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf#page=25
Elad: influencing the user decision in picking display surfaces
has security implications
… but I argue that in this case, it is not problematic: the
risks of selection are of two kinds:
… - the attacker influences the user to share a surface under
the attacker's control
… - the attacker influences the user to share a tab with
sensitive content (e.g. their bank account)
… but excluding-self is orthogonal to these
[41][Slide 26]
[41]
https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf#page=26
Elad: if we agree this is worth solving, the question becomes
what the default value should be
… if we make it optional, this could be left as a UA dependent
default
[42][Slide 27]
[42]
https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf#page=27
Elad: a potential expansion would cover additional surfaces
(e.g. screen)
JIB: [43]#209 has the detailed discussion - what is the
proposal we're reviewing?
[43] https://github.com/w3c/mediacapture-screen-share/issues/209
Elad: I suggest adding a dictionary member (either include or
exclude) that serves as a hint, with no change to current
behavior
JIB: I like this API, but would want the default to be "false"
… I don't think this is so much about hall of mirrors - a
symptom that the UA could address either way
… the real issue is that in many cases, self-capture is NOT the
intent
… long term, self-capture would be getViewportMedia
… some sites that want self-capture to be part of the selection
- they would need to opt-in
… also, TAG guidance is that undefined maps to false
Elad: re default true - agree
… re alternative approaches Youenn suggests, I don't think it
works for current tab (it would work for current screen)
… I agree with your characterization that the root cause is if
you're not ready to self capture
… I suggest we don't take getViewportMedia into account since
there is little visibility in terms of its adoption
… I think we should avoid breaking apps, even if shortly
JIB: I think we should keep that separate from what
implementations do
… here the question is what's the most frequent case, most
sites wouldn't want it
Elad: lots of self-capture happening every year; assume a lot of
it not accidental
Youenn: re security, the current spec doesn't deal much with
tab capture in that regard
… we're bringing more and more control to what UAs will show,
and that means we need to strengthen the guidance to UAs
… Chrome has some mitigations in this space that might serve as
a starting point
… If this is a hint, this is fine
… Some implementations might remove entirely the possibility to
select the tab, that's something new
… hints allow to push users towards the more meaningful choice,
but leave the user in charge of the final choice
… re hall of mirrors - I don't think this is solving it
… some native apps have implemented current-app blurring to
solve the issue
… cropping would be another way to solve the issue
… if it's only a hint, it's fine; but if it brings a required
behavior, I don't think we should go there
… also want more security guidance
… and keep issue open on addressing other aspects of hall of
mirrors
Elad: could you help with the security guidance?
Youenn: Ideally would like to get the work that Chrome has done
Dom: +1 on a hint; if boolean is problematic, we can use an
enum to avoid the default value fallback
Elad: happy to help with getting the security considerations
with guidance from Youenn on what he wants to see
Harald: hearing overall support to continue in that direction,
towards a hint
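Illustration (not from the call): one way to picture the hint semantics discussed above, assuming an enum-valued hint as Dom suggests so that an absent value can fall back to a UA-chosen default. SelfTabHint, PickerEntry and orderPickerEntries are hypothetical names; no member name or UA behavior was decided in the meeting. Following Youenn's point that the user stays in charge, this model only de-emphasizes the current tab and never removes it.

```typescript
// Hypothetical hint values; the actual member name and values were not decided.
type SelfTabHint = "include" | "exclude";

interface PickerEntry {
  label: string;
  isCurrentTab: boolean;
}

// Models one possible UA behavior: with "exclude", the current tab is moved
// to the end of the picker instead of being removed, so the user keeps the
// final choice. An absent hint falls back to the UA's own default.
function orderPickerEntries(
  entries: PickerEntry[],
  hint: SelfTabHint | undefined,
  uaDefault: SelfTabHint
): PickerEntry[] {
  const effective = hint ?? uaDefault;
  if (effective === "include") {
    return entries.slice();
  }
  const others = entries.filter((e) => !e.isCurrentTab);
  const currentTab = entries.filter((e) => e.isCurrentTab);
  return others.concat(currentTab);
}
```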
[44]Display Surface Hints [45]🎞︎
[44] https://github.com/w3c/mediacapture-screen-share/issues/184
[45] https://www.youtube.com/watch?v=GM56xH-jF8Q#t=3236
[46][Slide 30]
[46]
https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf#page=30
Elad: similar to previous issue, but distinct
… some apps want to hint to the UA that they are geared
toward a particular display surface type
… I think there is agreement that this is worth supporting
… but we've struggled to find an approach that everyone likes
… I'm suggesting a compromise based on the discussion which
would be:
… - use constraints as a mechanism
… - make it a hint with UA dependent behavior
Youenn: hint is fine; it could be a constraint as a model, but
with an improved simpler WebIDL surface
Elad: reject on "exact"?
Youenn: "exact" would be ignored
Harald: -1 in integrating this in the proposal - I hate
irregularities
JIB: +1 to Harald; "exact" is already a type error in
getDisplayMedia which already narrows down the constraint
mechanism
… agree with reusing displaySurface
… I have concerns with an app asking for a monitor - I don't
think we should provide this level of control
… I proposed text to steer away users from monitor capture
Elad: this is a hint - UAs can decide not to follow it
Dom: with a hint, UAs can provide the best experience they can
… not sure the SHOULD would achieve much if the main target
isn't interested in SHOULD
Youenn: the SHOULD would be useful for new implementors
Elad: there is merit to that
… non-normative language pointing to the risk would be good
JIB: the SHOULD already allows for this; given Chrome has a
good motivation, this feels like an exact reason why SHOULD
would be used
RESOLUTION: modulo discussion on SHOULD guidance, we adopt the
displaySurface constraint proposal to manage Surface Hints
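Illustration (not from the call): a sketch of how a displaySurface constraint could be read as a hint, consistent with JIB's remark that "exact" is already a type error in getDisplayMedia. SurfaceType, SurfaceConstraintModel and readDisplaySurfaceHint are hypothetical and simplified; what a UA does with the returned hint is left open, as agreed.

```typescript
// Display surface types used in the discussion.
type SurfaceType = "monitor" | "window" | "browser";

// Simplified constraint shape: a bare value or { ideal }. "exact" is modeled
// only so that its rejection can be shown.
interface SurfaceConstraintModel {
  ideal?: SurfaceType;
  exact?: SurfaceType;
}

// Returns the surface type the app hints at, or null if there is no hint.
// Throws a TypeError for "exact", which cannot be used to force a surface.
function readDisplaySurfaceHint(
  constraint: SurfaceType | SurfaceConstraintModel | undefined
): SurfaceType | null {
  if (constraint === undefined) {
    return null;
  }
  if (typeof constraint === "string") {
    return constraint;
  }
  if (constraint.exact !== undefined) {
    throw new TypeError("'exact' is not allowed for displaySurface");
  }
  return constraint.ideal ?? null;
}
```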
[47]getViewportMedia update [48]🎞︎
[47] https://github.com/w3c/mediacapture-viewport
[48] https://www.youtube.com/watch?v=GM56xH-jF8Q#t=3906
[49][Slide 31]
[49]
https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf#page=31
JIB: FYI, there is a PR up to describe getViewportMedia which
we hope to bring to a call for adoption soon
[50]Viewport Capture Unofficial Draft
[50] https://w3c.github.io/mediacapture-viewport/
Youenn: we probably need a different set of constraints than
the ones for getDisplayMedia
… re audio, we need to think about whether to include system
level audio or just current tab
JIB: currently restricted to current tab
Harald: if it can't be isolated, no audio should be captured
JIB: there are pending PRs that I hope will be merged before we
start the call for adoption
Elad: the general intent of this work is awesome; looking
forward to seeing it implemented
… that said, until we see it adopted, we need to be careful in
basing our decisions on this work, or consider relaxing some of
the restrictions
Youenn: has there been any outreach to web developers re
x-origin isolation?
Elad: the feedback I got from developers was this was a blocker
for them
Bernard: ditto
JIB: I agree this is taking the long view here
… hence the flexibility we're showing on getDisplayMedia
… re using different constraints, we can change it when it
shows as needed
Youenn: displaySurface would be one case where this is needed
[51]MediaCapture Extensions proposals [52]🎞︎
[51] https://github.com/w3c/mediacapture-extensions/
[52] https://www.youtube.com/watch?v=GM56xH-jF8Q#t=4238
[53][Slide 34]
[53]
https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf#page=34
Riju: this is follow up from a conversation that started at
TPAC
[54][Slide 35]
[54]
https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf#page=35
Riju: [55]PR #48 is allowing in-browser face detection
… when we showed this last time, the feedback included:
… - tie it to VideoFrame rather than MediaStreamTrack, which
the PR reflects
… - future-proofing the bounding box approach - this is
addressed with the Contour described in the PR, with a way for
the developer to request something other than the default 4
… - another request was to have a face mesh - which is now
exposed as an additional property (although there is no native
support for it today)
… - face expression was raised as a concern, so we removed it
… - making face detection work with transform stream
[55] https://github.com/w3c/mediacapture-extensions/pull/48
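Illustration (not from the call): a sketch of how an application might reduce a face contour to a bounding rectangle, whatever number of contour points it asked for. ContourPoint, FaceBox and boundingBoxOfContour are hypothetical; the sketch assumes each contour point is a plain { x, y } pair and does not reproduce the dictionaries defined in PR #48.

```typescript
// Assumed shape of a contour point; the PR's actual dictionaries may differ.
interface ContourPoint {
  x: number;
  y: number;
}

interface FaceBox {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Computes the smallest axis-aligned rectangle containing all contour points.
// Returns null for an empty contour.
function boundingBoxOfContour(contour: ContourPoint[]): FaceBox | null {
  if (contour.length === 0) {
    return null;
  }
  let minX = contour[0].x;
  let maxX = contour[0].x;
  let minY = contour[0].y;
  let maxY = contour[0].y;
  for (const point of contour) {
    minX = Math.min(minX, point.x);
    maxX = Math.max(maxX, point.x);
    minY = Math.min(minY, point.y);
    maxY = Math.max(maxY, point.y);
  }
  return { x: minX, y: minY, width: maxX - minX, height: maxY - minY };
}
```

The same function works for the default 4-point contour and for a more detailed one, which is the kind of future-proofing the contour approach is meant to allow.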
[56][Slide 36]
[56]
https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf#page=36
Riju: we've put up an example to show how they would work
together
… we've done early testing that shows improved power
consumption - more specific numbers to be shared soon
Youenn: good to expose it on VideoFrame; but would also be good
to expose in requestVideoFrame callback e.g. for use with
canvas
… re using "exact" constraints - I would expect "exact" not to
be allowed in this
… There seems to be switches to give hints to cameras - do we
need several switches to allow per-algo enabling, or could we
have a single "face detection" switch?
Riju: e.g. "is face detection supported"?
Youenn: why multiple switches if a single one is good enough,
leaving it to the Web app to deal with what they're obtaining
Riju: for instance, contour points would allow future support
for additional more detailed contours
Youenn: since the camera is doing the work, not clear we need
to give more hints to the driver
Riju: contour/mesh were added for extensibility
Youenn: maybe reduce to what's implementable, while
future-proofing it
Bernard: high level questions about the API surface
… I understand the supported constraints & capabilities are used
to provide the basic parameters for the algorithm in the driver
… videoFrame.detectedFaces is already done by the driver
… as opposed to have a promise-based method to which the
parameters would be given
… if your camera driver doesn't support it, you wouldn't have
it
Riju: going through promises, this would impact performance and
redo work the driver has already done
… OS level face analysis would duplicate computation already
done in the driver
JIB: so, it's a camera API - only available to sources that are
camera?
Riju: right
JIB: my concern is that there is another effort in the WICG,
the shape detection API - how does it relate to it?
… would be unfortunate to have it to deal with face detection
differently depending on the source
Riju: shape detection works on images, can be called multiple
times
… no face tracking available, which helps detecting face across
frames efficiently
… face detection is based on OS level face analysis, which
duplicates the driver work and is less power efficient / robust
… we started from that API in our effort in this space - we
feel this new approach gives much better results
… FaceDetector is only supported in Windows atm; the work has
stopped afaict
Bernard: so you're saying the WICG work is not going ahead?
Riju: I can check the status with Reilly (but my team was the
one behind the implementation)
Harald: I share some of JIB's worries
… we have functions today that depend on high quality face
detection e.g. background blur
… I'm worried about having these different interfaces to solve
the same problem
… esp if some interfaces end up proprietary
… if the proprietary interfaces provide much higher quality
than what standard interfaces can provide
… hence my pushback on making contours and meshes available in
the API
… I'm still not happy with the design that seems to be totally
centered on hardware/driver resources rather than a
representation API
… it has a bit of that flavor, but there is still a lot of a
sense of configuring the camera
… also I'm surprised this only gives a 50% factor over media
pipe
… but in general, this feels like a major new way of treating
media information
… I'd like to see this put forward as a proposal, not as a set
of API patches
… with an explainer, use cases, examples - that we typically
put together before agreeing to take it up
Riju: no need to configure the driver
… the PR includes examples
Harald: I'm thinking of what applications would use this for,
what problems it would solve
Dom: what an explainer would cover
Riju: I can come up with that
Dom: happy to help with the logistics of making it happen
Riju: is the question about whether this is useful or not?
harald: yes
bernard: or rather whether it handles all the use cases people
want
Jan-Ivar: e.g. tying this with camera may become obsolete or
too limiting
… having an API that isn't as strongly tied to hardware
acceleration
Harald: I'd like to have a better understanding of which apps
want a rectangle around a face
Youenn: encoders actually optimize around faces if such
metadata are available
… +1 on defining API that can obtain metadata from the hardware
or a TransformStream
JIB: among other things, having less hardware-dependency allows
UAs to step in
[57][Slide 37]
[57]
https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf#page=37
Riju: backgroundBlur has more platform API support than
replacement
Youenn: iOS has the ability to switch on & off background blur,
fully outside of the Web app, and fully dynamic
… the Web app could not unblur if the user has set this up at
the OS level
… (but not vice versa)
… that situation is not well supported by constraints
… we may need a way to surface whether a constraint *can* be
changed (and to signal when it can no longer be changed)
JIB: this is a case where constraints work very well - the app
states its ideal
… background blur is popular, would be good to support it
Youenn: I don't think "ideal" suffices to expose the situation
… re backgroundBlur level - it's not settable on iOS; are there
platforms that would benefit from it?
Riju: no platform API supports this, but some software models
have that parameter
… but I understand some platforms are working towards making it
settable
Youenn: but without knowing the algorithm, setting a particular
value would be hard for developers
… we may need a boolean instead
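Illustration (not from the call): a sketch of the situation Youenn describes, where the user has switched background blur on at the OS level and the Web app cannot undo it, while the reverse is still possible. BlurPlatformState and applyIdealBlur are hypothetical. The sketch assumes a boolean constraint expressed as an "ideal" value, so the request never fails and the app has to compare the outcome with what it asked for.

```typescript
// Hypothetical platform state for background blur.
interface BlurPlatformState {
  // True when the user has switched blur on outside of the Web app.
  userForcedBlur: boolean;
}

interface BlurOutcome {
  // The setting the track ends up with.
  backgroundBlur: boolean;
  // Whether the app's ideal value was honored.
  satisfied: boolean;
}

// Applies an "ideal" boolean blur request in this model. Blur forced by the
// user cannot be removed by the app, but the app can still add blur itself.
function applyIdealBlur(state: BlurPlatformState, ideal: boolean): BlurOutcome {
  const effective = state.userForcedBlur ? true : ideal;
  return { backgroundBlur: effective, satisfied: effective === ideal };
}
```

The outcome tells the app that its ideal was not met, but not whether the setting can be changed later, which is the gap Youenn points at.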
JIB: part of the question is whether this needs to be
controllable by apps vs the UA
harald: in audio, we've encountered cases where it's valuable to
be able to manipulate settings that are supposed to be useful
in the driver, but actually create issues
… e.g. double echo cancellation control
… the most important control we have is to turn platform
effects off; the second was to detect the situation to ask the
user to turn it off
Riju: on the last three proposals (lighting correction, face
framing, eye gaze correction), any sense of interest?
… the goal is to give options to developers on whether or not
to use hardware capabilities
Bernard: should we get back to this in April?
JIB: from Mozilla's perspective, we don't have strong interest
in this approach given possible interop cross-OS issues
… we don't see any urgency
Harald: for face detection, we have a pretty solid way forward
via the explainer with use cases and justifications to support
adoption
… some of these additional camera controls may fit into that
new document
… if we accept constraints as a way to control camera drivers,
grouping them together makes sense
JIB: but adding individual constraints is something we've used
mediacapture-extensions for in the past
Youenn: the complexity of a boolean constraint is very
different from the more complex Face API detection
Dom: I'll work with the chairs to agree on a clearer path
forward then :)
Summary of resolutions
1. [58]Continue discussion in [59]issue #68
2. [60]close [61]#99 with no change
3. [62]modulo discussion on SHOULD guidance, we adopt the
displaySurface constraint proposal to manage Surface Hints
[59] https://github.com/w3c/webrtc-svc/issues/68
[61] https://github.com/w3c/webrtc-extensions/issues/99
Received on Friday, 18 March 2022 06:38:56 UTC