- From: Dominique Hazael-Massieux <dom@w3.org>
- Date: Fri, 18 Mar 2022 07:38:51 +0100
- To: "public-webrtc@w3.org" <public-webrtc@w3.org>
Hi,

The minutes of our meeting last Tuesday (March 15) are available at:
https://www.w3.org/2022/03/15-webrtc-minutes.html
including the YouTube recording at https://youtu.be/GM56xH-jF8Q

They're also copied as text below.

Dom

WebRTC WG March 2022 call
15 March 2022

[2]Agenda. [3]IRC log.
[2] https://www.w3.org/2011/04/webrtc/wiki/March_15_2022
[3] https://www.w3.org/2022/03/15-webrtc-irc

Attendees
Present: BenWagner, Bernard, Dom, Eero, Elad, Guido, Harald, Jan-Ivar, JohannesKron, Riju, Tuukka, Varun, Youenn
Regrets: caribou
Chair: Bernard, Harald, Jan-Ivar
Scribe: dom

Contents
1. [4]TPAC 2022
2. [5]WebRTC-SVC
3. [6]WebRTC-Extensions
4. [7]Avoiding the “Hall of Mirrors”
5. [8]Display Surface Hints
6. [9]getViewportMedia update
7. [10]MediaCapture Extensions proposals
8. [11]Summary of resolutions

Meeting minutes

Recording: [12]https://youtu.be/GM56xH-jF8Q
[12] https://youtu.be/GM56xH-jF8Q
Slideset: [14]https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf
[14] https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf
[15][Slide 1]
[15] https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf#page=1
[16][Slide 3]
[16] https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf#page=3

TPAC 2022 [17]🎞︎
[17] https://www.youtube.com/watch?v=GM56xH-jF8Q#t=164
[18][Slide 8]
[18] https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf#page=8
Dom: TPAC is being considered as a hybrid event this year - please indicate whether you think you might physically join such an event.
[from online poll: 3 Yes, 4 No, 4 don't know]

[19]WebRTC-SVC [20]🎞︎
[19] https://github.com/w3c/webrtc-svc/
[20] https://www.youtube.com/watch?v=GM56xH-jF8Q#t=362
[21][Slide 11]
[21] https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf#page=11
Bernard: [22]issue #68 relates to behavior of getParameters() - unclear about re-negotiation (vs before/after negotiation)
… [23]PR #69 has proposed text that clarifies that we're talking about **initial** negotiation (before/after)
… if you re-negotiate, you'll still get the currently configured scalability mode
[22] https://github.com/w3c/webrtc-svc/issues/68
[23] https://github.com/w3c/webrtc-svc/pull/69
Harald: wfm
Jan-Ivar: is this correct? getParameters() algos are very explicit about what you get based e.g. on localDescription
… some come from pending, others from current
Bernard: let's say you change preference order for codecs, and you renegotiate (e.g. from VP8 with L1T2 to H264 that doesn't support scalability) - what happens then?
… at what point do things change?
JIB: even without setCodecPreferences, getParameters() may return different values depending on whether re-negotiation is happening or not
… e.g. if you have a local offer, it might affect the results
Bernard: looking at the VP8→H264 case, what should happen?
HTA: as long as you're sending VP8, you should get L1T2 back
… when you switch to H264, you get L1T1 back
Bernard: that's what I would expect and what the text tries to convey
… nothing changes until the new codec starts being used
… JIB, could you write up your concern in [24]#68 ?
[24] https://github.com/w3c/webrtc-svc/issues/68
RESOLUTION: Continue discussion in [25]issue #68
[25] https://github.com/w3c/webrtc-svc/issues/68

[26]WebRTC-Extensions [27]🎞︎
[26] https://github.com/w3c/webrtc-extensions/
[27] https://www.youtube.com/watch?v=GM56xH-jF8Q#t=791
[28][Slide 16]
[28] https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf#page=16
Bernard: Fippo gathered a list of hardware acceleration bugs that have been encountered
… which raises the question of allowing hardware acceleration to be disabled
… WebCodecs provides an enum to hint about whether or not to use hardware acceleration
[29][Slide 17]
[29] https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf#page=17
Bernard: I looked into 2 approaches: setParameters, setCodecPreferences
… the first one doesn't really work since the envelope of changes may not include hardware alternatives
… it also only makes sense if mid-stream switch is necessary
… the second approach goes through re-negotiation via setCodecPreferences()
… How would you discover this?
… Media capabilities may need amendment [30]https://github.com/w3c/media-capabilities/issues/185
[30] https://github.com/w3c/media-capabilities/issues/185
Dom: should this be managed by the browser rather than left for developers to detect and manage?
Bernard: this would be useful *when* developers detect a problem so that they don't need to wait for browsers to react to it
Florent: there are also cases where a decoder interacts badly with a specific encoder
JIB: for setParameters, there are read-only properties
… putting it in codeccapability (which is returned to developers) means doubling the number of entries
Bernard: you may not have to return it from Capability
JIB: but then it doesn't fit very well with a notion of codec preference
… we've also moved fingerprinting surface to media capabilities
… I wouldn't want to reintroduce concerns without good reasons
… it doesn't seem necessary to include that info if it is tackled as a preference
Johannes: I understand this as developers wanting to disable hardware encoding as a short-term patch until the browser gets it fixed
… it sounds like a recovery mode, more than a capability
… also agree it's hard for developers to use it, but that it would have its uses
Harald: routing around bugs is for specific implementations of the codec, which requires they know the specific implementation
… does that point toward media capability as the right way to go?
Bernard: that's where you'd find out if it's "smooth", "power efficient", "supported"
Harald: if it's X's hardware encoder with software version Y, that may be the information you need to know whether or not to use it
… not sure that fits with the Media Capabilities model
Johannes: it would seem challenging
… Also, the bugs that have been identified seem to be browser-specific
… there are block-lists for this or that hardware; it may be worth investigating the possibility of moving towards dynamic blocklists from browsers
Riju: we share the GPU blocklist defined in Chrome with our driver team to get them to be fixed platform by platform
Harald: no clear resolution, but some suggested paths worth exploring
[31][Slide 18]
[31] https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf#page=18
Harald: [32]issue #99 about RTP header extension
… if an implementation supports an extension, it doesn't show up in Capabilities at the moment
… is this problematic? if not, no change needed; if it is, we may need to surface that it exists but is disabled by default
… you can get the information by inspecting the offer, so this may not be needed
[32] https://github.com/w3c/webrtc-extensions/issues/99
Bernard: it's a convenience in the use case; there will be scenarios where you don't want to set it on by default
Dom: is anyone asking for it?
JIB: if this is for debugging, looking at the SDP is fine; if it's to control running code, it should be an API
Harald: the most likely example would be if transport-cc is not supported, I fall back to another congestion control
… I think it can be shimmed by creating an offer and dancing with a throw-away peer connection
Dom: not hearing a lot of pushback, nor a lot of demand either; maybe wait until we have more demand if it can be designed in a way that is backwards compatible
Harald: yes, it can be done later in a backwards compatible way
RESOLUTION: close [33]#99 with no change
[33] https://github.com/w3c/webrtc-extensions/issues/99

[34]Avoiding the “Hall of Mirrors” [35]🎞︎
[34] https://github.com/w3c/mediacapture-screen-share/issues/209
[35] https://www.youtube.com/watch?v=GM56xH-jF8Q#t=1970
[36][Slide 21]
[36] https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf#page=21
[37][Slide 22]
[37] https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf#page=22
[38][Slide 23]
[38] https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf#page=23
[39][Slide 24]
[39] https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf#page=24
Elad: the proposal would be to add a new member to the DisplayMediaStreamConstraints à la includeCurrentTab to hint to the UA whether or not to include the current tab
[40][Slide 25]
[40] https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf#page=25
Elad: influencing the user decision in picking display surfaces has security implications
… but I argue that in this case, it is not problematic: the risks of selection are of two kinds:
… - the attacker influences the user to share a surface under the attacker's control
… - the attacker influences the user to share a tab with sensitive content (e.g.
their bank account)
… but excluding-self is orthogonal to these
[41][Slide 26]
[41] https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf#page=26
Elad: if we agree this is worth solving, the question becomes what the default value should be
… if we make it optional, this could be left as a UA dependent default
[42][Slide 27]
[42] https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf#page=27
Elad: a potential expansion would cover additional surfaces (e.g. screen)
JIB: [43]#209 has the detailed discussion - what is the proposal we're reviewing?
[43] https://github.com/w3c/mediacapture-screen-share/issues/209
Elad: I suggest adding a dictionary member (either include or exclude) that serves as a hint, with no change to current behavior
JIB: I like this API, but would want the default to be "false"
… I don't think this is so much about hall of mirrors - a symptom that the UA could address either way
… the real issue is that in many cases, self-capture is NOT the intent
… long term, self-capture would be getViewportMedia
… some sites want self-capture to be part of the selection - they would need to opt in
… also, TAG guidance is that undefined maps to false
Elad: re default true - agree
… re alternative approaches Youenn suggests, I don't think it works for current tab (it would work for current screen)
… I agree with your characterization that the root cause is if you're not ready to self capture
… I suggest we don't take getViewportMedia into account since there is little visibility in terms of its adoption
… I think we should avoid breaking apps, even if only briefly
JIB: I think we should keep that separate from what implementations do
… here the question is what's the most frequent case, most sites wouldn't want it
Elad: lots of self-capture happening every year; assume a lot of it is not accidental
Youenn: re security, the current spec doesn't deal much with tab capture in that regard …
we're bringing more and more control to what UAs will show, and that means we need to strengthen the guidance to UAs
… Chrome has some mitigations in this space that might serve as a starting point
… If this is a hint, this is fine
… Some implementations might remove entirely the possibility to select the tab, that's something new
… hints allow pushing users towards the more meaningful choice, but leave the user in charge of the final choice
… re hall of mirrors - I don't think this is solving it
… some native apps have implemented current-app blurring to solve the issue
… cropping would be another way to solve the issue
… if it's only a hint, it's fine; but if it brings a required behavior, I don't think we should go there
… also want more security guidance
… and keep issue open on addressing other aspects of hall of mirrors
Elad: could you help with the security guidance?
Youenn: Ideally would like to get the work that Chrome has done
Dom: +1 on a hint; if boolean is problematic, we can use an enum to avoid the default value fallback
Elad: happy to help with getting the security considerations with guidance from Youenn on what he wants to see
Harald: hearing overall support to continue in that direction, towards a hint

[44]Display Surface Hints [45]🎞︎
[44] https://github.com/w3c/mediacapture-screen-share/issues/184
[45] https://www.youtube.com/watch?v=GM56xH-jF8Q#t=3236
[46][Slide 30]
[46] https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf#page=30
Elad: similar to previous issue, but distinct
… some apps want to hint to the UA that they are geared toward a particular display surface type
… I think there is agreement that this is worth supporting
… but we've struggled to find an approach that everyone likes
… I'm suggesting a compromise based on the discussion which would be:
… - use constraints as a mechanism
… - make it a hint with UA dependent behavior
Youenn: hint is fine; it could be a constraint as a model, but with
an improved simpler WebIDL surface
Elad: reject on "exact"?
Youenn: "exact" would be ignored
Harald: -1 on integrating this in the proposal - I hate irregularities
JIB: +1 to Harald; "exact" is already a type error in getDisplayMedia which already narrows down the constraint mechanism
… agree with reusing displaySurface
… I have concerns with an app asking for a monitor - I don't think we should provide this level of control
… I proposed text to steer users away from monitor capture
Elad: this is a hint - UAs can decide not to follow it
Dom: with a hint, UAs can provide the best experience they can
… not sure the SHOULD would achieve much if the main target isn't interested in SHOULD
Youenn: the SHOULD would be useful for new implementors
Elad: there is merit to that
… non-normative language pointing to the risk would be good
JIB: the SHOULD already allows for this; given Chrome has a good motivation, this feels like an exact reason why SHOULD would be used
RESOLUTION: modulo discussion on SHOULD guidance, we adopt the displaySurface constraint proposal to manage Surface Hints

[47]getViewportMedia update [48]🎞︎
[47] https://github.com/w3c/mediacapture-viewport
[48] https://www.youtube.com/watch?v=GM56xH-jF8Q#t=3906
[49][Slide 31]
[49] https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf#page=31
JIB: FYI, there is a PR up to describe getViewportMedia which we hope to bring to a call for adoption soon
[50]Viewport Capture Unofficial Draft
[50] https://w3c.github.io/mediacapture-viewport/
Youenn: we probably need a different set of constraints than the ones for getDisplayMedia
… re audio, we need to think about whether to include system level audio or just current tab
JIB: currently restricted to current tab
Harald: if it can't be isolated, no audio should be captured
JIB: there are pending PRs that I hope will be merged before we start the call for adoption
Elad: the general intent of this work is awesome; looking forward to seeing it
implemented
… that said, until we see it adopted, we need to be careful in basing our decisions on this work, or consider relaxing some of the restrictions
Youenn: has there been any outreach to web developers re x-origin isolation?
Elad: the feedback I got from developers was this was a blocker for them
Bernard: ditto
JIB: I agree this is taking the long view here
… hence the flexibility we're showing on getDisplayMedia
… re using different constraints, we can change it when it shows as needed
Youenn: displaySurface would be one case where this is needed

[51]MediaCapture Extensions proposals [52]🎞︎
[51] https://github.com/w3c/mediacapture-extensions/
[52] https://www.youtube.com/watch?v=GM56xH-jF8Q#t=4238
[53][Slide 34]
[53] https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf#page=34
Riju: this is a follow-up from a conversation that started at TPAC
[54][Slide 35]
[54] https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf#page=35
Riju: [55]PR #48 is allowing in-browser face detection
… when we showed this last time, the feedback included:
… - tie it to VideoFrame rather than MediaStreamTrack, which the PR reflects
… - future-proofing the bounding box approach - this is addressed with the Contour described in the PR, with a way for the developer to request something other than the default 4
… - another request was to have a face mesh - which is now exposed as an additional property (although there is no native support for it today)
… - face expression was raised as a concern, so we removed it
… - making face detection work with transform stream
[55] https://github.com/w3c/mediacapture-extensions/pull/48
[56][Slide 36]
[56] https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf#page=36
Riju: we've put up an example to show how they would work together
… we've done early testing that shows improved power consumption - more specific numbers to be shared soon
Youenn: good to expose it on VideoFrame; but would also be good to expose in requestVideoFrame callback e.g. for use with canvas
… re using "exact" constraints - I would expect "exact" not to be allowed in this
… There seem to be switches to give hints to cameras - do we need several switches to allow per-algo enabling, or could we have a single "face detection" switch?
Riju: e.g. "is face detection supported"?
Youenn: why multiple switches if a single one is good enough, leaving it to the Web app to deal with what they're obtaining
Riju: for instance, contour points would allow future support for additional more detailed contours
Youenn: since the camera is doing the work, not clear we need to give more hints to the driver
Riju: contour/mesh were added for extensibility
Youenn: maybe reduce to what's implementable, while future-proofing it
Bernard: high level questions about the API surface
… I understand the supported constraints & capabilities are used to provide the basic parameters for the algorithm in the driver
… videoFrame.detectedFaces is already done by the driver
… as opposed to having a promise-based method to which the parameters would be given
… if your camera driver doesn't support it, you wouldn't have it
Riju: going through promises, this would impact performance and redo work the driver has already done
… OS level face analysis would duplicate computation already done in the driver
JIB: so, it's a camera API - only available to sources that are cameras?
Riju: right
JIB: my concern is that there is another effort in the WICG, the shape detection API - how does it relate to it?
… would be unfortunate to have to deal with face detection differently depending on the source
Riju: shape detection works on images, can be called multiple times
… no face tracking available, which helps detecting face across frames efficiently
… face detection is based on OS level face analysis, which duplicates the driver work and is less power efficient / robust
… we started from that API in our effort in this space - we feel this new approach gives much better results
… FaceDetector is only supported in Windows atm; the work has stopped afaict
Bernard: so you're saying the WICG work is not going ahead?
Riju: I can check the status with Reilly (but my team was the one behind the implementation)
Harald: I share some of JIB's worries
… we have functions today that depend on high quality face detection e.g. background blur
… I'm worried about having these different interfaces to solve the same problem
… esp if some interfaces end up proprietary
… if the proprietary interfaces provide much higher quality than what standard interfaces can provide
… hence my pushback on making contours and meshes available in the API
… I'm still not happy with the design that seems to be totally focused on centering this on hardware/driver resources rather than a representation API
… it has a bit of that flavor, but there is still a lot of a sense of configuring the camera
… also I'm surprised this only gives a 50% factor over media pipe
… but in general, this feels like a major new way of treating media information
… I'd like to see it put forward as a proposal, not as a set of API patches
… with an explainer, use cases, examples - that we typically put together before agreeing to take it up
Riju: no need to configure the driver
… the PR includes examples
Harald: I'm thinking of what applications would use this for, what problems to solve
Dom: what an explainer would cover
Riju: I can come up with that
Dom: happy to help with the logistics of making it happen
Riju: is the question about
whether this is useful or not?
Harald: yes
Bernard: or rather whether it handles all the use cases people want
Jan-Ivar: e.g. tying this with camera may become obsolete or too limiting
… having an API that isn't as strongly tied to hardware acceleration
Harald: I'd like to have a better understanding of which apps want a rectangle around a face
Youenn: encoders actually optimize around faces if such metadata are available
… +1 on defining API that can obtain metadata from the hardware or a TransformStream
JIB: among other things, having less hardware-dependency allows UAs to step in
[57][Slide 37]
[57] https://lists.w3.org/Archives/Public/www-archive/2022Mar/att-0004/WEBRTCWG-2022-03-15.pdf#page=37
Riju: backgroundBlur has more platform API support than replacement
Youenn: iOS has the ability to switch on & off background blur, fully outside of the Web app, and fully dynamic
… the Web app could not unblur if the user has set this up at the OS level
… (but not vice versa)
… that situation is not well supported by constraints
… we may need a way to surface whether a constraint *can* be changed (and to signal when it can no longer be changed)
JIB: this is a case where constraints work very well - the app states its ideal
… background blur is popular, would be good to support it
Youenn: I don't think "ideal" suffices to expose the situation
… re backgroundBlur level - it's not settable on iOS; are there platforms that would benefit from it?
Riju: no platform API supports this, but some software models have that parameter
… but I understand some platforms are working towards making it settable
Youenn: but without knowing the algorithm, setting a particular value would be hard for developers
… we may need a boolean instead
JIB: part of the question is whether this needs to be controllable by apps vs the UA
Harald: in audio, we've encountered cases where it's valuable to be able to manipulate settings that are supposed to be useful in the driver, but actually create issues
… e.g. double echo cancellation control
… the most important control we have is to turn platform effects off; the second was to detect the situation to ask the user to turn it off
Riju: on the last three proposals (lighting correction, face framing, eye gaze correction), any sense of interest?
… the goal is to give options to developers on whether or not to use hardware capabilities
Bernard: should we get back to this in April?
JIB: from Mozilla's perspective, we don't have strong interest in this approach given possible interop cross-OS issues
… we don't see any urgency
Harald: for face detection, we have a pretty solid way forward via the explainer with use cases and justifications to support adoption
… some of these additional camera controls may fit into that new document
… if we accept constraints as a way to control camera drivers, grouping them together makes sense
JIB: but adding individual constraints is something we've used mediacapture-extensions for in the past
Youenn: the complexity of a boolean constraint is very different from the more complex Face API detection
Dom: I'll work with the chairs to agree on a clearer path forward then :)

Summary of resolutions
1. [58]Continue discussion in [59]issue #68
2. [60]close [61]#99 with no change
3. [62]modulo discussion on SHOULD guidance, we adopt the displaySurface constraint proposal to manage Surface Hints
[59] https://github.com/w3c/webrtc-svc/issues/68
[61] https://github.com/w3c/webrtc-extensions/issues/99
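
To make resolution 3 more concrete, here is a minimal TypeScript sketch of how an app might express a display surface preference as a hint. It assumes the hint ends up as a bare displaySurface value on the video constraints passed to getDisplayMedia(); the helper name buildDisplayMediaOptions and the local types are hypothetical and only serve the illustration. The helper never emits "exact" or "min", since (as noted in the discussion) getDisplayMedia() already treats those as a type error.

```typescript
// Sketch only, not normative text. The surface names mirror the
// displaySurface values used by the Screen Capture spec; the helper and
// the local types below are hypothetical.
type DisplaySurface = "monitor" | "window" | "browser";

interface DisplayMediaOptions {
  video: { displaySurface?: DisplaySurface } | boolean;
  audio: boolean;
}

// Build getDisplayMedia() options carrying the preferred surface as a bare
// value, i.e. a hint. "exact" and "min" are deliberately never produced.
function buildDisplayMediaOptions(
  preferred?: DisplaySurface,
  audio: boolean = false
): DisplayMediaOptions {
  if (preferred === undefined) {
    // No preference: leave the choice entirely to the UA and the user.
    return { video: true, audio };
  }
  return { video: { displaySurface: preferred }, audio };
}

// In a browser, the options might be passed on like this (assumption):
//   const stream = await navigator.mediaDevices.getDisplayMedia(
//     buildDisplayMediaOptions("window"));
console.log(JSON.stringify(buildDisplayMediaOptions("window")));
```

Because this is a hint, the UA may still offer other surface types, and the user stays in charge of the final choice; an app would need to inspect the settings of the resulting track if it cares which surface was actually picked.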
Received on Friday, 18 March 2022 06:38:56 UTC