- From: Dominique Hazael-Massieux <dom@w3.org>
- Date: Mon, 2 May 2022 15:19:24 +0200
- To: "public-webrtc@w3.org" <public-webrtc@w3.org>
Hi,
The minutes of our call on April 26 are available at:
https://www.w3.org/2022/04/26-webrtc-minutes.html (with youtube
recording)
They're also copied as text below.
Dom
WebRTC April 2022 VI
26 April 2022
[2]Agenda. [3]IRC log.
[2] https://www.w3.org/2011/04/webrtc/wiki/April_26_2022
[3] https://www.w3.org/2022/04/26-webrtc-irc
Attendees
Present
Anssi, Carine, ChrisCunningham, Dom, Elad, Florent,
Harald, Jan-Ivar, MichaelSeydl, Ningxin,
PatrickRockhill, PhilippHancke, Sergio, TimP, Youenn
Regrets
-
Chair
Bernard, Harald, Jan-Ivar
Scribe
dom, youenn
Contents
1. [4]WebNN Integration with real-time video processing
2. [5]WebRTC Extensions
   1. [6]Issue [7]#95
   2. [8]Issue [9]#100
3. [10]https://github.com/w3c/mediacapture-extensions/issues/47 Voice Isolation Constraint
4. [11]support for contentHint in Capture Handle
5. [12]WebRTC Extensions
6. [13]Avoid user-confusion by avoiding offering undesired
   audio sources
7. [14]Region Capture
8. [15]Summary of resolutions
[7] https://github.com/w3c/mediacapture-region/issues/95
[9] https://github.com/w3c/mediacapture-region/issues/100
Meeting minutes
Recording: [16]https://www.youtube.com/watch?v=qSlXLqouxCs
[16] https://www.youtube.com/watch?v=qSlXLqouxCs
Slideset: [18]https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf
[18]
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf
[19]WebNN Integration with real-time video processing [20]🎞︎
[19] https://github.com/webmachinelearning/webnn/issues/226
[20] https://www.youtube.com/watch?v=qSlXLqouxCs#t=167
[21][Slide 10]
[21]
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=10
[22][Slide 11]
[22]
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=11
[23][Slide 12]
[23]
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=12
[25][Slide 13]
[25]
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=13
[26][Slide 15]
[26]
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=15
ningxin: slide 15 shows the high-level design of a background
blur video pipeline
Two implementations: WebGL and WebGPU/WebNN.
Textures are uploaded to the GPU in both cases.
The last shader takes 3 inputs: the original image, the blurred
image, and the computed segmentation map.
Description of the perf issues, in particular CPU usage and GC.
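A minimal CPU-side sketch of what the last shader computes from its three inputs. It assumes one float per pixel and a segmentation mask in [0, 1] where 1 means foreground; the function name and data layout are illustrative and not taken from the prototype.

```typescript
// CPU-side sketch of the final compositing step (the prototype does this in a shader).
// Assumption: each input holds one float per pixel, and mask values lie in [0, 1],
// with 1 meaning "foreground" and 0 meaning "background".
function compositeBlur(
  original: Float32Array,
  blurred: Float32Array,
  mask: Float32Array,
): Float32Array {
  if (original.length !== blurred.length || original.length !== mask.length) {
    throw new Error("inputs must have the same length");
  }
  const out = new Float32Array(original.length);
  for (let i = 0; i < out.length; i++) {
    // Clamp the mask so out-of-range model outputs cannot over- or under-shoot.
    const m = Math.min(1, Math.max(0, mask[i]));
    // Keep the original where the mask says foreground, the blurred image elsewhere.
    out[i] = m * original[i] + (1 - m) * blurred[i];
  }
  return out;
}
```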
Bernard: is there a copy on the output at offscreencanvas
level?
Ningxin: not sure.
Tim: is the perf acceptable? or do we need to make massive
improvements?
ningxin: we need to measure battery impact
dom: we are doing this prototype to evaluate what HW
acceleration can bring us, and to identify potential roadblocks
when trying to do video processing on media capture,
for instance color conversion or pixel format.
youenn: looking at 20% CPU on GC - can that be fixed by
implementations, or is it an architectural issue with having
lots of objects created per frame?
… on native, there is usually a buffer pool to help with that
… does that need to be surfaced to the JS, or can that be dealt
with solely by the UA?
ningxin: GPUBuffers are created beforehand. Some objects are
created for every frame, like textures.
There are ways to avoid many object allocations
at JS level. UA optimisations might also help.
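The buffer pool idea mentioned for native implementations can be sketched at JS level as follows. This is an illustration only: `FrameBufferPool` is a hypothetical helper, not a Web API, and the frame size is an arbitrary assumption.

```typescript
// Minimal buffer pool sketch: reuse fixed-size byte buffers across frames
// instead of allocating a new one per frame (hypothetical helper, not a Web API).
class FrameBufferPool {
  private free: Uint8Array[] = [];
  private byteLength: number;
  allocations = 0; // counts real allocations, for illustration

  constructor(byteLength: number) {
    this.byteLength = byteLength;
  }

  // Hand out a recycled buffer when one is available, allocate otherwise.
  acquire(): Uint8Array {
    const buf = this.free.pop();
    if (buf) return buf;
    this.allocations++;
    return new Uint8Array(this.byteLength);
  }

  // Return a buffer to the pool so the next frame can reuse it.
  release(buf: Uint8Array): void {
    if (buf.byteLength !== this.byteLength) {
      throw new Error("buffer does not belong to this pool");
    }
    this.free.push(buf);
  }
}

// Process 100 frames sequentially: only the first frame triggers an allocation.
const pool = new FrameBufferPool(640 * 480 * 4);
for (let frame = 0; frame < 100; frame++) {
  const buf = pool.acquire();
  buf[0] = frame; // stand-in for per-frame work
  pool.release(buf);
}
```

Whether such pooling belongs in application code or inside the UA is the open question raised in the discussion; the sketch only shows the mechanism.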
dom: what are the next steps for this project?
ningxin: 1. enable WebGPU backend.
2. new APIs that allow importing frames as GPU textures and see
whether that will improve efficiency.
3. Improve VideoFrame GC PR: we will try out when it is merged
in Chrome.
youenn: re CPU efficiency - this is moving between main thread
and worker thread, that may have a small perf impact
… doing everything in the worker might be helpful once that's
possible
[27]WebRTC Extensions [28]🎞︎
[27] https://github.com/w3c/webrtc-extensions
[28] https://www.youtube.com/watch?v=qSlXLqouxCs#t=1760
[29][Slide 19]
[29]
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=19
Issue [30]#95 [31]🎞︎
[30] https://github.com/w3c/webrtc-extensions/issues/95
[31] https://www.youtube.com/watch?v=qSlXLqouxCs#t=1780
[32]Media Capabilities issue 185
[32] https://github.com/w3c/media-capabilities/issues/185
[33][Slide 20]
[33]
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=20
[34][Slide 21]
[34]
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=21
[35][Slide 22]
[35]
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=22
[36][Slide 23]
[36]
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=23
[37][Slide 24]
[37]
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=24
Bernard: question to WG is: is it a goal for MC to deprecate
getCapabilities?
youenn: my understanding is that media capabilities is really
about audio/video capabilities
… so it doesn't make sense to expose e.g. CN there
… they should stay in WebRTC getCapabilities
… getCapabilities() being sync is problematic; that's less of
an issue for software capabilities such as CN
… so deprecating getCapabilities fully is not a goal, but
partially, yes
Florent: +1 on the approach; usability of the resulting split
is a concern
chris: seems fine to use that split; do we want to return rtc
codec info from media capabilities?
… if so, please take a look at [38]https://github.com/w3c/media-capabilities/issues/185
[38] https://github.com/w3c/media-capabilities/issues/185
youenn: +1 on disambiguating the outcome of this situation
… listing all codecs in just one call is a non goal
… an SFU is typically only interested in a few codecs
… for P2P, setCodecPreferences is probably not needed in the
first place - you can deal with a generic codec negotiation
jib: would be good to clarify if we want to deprecate "real"
codecs from getCapabilities? this sounds like a good long term
goal for me
harald: I worry that RTX/RED/FEC info needs to be available
somewhere
… getCapabilities has known problems and would be the only way
to get it
… changing getCapabilities is actually harder than deprecating it
… in the long run, it's best to deprecate getCapabilities and
replace it with a better dedicated API
Florent: two different scenarios for setCodecPreferences:
talking with an SFU in which case you can make specific codec
queries; in a P2P scenario, if you can't enumerate all the
codecs, you won't be able to call setCodecPreferences
… this would require hardcoding a list of codecs
… is there a way to make getCapabilities evolve in a shape that
would satisfy everyone?
… getCapabilities+setCodecPreferences has a lot of current
usage, will be hard to deprecate
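A sketch of the getCapabilities+setCodecPreferences pattern under discussion, written over plain objects so that it runs outside a browser. In a page, the list would come from `RTCRtpSender.getCapabilities("video").codecs` and the result would be passed to `RTCRtpTransceiver.setCodecPreferences()`. The sample list, the helper names, and the set of mime types treated as RTX/RED/FEC are assumptions for illustration.

```typescript
// Shape mirrors the entries of RTCRtpSender.getCapabilities("video").codecs.
interface CodecCapability {
  mimeType: string;
  clockRate: number;
  sdpFmtpLine?: string;
}

// Assumption: these mime types identify the RTX/RED/FEC entries raised above.
const RESILIENCY = new Set(["video/rtx", "video/red", "video/ulpfec"]);

// Split "real" codecs from resiliency mechanisms, as a partial deprecation would.
function splitCapabilities(codecs: CodecCapability[]) {
  const isResiliency = (c: CodecCapability) => RESILIENCY.has(c.mimeType.toLowerCase());
  return {
    media: codecs.filter((c) => !isResiliency(c)),
    resiliency: codecs.filter(isResiliency),
  };
}

// Put the preferred codec first; every other entry keeps its relative order,
// so RTX/RED/FEC remain in the list handed to setCodecPreferences().
function preferCodec(codecs: CodecCapability[], mimeType: string): CodecCapability[] {
  const wanted = mimeType.toLowerCase();
  const first = codecs.filter((c) => c.mimeType.toLowerCase() === wanted);
  const rest = codecs.filter((c) => c.mimeType.toLowerCase() !== wanted);
  return [...first, ...rest];
}

// Hard-coded stand-in for a browser's capability list.
const sampleCodecs: CodecCapability[] = [
  { mimeType: "video/VP8", clockRate: 90000 },
  { mimeType: "video/rtx", clockRate: 90000 },
  { mimeType: "video/H264", clockRate: 90000, sdpFmtpLine: "profile-level-id=42e01f" },
  { mimeType: "video/red", clockRate: 90000 },
];
const ordered = preferCodec(sampleCodecs, "video/h264");
```

The P2P concern is visible here: `preferCodec` only works because the full list can be enumerated first. Without enumeration, the application would have to hardcode `sampleCodecs` itself.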
Issue [39]#100 [40]🎞︎
[39] https://github.com/w3c/webrtc-extensions/issues/100
[40] https://www.youtube.com/watch?v=qSlXLqouxCs#t=2747
[41][Slide 25]
[41]
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=25
[42][Slide 27]
[42]
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=27
youenn: might be fine, but I worry about the defaults? would
they be the same across browsers?
… there are current codecs that are defaults, but that may need
to evolve over time
… this could create Web compat issues
Sergio: some of the codecs are receive-only
… the list would be based on common sense, but I don't have a
strong opinion
youenn: my worry is about P2P - if the defaults aren't same
across UAs, the negotiation will fail
sergio: my suggestion was to use defaults in the offer, and
adapt the answer based on the offer
harald: two interfaces needed: the list of codecs currently
willing to offer, the set of codecs you can offer
… the 1st one might be getCapabilities, the proposal on the
slide for the 2nd
… in terms of interop, MTI codecs should be the safety net, and
they should be in the mandatory-to-offer
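The interop worry and the MTI safety net can be illustrated with a simplified model of offer/answer that compares codec names only. Real negotiation also matches clock rates and fmtp parameters. The helper names and the example default lists are assumptions; the MTI video codecs (VP8 and H.264) are those required of browsers by RFC 7742.

```typescript
// Mandatory-to-implement video codecs for WebRTC browsers (RFC 7742).
const MTI_VIDEO_CODECS = ["VP8", "H264"];

// Answerer side: keep the offerer's order, drop what the answerer cannot handle.
function answerCodecs(offered: string[], supportedByAnswerer: string[]): string[] {
  const supported = new Set(supportedByAnswerer);
  return offered.filter((c) => supported.has(c));
}

// Offerer side: append any missing MTI codec so the offer always has a fallback.
function withMtiSafetyNet(defaults: string[]): string[] {
  const missing = MTI_VIDEO_CODECS.filter((c) => !defaults.includes(c));
  return [...defaults, ...missing];
}

// Hypothetical defaults: UA A offers only AV1, UA B supports VP9, VP8 and H264.
const offerDefaults = ["AV1"];
const answererSupport = ["VP9", "VP8", "H264"];

// Without the safety net there is no common codec and negotiation fails.
const withoutNet = answerCodecs(offerDefaults, answererSupport);
// With MTI codecs always offered, the answer falls back to VP8 and H264.
const withNet = answerCodecs(withMtiSafetyNet(offerDefaults), answererSupport);
```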
florent: the proposal seems to have a lot of overlap with
setCodecPreferences / getParameters - could we improve these
instead of coming up with a new API?
[Philipp supports this on the chat]
Sergio: would be fine; I started from the rtp header
extensions, maybe that should apply there?
florent: the difference is that there is already an API to set
codec preferences
sergio: but header extensions could be added there too?
Bernard: let's continue the discussion in the issue
… or work on a matching PR
[43]https://github.com/w3c/mediacapture-extensions/issues/47 Voice
Isolation Constraint [44]🎞︎
[43] https://github.com/w3c/mediacapture-extensions/issues/47
[44] https://www.youtube.com/watch?v=qSlXLqouxCs#t=3316
[45][Slide 41]
[45]
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=41
[46][Slide 42]
[46]
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=42
Resolution for issue 95: mark issue as ready for PR
[47][Slide 43]
[47]
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=43
[48][Slide 44]
[48]
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=44
[49][Slide 45]
[49]
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=45
youenn: it makes sense; reasonable to ignore `noiseSuppression`
… there is also `echoCancellation` in the audio pipeline
… does it make sense to do `echoCancellation` when this is set?
harald: I think it's mostly orthogonal
youenn: so `echoCancellation: false` is compatible with
`voiceIsolation: true`
… it may be challenging for some implementations to support
these combinations
jan-ivar: I like this too; what should the default be? that may
bring concerns
harald: we can discuss this in the PR
… conservatively, the default should be the current behavior
(false)
dom: instead of boolean, we could use strings for extra
flexibility.
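A sketch of how the constraint interactions discussed above might resolve, modelled over plain objects. `voiceIsolation` follows the proposal in issue #47. The resolution logic, the helper name, and the defaults chosen for `noiseSuppression` and `echoCancellation` are assumptions, not agreed behavior.

```typescript
// Plain-object model of the audio constraints discussed above.
interface AudioProcessingRequest {
  voiceIsolation?: boolean;
  noiseSuppression?: boolean;
  echoCancellation?: boolean;
}

interface AudioProcessingSettings {
  voiceIsolation: boolean;
  noiseSuppression: boolean;
  echoCancellation: boolean;
}

function resolveAudioProcessing(req: AudioProcessingRequest): AudioProcessingSettings {
  // Conservative default suggested in the discussion: current behavior (false).
  const voiceIsolation = req.voiceIsolation ?? false;
  return {
    voiceIsolation,
    // Assumption: with voice isolation on, a separate noiseSuppression request
    // is ignored, modelled here as the setting reporting false.
    noiseSuppression: voiceIsolation ? false : (req.noiseSuppression ?? true),
    // Treated as orthogonal: echo cancellation is decided independently.
    echoCancellation: req.echoCancellation ?? true,
  };
}

// The combination called out in the discussion: isolation on, echo cancellation off.
const isolated = resolveAudioProcessing({
  voiceIsolation: true,
  noiseSuppression: true,
  echoCancellation: false,
});
```

In a page, a request of this shape would be passed as the `audio` member of the `getUserMedia()` constraints. A string-valued variant, as suggested for extra flexibility, would replace the boolean with an enumeration.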
RESOLUTION: mark voiceIsolation [50]issue #47 as ready for PR
[50] https://github.com/w3c/mediacapture-extensions/issues/47
[51]support for contentHint in Capture Handle [52]🎞︎
[51] https://github.com/w3c/mediacapture-handle/issues/35
[52] https://www.youtube.com/watch?v=qSlXLqouxCs#t=3867
[53][Slide 48]
[53]
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=48
[54][Slide 49]
[54]
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=49
[55][Slide 50]
[55]
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=50
[56][Slide 51]
[56]
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=51
youenn: setting the track hint is unnecessary - if the capturer
is setting the hint on its side, the UA knows that the track
being captured is text - there is no need to transmit it to the
capturer
… except maybe if WebCodecs is in the picture
… having the UA taking care of this seems preferable
elad: the suggestion would be that the captured content
self-declare its type and the UA uses it?
… but that removes the liberty of the capturer to decide
whether to use the hint or not
… which could be based on e.g. an allowlist
… autodetection by the UA would have its own limitation
bernard: re the WebCodecs case - contentHint is not
automatically consumed by WebCodecs, it's up to the app to
apply it as codec setting
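A sketch of the application-side step described here: reading a content hint and turning it into an encoder-side choice. The hint values are those of `MediaStreamTrack.contentHint` for video, and the target values are those of the WebRTC degradation preference. The mapping itself and the function name are assumptions about what an application might plausibly do.

```typescript
// Video values of MediaStreamTrack.contentHint ("" means no hint).
type VideoContentHint = "" | "motion" | "detail" | "text";
// Values of the WebRTC degradation preference.
type DegradationPreference = "maintain-framerate" | "maintain-resolution" | "balanced";

// Assumed mapping: favor smoothness for motion, sharpness for detail and text.
function degradationPreferenceFor(hint: VideoContentHint): DegradationPreference {
  switch (hint) {
    case "motion":
      return "maintain-framerate";
    case "detail":
    case "text":
      return "maintain-resolution";
    default:
      return "balanced";
  }
}
```

With WebCodecs, nothing applies such a mapping automatically, which is why the application would need to receive the hint in the first place.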
jib: I agree with youenn that the UA is in a good place to
short-circuit the capturer part
… the proposal could be useful for the capturee side
… exposing further metadata to the controller might be an
interesting addition to my capturecontroller proposal
youenn: it could be exposed at the videoframe level
jib: I see agreement on the need, not yet on the API shape
[57]WebRTC Extensions [58]🎞︎
[57] https://github.com/w3c/webrtc-extensions
[58] https://www.youtube.com/watch?v=qSlXLqouxCs#t=4571
[59]Avoid user-confusion by avoiding offering undesired audio sources
[60]🎞︎
[59] https://github.com/w3c/mediacapture-screen-share/issues/219
[60] https://www.youtube.com/watch?v=qSlXLqouxCs#t=4583
[61][Slide 54]
[61]
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=54
[62][Slide 55]
[62]
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=55
[63][Slide 56]
[63]
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=56
Tim: is this only applicable for echo management?
elad: it could be that an application is interested in
recording a specific tab, no more than that.
Tim: this use case does not seem to be addressed: identifying
the desired tab would be needed.
Elad: some VC applications usually do not want to capture
system audio.
Jan Ivar: supportive, how about reusing displaySurface
constraint here?
Elad: Might work for me.
Jan Ivar: I would like to remove monitor from here.
dom: if we do not include monitor here, audio: true might
capture system audio. But applications would not be able to
explicitly ask for system audio.
dom: displaySurface would be a strange name for audio.
youenn: let's enumerate the different approaches:
avoidSystemAudio, displaySurface, sources
youenn: scope is unclear, we need to clarify this before going
to PR.
youenn: different properties allow feature detection on
what kind of recording the UA can do
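A sketch of the feature detection mentioned above, covering the three candidate approaches. None of these shapes was agreed on, so every constraint name and value below is hypothetical. In a page, the dictionary would come from `navigator.mediaDevices.getSupportedConstraints()`.

```typescript
// Stand-in for the dictionary returned by getSupportedConstraints().
type SupportedConstraints = Record<string, boolean | undefined>;

// Pick audio options that avoid system audio, depending on what the UA exposes.
// All three names (avoidSystemAudio, sources, displaySurface for audio) are the
// candidate shapes from the discussion, not shipped API.
function audioOptionsAvoidingSystemAudio(
  supported: SupportedConstraints,
): Record<string, unknown> | false {
  if (supported["avoidSystemAudio"]) return { avoidSystemAudio: true };
  if (supported["sources"]) return { sources: ["browser", "window"] };
  if (supported["displaySurface"]) return { displaySurface: ["browser", "window"] };
  // Nothing detectable: fall back to not requesting audio at all.
  return false;
}
```

The fallback shows why scope matters: an application that cannot detect any of the options has no way to ask for tab audio while excluding system audio.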
elad: my focus is only limiting access to system audio, but I
also think flexibility is helpful
timp: back to my echocancellation point - the constraint could
be linked to whether the source can be echocancelled
Harald: source being echo cancellable is a second concern.
Biggest point is avoiding system audio.
Tim: as well as window audio.
harald: echoCancellation is a secondary concern - capturing
system audio could disclose info from a 3rd party
[64]Region Capture [65]🎞︎
[64] https://github.com/w3c/mediacapture-region
[65] https://www.youtube.com/watch?v=qSlXLqouxCs#t=5905
RESOLUTION: continue discussions on GitHub.
[66][Slide 59]
[66]
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=59
youenn: [67]#11 is an issue on the shape of the CropTarget API
… given current chrome implementation work, feels it's useful
to converge on the API shape
[67] https://github.com/w3c/mediacapture-region/issues/11
[68][Slide 60]
[68]
https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf#page=60
youenn: do we want to attach the API to element or to
MediaDevices?
… element feels like a better path
jib: +1
elad: I prefer mediaDevices given its linkage to screen capture
youenn: cropTarget is linked to MediaStreamTrack, not
mediaDevices
… and it's really tied to an element
elad: it can be used through an object you get from
getDisplayMedia
youenn: but with a detached mediaDevices, you can't reject the
promise
dom: prefer element option.
youenn: next question is attribute vs method
… slight pref for attribute, but no strong feeling
elad: there is a cost to minting a crop target - we mark the
element in the rendering pipeline in specific ways that we
shouldn't abuse
youenn: I thought you were going to use a lazy approach to
reduce that cost
elad: lazy tagging might help, but this needs more thinking
jib: +1 to attribute
… developer value trumps implementer value
elad: I don't think it matters much to developers in the first
place
harald: disagree with messing with the element interface, and
on hiding the fact that the operation has a cost
… also async (promises) may be needed for some implementations
… let's not hide the reality of the situation
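The two API shapes being debated can be contrasted in a small model that runs outside a browser. `FakeElement`, `FakeCropTarget`, `cropTargetOf` and `produceCropTargetAsync` are hypothetical stand-ins, and the lazy minting is only one assumed way an implementation could limit the cost.

```typescript
// Stand-ins so the sketch runs outside a browser; both classes are hypothetical.
class FakeElement {}
class FakeCropTarget {
  readonly id: number;
  constructor(id: number) {
    this.id = id;
  }
}

let minted = 0; // how many targets were actually created
const targets = new WeakMap<FakeElement, FakeCropTarget>();

// Shared minting step: done at most once per element ("lazy tagging").
function mint(el: FakeElement): FakeCropTarget {
  let target = targets.get(el);
  if (!target) {
    target = new FakeCropTarget(++minted);
    targets.set(el, target);
  }
  return target;
}

// Shape A: synchronous, attribute-like accessor; the cost is hidden behind a lazy lookup.
function cropTargetOf(el: FakeElement): FakeCropTarget {
  return mint(el);
}

// Shape B: asynchronous method; the promise makes a potential cost explicit.
async function produceCropTargetAsync(el: FakeElement): Promise<FakeCropTarget> {
  return mint(el);
}
```

Both shapes hand back the same reference for the same element; the disagreement is about whether callers should see that obtaining it may be expensive or asynchronous.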
jib: the cost seems to be Chrome-specific
… the real goal of this API is a transferable reference
youenn: +1
… other APIs in the past have re-used the element interface,
have made similar decisions on methods / attributes, async vs
sync
… we should follow existing implemented patterns
dom: is there any other API that may use this transferable
reference?
youenn: that's something I bring up in the issue
elad: this may create unsafe usage for this well-defined target
jan-ivar: this could be evaluated
hta: but this shouldn't block progress on the specific narrow
goal we have
youenn: my focus is aligning with current API patterns for this
API
elad: the TAG will chime in; but if they don't give a clear
specific suggestion
… we could move with the current design that can be polyfilled
Received on Monday, 2 May 2022 13:19:28 UTC