- From: Dominique Hazael-Massieux <dom@w3.org>
- Date: Mon, 30 Mar 2020 18:50:55 +0200
- To: "public-webrtc@w3.org" <public-webrtc@w3.org>
Hi,

The minutes of our call earlier today are available at:
https://www.w3.org/2020/03/30-webrtc-minutes.html
and copied as text below.

Dom

WebRTC Virtual Interim
30 March 2020

[2]IRC log.
  [2] https://www.w3.org/2020/03/30-webrtc-irc

Attendees

Present
  Bernard, DomHM, Florent, Harald, Henrik, Jan-Ivar, Jianjun,
  SamDallstream, TimPanton, Youenn
Regrets
  -
Chair
  Bernard, Harald, Jan-Ivar
Scribe
  dom

Contents

 1. [3]WebRTC
     1. [4]WebRTC Features at risk
     2. [5]ISSUE-2495 When is negotiation complete?
     3. [6]ISSUE 2502 When are effects of in-parallel stuff surfaced?
 2. [7]Media Capture and Streams
     1. [8]Issue 671 new audio acquisition
     2. [9]ISSUE 639 Enforcing user gesture for gUM
 3. [10]Next meeting
 4. [11]Summary of resolutions

Meeting minutes

WebRTC

WebRTC Features at risk

Bernard: a few unimplemented features are not yet marked at risk
… 3 issues have been filed related to that
… the first one is Issue 2496 - the voiceActivityFlag exposed in SSRC, not implemented anywhere
… any disagreement with marking it at risk?

Henrik: SGTM

Bernard: we have one unimplemented MTI per issue 2497, partialFramesLost
… should we remove it from the MTI list?

Jan-Ivar: no objection to unmarking that one; will we get implementations for the other ones?

Henrik: they need to be moved from one dictionary to the other - they've been implemented, they just need to be moved into a different object

JIB: it's not clear to us yet how easy it will be to implement in Firefox; pointers to upstream webrtc.org hooks would help

Resolution: remove MTI marker on partialFramesLost

Bernard: the last one is multiple DTLS certificates, not implemented anywhere

HTA: the goal was to help support signed certificates, which is completely unspecified

Dom: so if we remove support for it, the idea would be to say the spec only uses the first certificate in the list

TimP: wasn't the background of this support for multiple kinds of certificates?
Bernard: with full support for DTLS 1.2, that's no longer relevant

Bernard: I'm hearing consensus on all of these

ISSUE-2495 When is negotiation complete?

JIB: this emerged while writing tests for WPT, but is applicable beyond testing
… "Perfect negotiation" is the pattern we recommend in the spec that helps abstract away the negotiation from the rest of the application logic
… having a negotiationended event would help avoid glare and simplify the logic
… the obvious approach to detecting the end of negotiation is racy
… there are workarounds, action-specific spin-tests (while loops)
… but that's bad, leading to timeouts
… I've also tried another workaround by dispatching my own negotiated event at the exact right time
… this is slightly better, but we can still miss cases
… can we do better? I have 3 proposals
… Proposal A: fire a negotiationcomplete event from SRD(answer) if renegotiation isn't needed
… one downside is that subsequent actions may delay the event if further negotiation is needed in some edge cases
… Proposal B is a boolean attribute for negotiationneeded - needs careful implementation in relation to the negotiationneeded event
… it's also delayed by subsequent actions
… Proposal C: an attribute exposing a promise for negotiationcomplete
… it's better because it's not delayed by subsequent actions (by replacing promises as new negotiations get started)

Henrik: compared to proposal A?

JIB: imagine you call addTransceiver-1 & addTransceiver-2; you have to wait until addTransceiver-2 before the event fires (which you don't in proposal C)

Henrik: you can build your own logic if you care about partial negotiations - what you want to know in general is "am I done or not?"

HTA: I question the question - why should I care if negotiation is complete?
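[Editor's sketch] Proposal C's promise-replacement idea can be simulated in plain JavaScript. Everything here is illustrative: NegotiationTracker, startNegotiation, and finishNegotiation are invented stand-ins for whatever attribute the spec might eventually expose, not a shipped or specified API.

```javascript
// Minimal sketch of Proposal C, under assumed names: an object exposes a
// promise for the current negotiation, and that promise is REPLACED (not
// resolved) whenever a new negotiation starts.
class NegotiationTracker {
  constructor() {
    this._reset();
  }
  _reset() {
    this.negotiationComplete = new Promise((resolve) => {
      this._resolve = resolve;
    });
  }
  startNegotiation() {
    // A new negotiation replaces the promise, so code holding the old one
    // can tell it is no longer on the same "negotiation train".
    this._reset();
  }
  finishNegotiation(state) {
    this._resolve(state);
  }
}

const pc = new NegotiationTracker();
const before = pc.negotiationComplete;
pc.startNegotiation(); // e.g. an addTransceiver call kicks off renegotiation
const after = pc.negotiationComplete;
console.log(before === after); // → false: a different negotiation is pending
pc.finishNegotiation("stable");
after.then((state) => console.log(state)); // → "stable"
```

Comparing promise identities is what would make this pattern immune to the "delayed by subsequent actions" problem noted for proposals A and B.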
… what you have here is indeed a problem, but what the app cares about is whether the transceiver is connected to a live channel or not
… you don't have this problem with datachannels, since you have an onopen event
… if we want to solve this at all (I would prefer not adding any API at this point), I think we should look at a signal on transceiver availability

Bernard: don't you get that from our existing states, e.g. via the transports?

Harald: we have it with currentDirection, but without an event, it has to be polled

JIB: I think apps do need to know whether the transceiver is ready or not, and having that done with a timeout is not great

HTA: what I'm saying is that what matters is the readiness of the transceiver, not the state of the negotiation
… if we want to add anything here, it should be a directionchange event on the transceiver

TimP: it could be done with proposal C if it indicates "what" is complete (i.e. which transceiver is ready)
… otherwise, I agree you want to know what it is you got

JIB: you would get that via a JS closure

Henrik: I think this is a "nice-to-have" - useful for testing & debugging; but I think it's a problem that can be solved with the existing API

JIB: I don't think this can be polyfilled, given that negotiationneeded is now queued
… negotiationneeded can be queued behind other operations in the PC

Henrik: you can detect this for each of your negotiated states by observing which changes are actually reflected (with different logic for each type of negotiation)
… this would be nicer, but I don't think it's needed

JIB: you mentioned setStreams - it cannot be observed locally
… another advantage of the promise is that it lets you determine if you're still on the same "negotiation train" by comparing promises

Youenn: it would be interesting to see if libraries built on top of PC are implementing that pattern
… this might be a good way to determine its appeal

Henrik: it would be great for debugging for sure, esp. in the age of perfect negotiation

Youenn: so let's wait to see apps adopting perfect negotiation before committing to this

Conclusion: keep for post-1.0 (in webrtc-extensions?)

ISSUE 2502 When are effects of in-parallel stuff surfaced?

Henrik: the signaling/transceiver states defined in JSEP and in the API can't be the same except at the cost of racy behavior
… which means the requirements imposed by JSEP on these states create ill-defined / inconsistent behaviors
… Proposals to address this:
… Proposal A: we make addTrack dependent only on WebRTC states, not JSEP states
… this is probably what the spec says, but not what implementations do
… Proposal B: we make addTrack depend on a "JSEP transceiver", but this would be racy and create implementation-specific behaviors

JIB: I agree there is a race in JSEP
… JSEP was written without thinking about threads at all
… the problem is not really about whether we're in a JS thread or not
… we have to make copies of things

Henrik: my mental model is that WebRTC JS shallow objects refer to JSEP objects
… the only problem is with addTrack, because of recycling of transceivers

JIB: the hygienic thing would be to copy state off from JSEP when looking at transceivers. Is that proposal A?

Henrik: it's implicit in proposal A

JIB: the only problem with that, per your example on slide 17, is that this would leave a hole, e.g.
in the context of perfect negotiation

Henrik: I think that's a better alternative than starting to meddle with internal JSEP objects
… the hole here is that if you're unlucky, you need another round of negotiation
… and in that situation, you would be in a racy scenario in the first place

HTA: the code on slide 17 is not compatible with perfect negotiation

Henrik: I think proposal A is the only sane approach

HTA: this sounds ready for a pull request

JIB: I think the spec is currently racy given "JSEP in parallel", so it's more than an informative change

Resolution: getTransceivers() SHALL NOT be racy

Media Capture and Streams

Issue 671 new audio acquisition

Sam: Sam Dallstream, engineer at Microsoft
… this is a feature request / issue on the spec
… as the spec stands today, it is hard to differentiate streams meant for speech recognition vs communication
… the current implementations are geared towards communication, which is sometimes at odds with the needs of speech recognition
… e.g. in comms, adding noise can be good, but it hurts speech recognition
… slides 22 and 23 show the differences in needs between the two usages, extracted from ETSI specs
… the first proposal to address this would be a new constraint (e.g. "category") that allows specifying "default", "raw", "communication", "speechRecognition"
… it translates well to existing platforms: Windows, iOS, Android have similar categories
… the problem is that it competes with content-hint in a confusing way - content-hint is centered around default constraints AND provides hints to consumers of streams
… whereas this one is setting optimizations on the stream itself (e.g. levels of echo canceling)
… A second proposal is to modify the constraints to make them a bit more specific, and add a new hint to content-hint
… the advantage is that it fits the current content-hint draft, with more developer freedom
… but it may be hard to implement, though
… Would like to hear if there is consensus on the need, and get a sense of the direction to fulfill it

Henrik: for clarification, for echoCancellation, it's not about turning it off, it's tweaking it for speech recognition

Sam: right - right now echoCancellation is a boolean (on or off)

HTA: but then how does it fit with the existing model?

Sam: I meant it's easier for API consumers, but you're right that it conflicts with other constraints

Bernard: this is not about device selection here

JIB: indeed, most of this is done in software land in any case

Henrik: right, here it's more about disabling/enabling features

JIB: what's the use case that can't be done by gUM-ing & turning off echoCancellation, autoGainControl, noiseSuppression?

Bernard: it's not on & off

TimP: e.g. in speech interactions, you don't want the voice AI to hear itself

Sam: Alexa right now turns off everything and then adds their own optimizations for speech recognition
… so this can already be done, but the idea is to allow built-in optimizations so that not everyone has to do their own thing

Youenn: do systems provide multiple echo cancellers?
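[Editor's sketch] The two shapes of audio constraints being contrasted above can be written out concretely. The "category" constraint and its values come from the proposal under discussion and are hypothetical, not part of any shipped spec; this sketch only builds the constraint objects (the actual getUserMedia call is browser-only).

```javascript
// Today's approach: disable each built-in processing feature explicitly,
// then do your own post-processing (e.g. in WebAudio/WASM).
const rawConstraints = {
  audio: {
    echoCancellation: false,
    autoGainControl: false,
    noiseSuppression: false,
  },
};

// First proposal (hypothetical): a single constraint naming the intended
// use, letting the platform pick tuned built-in processing.
function categoryConstraints(category) {
  const allowed = ["default", "raw", "communication", "speechRecognition"];
  if (!allowed.includes(category)) {
    throw new Error(`unknown category: ${category}`);
  }
  return { audio: { category } };
}

console.log(categoryConstraints("speechRecognition").audio.category);
// → "speechRecognition"
```

In a real page, either object would then be passed to navigator.mediaDevices.getUserMedia(...).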
… I don't think you can do that on iOS

Sam: that's why the second proposal isn't as straightforward

Henrik: the advantage of these categories is that they are vague enough that implementations can adjust depending on what the underlying platforms provide
… but then it's not clear exactly what the hint does

HTA: I would expect a lot of variability across platforms in any case

Henrik: as is the case for echoCancellation: true

HTA: indeed (as the multiple changes of the implementation in Chrome show)

Henrik: it sounds like it is hard enough to describe, and implementation-specific enough, that it should be a hint

JIB: I think it's fair to say that the audio constraints have been targeted at the communications use case
… not sure how much commitment there is for the purpose of speech recognition

Sam: right

Henrik: with interop in mind, echoCancellation: true worked because everyone did their best job at solving it, not by doing the same thing
… to get that done with this new category, we would need the same level of commitment and interest from browser vendors
… the alternative is turning everything off and doing post-processing in WebAudio/WASM

TimP: another category beyond comms and speech-rec here is broadcast
… it shouldn't be a two-state switch

JIB: is there anything here that couldn't be solved with WebAudio / AudioWorklets?

Sam: I would need to take another look at that one

HTA: you would still need a "raw" mode

Youenn: maybe also look at existing open source implementations of ambient noise suppression and whether they share some rough parameters

Sam: it sounds like we're leaning towards the 2nd proposal

Dom: maybe first also determine what can be done in user land already with Web Audio / Web Assembly
… if this is already doable there, then maybe we should gain experience with libraries first

HTA: given we already have a collection of hints in content-hint that have been found useful, it's kind of easy to add it there

Bernard: would this apply up to gUM?
HTA: yes, that's already how it works

JIB: if we're thinking of adding a new hint, we may need new constraints specific to speech recognition

[discussion around feature detection for content-hints]

ISSUE 639 Enforcing user gesture for gUM

Youenn: powerful APIs are nowadays bound to user gesture
… if we were designing gUM today, it would be as well
… but that's not Web compatible to change now
… can we create the conditions to push Web apps to migrate to that model?
… PR 666 proposes to require a user gesture to grant access without a prompt
… I've looked at a few Web sites; whereby.com works with the restrictions on
… it wouldn't work in Hangouts or Meet
… Interested in feedback on the approach, and in availability to help with outreach to WebRTC app developers

Youenn: the end goal would be that calling gUM without a user gesture should be rejected
… user gesture is currently an implementation-dependent heuristic - this is being worked on

Henrik: I think we would need it to be better defined
… it is also linked to 'user-chooses'

Youenn: the situation is very similar to getDisplayMedia, where Safari applies the user gesture restriction
… it could be the same with gUM

JIB: I like the direction of this; we could describe it as a privacy & security issue
… with feature-policy, there is a privacy escalation problem through navigation
… jsfiddle allowed all feature policies, so from my site I could have navigated to my jsfiddle, got privileged there, before navigating back with an iframe
… so that sounds like an important fix
… the prompting fallback sounds interesting
… denying on page load might be harder to reach
… it's not clear that same-origin navigation should be blocked

Youenn: the user gesture definition is still a heuristic; these could fit into that implementation freedom

HTA: how much legitimate usage would we break?
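[Editor's sketch] The migration pattern PR 666 pushes toward - requesting the microphone from inside a user gesture rather than on page load - might look like the following. requestMicOnGesture and the fake objects are illustrative; getUserMedia is injected as a parameter only so the pattern can be shown outside a browser.

```javascript
// Request the microphone only from inside a click handler, never on load.
function requestMicOnGesture(button, getUserMedia) {
  return new Promise((resolve, reject) => {
    button.addEventListener("click", () => {
      // Called synchronously within the gesture, so a browser enforcing
      // the proposed restriction could grant without an extra prompt.
      getUserMedia({ audio: true }).then(resolve, reject);
    });
  });
}

// Simulated usage (in a real page: pass a DOM button and
// navigator.mediaDevices.getUserMedia.bind(navigator.mediaDevices)).
const clickHandlers = [];
const fakeButton = { addEventListener: (_type, fn) => clickHandlers.push(fn) };
const fakeGetUserMedia = () => Promise.resolve("fake-audio-stream");

requestMicOnGesture(fakeButton, fakeGetUserMedia)
  .then((stream) => console.log("granted:", stream));
clickHandlers[0](); // simulate the user's click
```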
… before progressing this, we should have a deployed browser with a counter to detect calls with/without a user gesture

Youenn: Webex and Hangouts call it on page load, so that would make the counter very high

HTA: so will someone get data?

Youenn: I don't think Safari can do this; I would be happy if someone can
… I can reach out to top Web site developers

HTA: would anyone at Mozilla be interested in collecting this data?

JIB: based on our user gesture algorithm? I'll look, but can't quite commit resources to this at the moment

Conclusion: more info needed

Next meeting

HTA: probably in April / May

Summary of resolutions

 1. [12]remove MTI marker on partialFramesLost
 2. [13]getTransceivers() SHALL NOT be racy
Received on Monday, 30 March 2020 16:51:01 UTC