- From: Dominique Hazael-Massieux <dom@w3.org>
- Date: Mon, 30 Mar 2020 18:50:55 +0200
- To: "public-webrtc@w3.org" <public-webrtc@w3.org>
Hi,
The minutes of our call earlier today are available at:
https://www.w3.org/2020/03/30-webrtc-minutes.html
and copied as text below.
Dom
WebRTC Virtual Interim
30 March 2020
[2]IRC log.
[2] https://www.w3.org/2020/03/30-webrtc-irc
Attendees
Present
Bernard, DomHM, Florent, Harald, Henrik, Jan-Ivar,
Jianjun, SamDallstream, TimPanton, Youenn
Regrets
-
Chair
Bernard, Harald, Jan-Ivar
Scribe
dom
Contents
1. [3]WebRTC
   1. [4]WebRTC Features at risk
   2. [5]ISSUE-2495 When is negotiation complete?
   3. [6]ISSUE 2502 When are effects of in-parallel stuff
      surfaced?
2. [7]Media Capture and Streams
   1. [8]Issue 671 new audio acquisition
   2. [9]ISSUE 639 Enforcing user gesture for gUM
3. [10]Next meeting
4. [11]Summary of resolutions
Meeting minutes
WebRTC
WebRTC Features at risk
Bernard: a few unimplemented features not yet marked at risk
… 3 issues filed related to that
… first one is Issue 2496 - the voiceActivityFlag exposed in
SSRC, not implemented anywhere
… any disagreement to marking it at risk?
Henrik: SGTM
Bernard: we have one unimplemented MTI per issue 2497,
partialFramesLost
… should we remove it from the MTI list?
Jan-Ivar: no objection to unmark that one; will we get
implementations for the other ones?
Henrik: they need to be moved from one dictionary to the other
- they've been implemented, they just need to be moved into a
different object
JIB: it's not clear to us yet how easy it will be to implement
in Firefox; pointers to upstream webrtc.org hooks would help
Resolution: remove MTI marker on partialFramesLost
Bernard: last one is multiple DTLS certificates, not
implemented anywhere
HTA: the goal was to help support signed certificates, which is
completely unspecified
Dom: so if we remove support for it, the idea would be to say
the spec only uses the first certificate in the list
TimP: wasn't the background of this support for multiple kinds
of certificates?
Bernard: with full support for DTLS 1.2, that's no longer
relevant
Bernard: I'm hearing consensus on all of these
ISSUE-2495 When is negotiation complete?
JIB: this emerged while writing tests for WPT, but is
applicable beyond testing
… "Perfect negotiation" is the pattern we recommend in the spec
that helps abstract away the negotiation from the rest of the
application logic
… having a negotiationended event would help avoid glare,
simplify the logic
… the obvious approach to detect the end of negotiation is racy
… there are workarounds: action-specific spin-tests (while
loops)
… but that's bad, leading to timeouts
… I've also tried another workaround by dispatching my own
negotiated event at the exact right time
… this is slightly better, but we can still miss cases
… can we do better? I have 3 proposals
… fire a negotiationcomplete from SRD(answer) if renegotiation
isn't needed
… one downside is that subsequent actions may delay the event
if further negotiation is needed in some edge cases
… Proposal B is a boolean attribute for negotiationneeded -
needs careful implementation in relation to the
negotiationneeded event
… it's also delayed by subsequent actions
… Proposal C: an attribute exposing a promise for
negotiationcomplete
… it's better because it's not delayed by subsequent actions
(by replacing promises as new negotiations get started)
Henrik: compared to proposal A?
JIB: imagine you call addTransceiver-1 & addTransceiver-2, you
have to wait until addTransceiver-2 before the event fires
(which you don't in proposal C)
Henrik: you can build your own logic if you care about partial
negotiations - what you want to know in general is "am I done
or not"?
HTA: I question the question - why should I care if negotiation
is complete?
… what you have here is indeed a problem, but what the app
cares about is whether the transceiver is connected to a live
channel or not
… you don't have this problem with datachannels since you have
an onopen event
… if we want to solve this at all (I would prefer not adding
any API at this point), I think we should look at a signal on
the transceiver availability
Bernard: don't you get that from our existing states, e.g. via
the transports?
Harald: we have it with currentDirection, but without an event,
it has to be polled
JIB: I think apps do need to know whether the transceiver is
ready or not, and having that done with a timeout is not great
HTA: what I'm saying is what matters is the readiness of the
transceiver, not the state of the negotiation
… if we want to add anything here, it should be a
directionchange event to the transceiver
TimP: it could be done with proposal C which indicates "what"
is complete (i.e. which transceiver is ready)
… otherwise, I agree you want to know what it is you got
JIB: you would get that via JS closure
Henrik: I think this is a "nice-to-have" - useful for testing &
debugging; but I think it's a problem that can be solved with
the existing API
JIB: I don't think this can be polyfilled, given that
negotiationneeded is now queued
… negotiationneeded can be queued behind other operations in
the PC
Henrik: you can detect this for each of your negotiated states
by observing which changes are actually reflected (with
different logic for each type of negotiation)
… this would be nicer, but I don't think it's needed
JIB: you mentioned setStreams - it cannot be observed locally
… another advantage of the promise is that it lets you
determine if you're still on the same "negotiation train" by
comparing promises
Youenn: it would be interesting to see if libraries built on
top of PC are implementing that pattern
… this might be a good way to determine its appeal
Henrik: it would be great for debugging for sure, esp in the
age of perfect negotiation
Youenn: so let's wait to see what apps adopting perfect
negotiation do before committing to this
Conclusion: keep for post 1.0 (in webrtc-extension?)
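A minimal sketch of how the promise attribute of Proposal C might behave, assuming a round in flight keeps its promise and each new round gets a fresh one. The exact semantics were not settled in the discussion; NegotiationTracker, begin(), end() and sameTrain() are hypothetical names, not part of any RTCPeerConnection API.

```typescript
// Hypothetical model of Proposal C. Nothing here is a real RTCPeerConnection
// API; NegotiationTracker, begin() and end() are invented for illustration.
class NegotiationTracker {
  private resolveCurrent: (() => void) | null = null;
  private current: Promise<void> = Promise.resolve();

  // Call when a round of negotiation starts (for example from a
  // negotiationneeded handler). A round already in flight keeps its promise.
  begin(): void {
    if (this.resolveCurrent !== null) return;
    this.current = new Promise<void>((resolve) => {
      this.resolveCurrent = () => resolve();
    });
  }

  // Call when the round ends (for example after SRD(answer) when no further
  // negotiation is needed). Resolves the promise handed out for that round.
  end(): void {
    const resolve = this.resolveCurrent;
    this.resolveCurrent = null;
    if (resolve !== null) resolve();
  }

  // The attribute from Proposal C: a promise for the current round.
  get negotiationComplete(): Promise<void> {
    return this.current;
  }
}

// True if two observations belong to the same "negotiation train".
function sameTrain(a: Promise<void>, b: Promise<void>): boolean {
  return a === b;
}
```

An application would keep the promise it saw when it made a change and later compare it with the current one. If they differ, a new round has started since.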
ISSUE 2502 When are effects of in-parallel stuff surfaced?
Henrik: the signaling/Transceiver states defined in JSEP and
the API can't be the same except at the cost of racy behavior
… which means the requirements imposed by JSEP on these states
create ill-defined / inconsistent behaviors
… Proposals to address this: Proposal A: we make addTrack
dependent only on WebRTC states, not JSEP states
… this is probably what the spec says, not what implementations
do
… Proposal B: we make addTrack depend on a "JSEP transceiver",
but that would be racy and create implementation-specific
behaviors
JIB: I agree there is a race in JSEP
… JSEP was written without thinking about threads at all
… the problem is not really about whether we're in a JS thread
or not
… we have to make copies of things
Henrik: my mental model is that WebRTC JS shallow objects refer
to JSEP objects
… the only problem is with addTrack because of recycling of
transceivers
JIB: the hygienic thing would be to copy state off from JSEP
when looking at transceivers. Is that proposal A?
Henrik: it's implicit in proposal A
JIB: the only problem with that is with your example on slide
17 - this would leave a hole e.g. in the context of perfect
negotiation
Henrik: I think that's a better alternative than starting to
meddle with internal JSEP objects
… the hole here is that if you're unlucky, you need another
round of negotiation
… and in that situation, you would be in a racy scenario in the
first place
HTA: the code of slide 17 is not compatible with perfect
negotiation
Henrik: I think proposal A is the only sane approach
HTA: this sounds ready for a pull request
JIB: I think the spec is currently racy given "JSEP in
parallel" so it's more than an informative change
Resolution: getTransceivers() SHALL NOT be racy
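A minimal sketch of the "copy state off from JSEP" idea behind Proposal A, assuming script only ever sees a snapshot that is refreshed at well-defined points. SnapshotPeer, applyInParallel() and surface() are hypothetical names used for illustration; this is not spec text or any browser's implementation.

```typescript
// Hypothetical model: internal (JSEP-side) state may change "in parallel",
// while script-visible state only changes when surface() runs.
type SketchDirection = "sendrecv" | "sendonly" | "recvonly" | "inactive";

interface TransceiverState {
  mid: string | null;
  currentDirection: SketchDirection | null;
}

class SnapshotPeer {
  private internal: TransceiverState[] = []; // mutated by in-parallel steps
  private surfaced: TransceiverState[] = []; // what script can observe

  // Stands in for an in-parallel step that touches internal state.
  applyInParallel(update: (state: TransceiverState[]) => void): void {
    update(this.internal);
  }

  // Stands in for the queued task that copies internal state to the surface.
  surface(): void {
    this.surfaced = this.internal.map((s) => ({ ...s }));
  }

  // Never reads internal state directly, so it cannot observe a half-applied
  // in-parallel change.
  getTransceivers(): ReadonlyArray<Readonly<TransceiverState>> {
    return this.surfaced;
  }
}
```

With this split, a caller of getTransceivers() gets the same answer no matter how far an in-parallel step has progressed, until surface() publishes the result.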
Media Capture and Streams
Issue 671 new audio acquisition
Sam: Sam Dallstream, engineer at Microsoft
… this is a feature request / issue on the spec
… as the spec stands today, it is hard to differentiate streams
meant for speech recognition vs communication
… the current implementations are geared towards communication,
which sometimes is at odds with the needs for speech recognition
… e.g. in comms, adding noise can be good, but it hurts with
speech recognition
… slides 22 and 23 show the differences of needs between the
two usages, extracted from ETSI specs
… the first proposal to address this would be a new constraint
(e.g. "category") that allows specifying "default", "raw",
"communication", "speechRecognition"
… it translates well to existing platforms: windows, iOS,
Android have similar categories
… the problem is that it competes with content-hint in a
confusing way - content-hint is centered around default
constraints AND provides hints to consumers of streams
… whereas this one is setting optimization on the stream itself
(e.g. levels of echo canceling)
… A second proposal is to modify the constraints to make them a
bit more specific, and add a new hint to content-hint
… the advantage is that it fits the current content-hint draft,
with more developer freedom
… but it may be hard to implement
… Would like to hear if there is consensus on the need, and get
a sense of the direction to fulfill it
Henrik: for clarification, for echoCancellation, it's not
turning it off, it's tweaking it for speech recognition
Sam: right - right now echoCancellation is a boolean (on or
off)
HTA: but then how does it fit well with the existing
model?
Sam: I meant it's easier for API consumers, but you're right it
conflicts with other constraints
Bernard: this is not about device selection here
JIB: indeed, most of this is done in software land in any case
Henrik: right, here it's more about disabling/enabling features
JIB: what's the use case that can't be done by gUM-ing & turning
off echoCancellation, autoGainControl, noiseSuppression?
Bernard: it's not on & off
TimP: e.g. in speech interactions, you don't want the voice AI
to hear itself
Sam: Alexa right now turns off everything and then adds their
own optimization for speech recognition
… so this can already be done, but the idea is to allow
built-in optimizations so that not everyone has to do their own
thing
Youenn: do systems provide multiple echo cancellers?
… I don't think you can do that in iOS
Sam: that's why the second proposal isn't as straightforward
Henrik: the advantage of these categories is that they are vague
enough that implementations can adjust depending on what the
underlying platforms provide
… but then it's not clear exactly what the hint does
HTA: I would expect a lot of variability across platforms in
any case
Henrik: as is the case for echoCancellation: true
HTA: indeed (as the multiple changes of the impl in Chrome
show)
Henrik: it sounds like it is hard enough to describe and
implementation-specific enough that it should be a hint
JIB: I think it's fair to say that the audio constraints have
been targeted at the communications use case
… not sure how much commitment there is for the purpose of
speech recognition
Sam: right
Henrik: with interop in mind, echoCancellation: true worked
because everyone did their best job at solving it, not because
everyone did the same thing
… to get that done with this new category, we would need the
same level of commitment and interest from browser vendors
… the alternative is turning everything off and doing
post-processing in WebAudio/WASM
TimP: another category beyond comm, speech-rec here is
broadcast
… it shouldn't be a two-state switch
JIB: is there anything here that couldn't be solved with
WebAudio / AudioWorklets?
Sam: I would need to take another look at that one
HTA: you would still need a "raw" mode
Youenn: maybe also look at existing open source implementations
of ambient noise and whether they share some rough parameters
Sam: it sounds like leaning towards 2nd proposal
Dom: maybe first also determine what can be done in user land
already with Web Audio / Web Assembly
… if this is already doable there, then maybe we should gain
experience with libraries first
HTA: given we already have a collection of hints in
content-hint that have been found useful, it's kind of easy to
add it there
Bernard: would this apply up to gUM?
HTA: yes, that's already how it works
JIB: if we're thinking of adding a new hint, we may need new
constraints specific to speech-recognition
[discussion around feature detection for content-hints]
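A minimal sketch of the "turn everything off" approach mentioned above, assuming the application then does its own processing (for example in Web Audio). echoCancellation, autoGainControl and noiseSuppression are the standard constraint names; the helper names and the AudioCaptureDevices interface are hypothetical, introduced so the sketch does not depend on browser typings.

```typescript
// The three standard audio processing constraints, all as plain booleans.
interface AudioProcessingConstraints {
  echoCancellation: boolean;
  autoGainControl: boolean;
  noiseSuppression: boolean;
}

// Hypothetical minimal shape of navigator.mediaDevices for this sketch.
interface AudioCaptureDevices {
  getUserMedia(constraints: {
    audio: AudioProcessingConstraints;
  }): Promise<unknown>;
}

// Constraints asking for audio with the built-in processing disabled.
function unprocessedAudioConstraints(): { audio: AudioProcessingConstraints } {
  return {
    audio: {
      echoCancellation: false,
      autoGainControl: false,
      noiseSuppression: false,
    },
  };
}

// In a browser, pass navigator.mediaDevices as `devices`.
function openUnprocessedMicrophone(
  devices: AudioCaptureDevices
): Promise<unknown> {
  return devices.getUserMedia(unprocessedAudioConstraints());
}
```

A browser may still apply some processing it cannot disable, so the result is best treated as a request rather than a guarantee.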
ISSUE 639 Enforcing user gesture for gUM
Youenn: powerful APIs are nowadays bound to user gesture
… if we were designing gUM today, it would be as well
… but that's not Web compatible to change now
… can we create the conditions to push Web apps to migrate to
that model?
… PR 666 proposes to require user gesture to grant access
without a prompt
… I've looked at a few Web sites; whereby.com works with the
restrictions on
… it wouldn't work in Hangout or Meet
… Interested in feedback on the approach and availability to
help with webrtc app developers outreach
Youenn: the end goal would be that calling gUM without user
gesture should be rejected
… user gesture is currently an implementation-dependent
heuristic - this is being worked on
Henrik: I think we would need it to be better defined
… it is also linked to 'user-chooses'
Youenn: the situation is very similar to getDisplayMedia where
Safari applies the user gesture restriction
… it could be the same with gUM
JIB: I like the direction of this; we could describe it as a
privacy & security issue
… with feature-policy, there is a privacy escalation problem
through navigation
… jsfiddle allowed all feature policies, so from my site I
could have navigated to my jsfiddle, got privileged there
before navigating back with an iframe
… so that sounds like an important fix
… the prompting fallback sounds interesting
… denying on page load might be harder to reach
… it's not clear that same-origin navigation should be blocked
Youenn: user gesture definition is still a heuristic, these
could fit into that implementation freedom
HTA: how much legitimate usage would we break?
… before progressing this, we should have a deployed browser
with a counter to detect with/without user gesture
Youenn: Webex and Hangout call it on pageload, so that would
make the counter very high
HTA: so will someone get data?
Youenn: I don't think Safari can do this; would be happy if
someone can do this
… I can reach out to top Web site developers
HTA: would anyone at Mozilla be interested in collecting this
data?
JIB: based on our user gesture algorithm? I'll look, but can't
quite commit resources to this at the moment
Conclusion: more info needed
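A minimal sketch of the pattern the proposal would push applications towards: calling gUM from a click handler rather than on page load. captureOnClick, ClickTarget and CaptureDevices are hypothetical names; what counts as a user gesture remains an implementation heuristic, as noted in the discussion.

```typescript
// Hypothetical minimal shapes so the sketch runs without browser typings.
interface ClickTarget {
  addEventListener(type: "click", listener: () => void): void;
}

interface CaptureDevices {
  getUserMedia(constraints: {
    audio: boolean;
    video: boolean;
  }): Promise<unknown>;
}

// Defers the gUM call until the user clicks, so the request is made in
// response to a user gesture instead of at page load.
function captureOnClick(
  target: ClickTarget,
  devices: CaptureDevices,
  onStream: (stream: unknown) => void,
  onError: (error: unknown) => void
): void {
  target.addEventListener("click", () => {
    devices.getUserMedia({ audio: true, video: true }).then(onStream, onError);
  });
}
```

In a page, the target would be something like a "Join call" button and the devices object would be navigator.mediaDevices.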
Next meeting
HTA: probably in April / May
Summary of resolutions
1. [12]remove MTI marker on partialFramesLost
2. [13]getTransceivers() SHALL NOT be racy
Received on Monday, 30 March 2020 16:51:01 UTC