- From: Dominique Hazael-Massieux <dom@w3.org>
- Date: Fri, 26 Jan 2024 15:15:21 +0100
- To: public-webrtc@w3.org
Hi,
Based on Henrik's notes of the meeting last week [1], I generated the
usual W3C style minutes of the Jan 16 meeting at:
https://www.w3.org/2024/01/16-webrtc-minutes.html
Dom
1. https://lists.w3.org/Archives/Public/public-webrtc/2024Jan/0044.html
WebRTC January 2024 Meeting
16 January 2024
[2]Agenda.
[2] https://www.w3.org/2011/04/webrtc/wiki/January_16_2024
Attendees
Present
Bernard, Fippo, Florent, Guido, Harald, Henrik,
Jan-Ivar, Tony, Youenn
Regrets
Dom
Chair
Bernard, Harald, Jan-Ivar
Scribe
Henrik, scribe
Contents
1. [3]WG Document Status
1. [4]WEBRTC-PC:
2. [5]Mediacapture-Streams
3. [6]MST-ContentHint
4. [7]WebRTC-SVC
5. [8]Encoded Transform
6. [9]MediaCapture Transform
2. [10]BLOCKING ISSUES
1. [11]setCodecPreferences vs unidirectional codecs
2. [12]WebRTC spec should explicitly specify all causes
of a PC sourced track being muted
3. [13]General approach to capabilities negotiation
4. [14]Align exposing scalabilityMode with WebRTC
“hardware capabilities” check
5. [15]How does generator.mute change track stats?
6. [16]Is RTCEncodedVideoFrameMetadata.frame_id actually
an unsigned long long or does it wrap at 16 bits?
7. [17]Mark resizeMode, sampleRate and latency as feature
at risk
8. [18]Highly detailed text in video content
9. [19]Comments and request from APA review
3. [20]WebRTC-Extensions: API to control encode complexity
4. [21]Summary of resolutions
Meeting minutes
Slideset: [22]https://lists.w3.org/Archives/Public/www-archive/
2024Jan/att-0002/WEBRTCWG-2024-01-16.pdf
[22]
https://lists.w3.org/Archives/Public/www-archive/2024Jan/att-0002/WEBRTCWG-2024-01-16.pdf
WG Document Status
Bernard: What’s going on with these specs and how can we make
progress? Issues not advancing are a red flag.
WEBRTC-PC:
[23][Slide 11]
[23]
https://lists.w3.org/Archives/Public/www-archive/2024Jan/att-0002/WEBRTCWG-2024-01-16.pdf#page=11
Bernard: How often should we recycle this?
Harald: We have extension specs too that should go from
candidate recommendation to recommendation. It would make sense
to recycle about once per year.
[24][Slide 12]
[24]
https://lists.w3.org/Archives/Public/www-archive/2024Jan/att-0002/WEBRTCWG-2024-01-16.pdf#page=12
Bernard: 50 open issues, 20 open > 1 year. We’re not in great
shape for recycling each year.
Mediacapture-Streams
[25][Slide 13]
[25]
https://lists.w3.org/Archives/Public/www-archive/2024Jan/att-0002/WEBRTCWG-2024-01-16.pdf#page=13
[26][Slide 14]
[26]
https://lists.w3.org/Archives/Public/www-archive/2024Jan/att-0002/WEBRTCWG-2024-01-16.pdf#page=14
Bernard: This is unusual in that it is widely implemented but
only candidate recommendation. 31 open issues, 9 open > 1 year.
It doesn’t look like we’re on the road to proposed
recommendation.
[27][Slide 15]
[27]
https://lists.w3.org/Archives/Public/www-archive/2024Jan/att-0002/WEBRTCWG-2024-01-16.pdf#page=15
Bernard: Are there issues with the WPT tests? Any surprising
failures?
Jan-Ivar: There are some issues with testing device
infrastructure.
Harald: Are the transferable track errors a sign of the feature
not being implemented, or a difficulty with testability?
Jan-Ivar/Youenn: We have not implemented this yet.
Youenn: Maybe we should better organize the specs; the
transferable track is an extension.
RESOLUTION: We should move mediacapture extension tests.
MST-ContentHint
[28][Slide 16]
[28]
https://lists.w3.org/Archives/Public/www-archive/2024Jan/att-0002/WEBRTCWG-2024-01-16.pdf#page=16
[29][Slide 17]
[29]
https://lists.w3.org/Archives/Public/www-archive/2024Jan/att-0002/WEBRTCWG-2024-01-16.pdf#page=17
Bernard: This being a working draft seems to be in sync with
implementations; we should push to advance. It’s not a huge list
of issues to push to CR.
WebRTC-SVC
[30][Slide 18]
[30]
https://lists.w3.org/Archives/Public/www-archive/2024Jan/att-0002/WEBRTCWG-2024-01-16.pdf#page=18
[31][Slide 20]
[31]
https://lists.w3.org/Archives/Public/www-archive/2024Jan/att-0002/WEBRTCWG-2024-01-16.pdf#page=20
Bernard: This is also a working draft but it has been implemented
in Chromium. Safari Tech Preview indicates support but it’s not
passing the WPT tests. Is there a Sonoma dependency?
Youenn: I need to look at these tests.
Encoded Transform
[32][Slide 21]
[32]
https://lists.w3.org/Archives/Public/www-archive/2024Jan/att-0002/WEBRTCWG-2024-01-16.pdf#page=21
[33][Slide 22]
[33]
https://lists.w3.org/Archives/Public/www-archive/2024Jan/att-0002/WEBRTCWG-2024-01-16.pdf#page=22
Bernard: This is a working group draft but the test results are
odd in that you have some passes for Firefox and Safari but very
little on Edge and Chrome. This is a little worrisome: is this an
issue with the spec or with the implementations?
Youenn: Chrome is passing 23/27 in the tentative folder; the
tests in the top folder follow the spec. Firefox/Safari implement
ScriptTransform but not SFrameTransform, so that could be a
feature at risk.
Harald: This is a spec problem, we don’t have agreement on a
couple of key features of the spec, so the tests in tentative
reflect the state of implementation before we come to
agreement.
Bernard: So this one seems to have some legitimate issues
keeping it back. But they won’t go away if we don’t talk about
them.
MediaCapture Transform
[34][Slide 23]
[34]
https://lists.w3.org/Archives/Public/www-archive/2024Jan/att-0002/WEBRTCWG-2024-01-16.pdf#page=23
[35][Slide 24]
[35]
https://lists.w3.org/Archives/Public/www-archive/2024Jan/att-0002/WEBRTCWG-2024-01-16.pdf#page=24
[36][Slide 25]
[36]
https://lists.w3.org/Archives/Public/www-archive/2024Jan/att-0002/WEBRTCWG-2024-01-16.pdf#page=25
Bernard: Not much has been happening since October 2022.
Chromium and Safari Tech Preview implement it. 18 open issues,
17 open > 1 year.
Youenn: It’s partially implemented, but as with ScriptTransform,
Chromium implements a previous version, so this could be a
similar status. There are two API versions in different browsers.
Bernard: But the functionality is the same; the difference is
only in the API shape?
Youenn: Yes.
Harald: The key issue is availability on main thread, and we
have a problem with transferability. Transferring
MediaStreamTrack is not implemented by any browser.
Bernard: So the media stream transform relates to the
implementability of this spec.
Guido: It’s similar, but not quite the same as encoded transform,
in the sense that Chromium is proposing mostly a superset of what
the current spec says, which is basically availability on window
and support for audio. There is one small difference in API
shape, which is MediaStreamTrackGenerator; it’s very similar to
the generator that we have in the older version, and we could
very easily make a version that is compatible with the spec. But
the main thing blocking is the transferability of the track,
which nobody has implemented.
Youenn: We have a prototype that will probably be available in
Safari Tech Preview in the coming months. Just video, not
audio. It’s not enabled by default or complete yet.
Bernard: In summary, good news: widely implemented specs, but
the specs are lagging behind implementations. It doesn’t seem
like a huge task. But then there are the transforms where there
are real spec issues. So in the next couple of meetings we
should try to make progress on these blocking issues.
BLOCKING ISSUES
[37]setCodecPreferences vs unidirectional codecs
[37] https://github.com/w3c/webrtc-pc/issues/2888
[38][Slide 30]
[38]
https://lists.w3.org/Archives/Public/www-archive/2024Jan/att-0002/WEBRTCWG-2024-01-16.pdf#page=30
Fippo: setCodecPreferences does not directly affect the send
codec, but webrtc-pc looks at both send and recv codecs. We could
either…
… Fix webrtc-pc by removing mentions of send codecs.
… Clarify the codec match algorithm.
… Do we agree that we should remove the send codec?
[Thumbs up from several people.]
Harald: I’m doing a deep dive in a follow-up issue and would
like to get input.
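The direction Fippo describes, validating a codec preference against what the implementation can receive, without consulting send capabilities, can be sketched as a pure function. This is illustrative only: the interface mirrors the shape of RTCRtpCodecCapability, but matchesReceiveCodec is a hypothetical helper, not the webrtc-pc algorithm.

```typescript
// Hypothetical helper: a codec preference matches if it corresponds to a
// codec the implementation can *receive*; send capabilities are ignored,
// per the direction the WG leaned at this meeting. Sketch only.
interface CodecCapability {
  mimeType: string;
  clockRate: number;
  sdpFmtpLine?: string;
}

function matchesReceiveCodec(
  pref: CodecCapability,
  recvCapabilities: CodecCapability[]
): boolean {
  return recvCapabilities.some(
    (c) =>
      c.mimeType.toLowerCase() === pref.mimeType.toLowerCase() &&
      c.clockRate === pref.clockRate &&
      (c.sdpFmtpLine ?? "") === (pref.sdpFmtpLine ?? "")
  );
}
```

In a browser, an application would build its preference list from `RTCRtpReceiver.getCapabilities("video").codecs` and pass it to `transceiver.setCodecPreferences(...)`.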
[39]WebRTC spec should explicitly specify all causes of a PC sourced
track being muted
[39] https://github.com/w3c/webrtc-pc/issues/2915
Jan-Ivar: We’ve discussed mute in a lot of specs, and there still
seems to be some doubt about what mute means in webrtc-pc
(remote tracks). Part of the problem is that mediacapture-main
has two jobs: defining what MediaStreamTrack is, and defining
device capture. In short, the mute event is controlled by the
user agent and driven by the source. But for WebRTC, the source
is an incoming track.
… WebRTC-PC defines its own set of mute/unmute steps, but there
is a lack of clarity about whether what mediacapture-main says
about muting, which is more specific to camera and microphone,
still applies here.
… The way I read the spec, the definition in WebRTC-PC is the
full description, replacing mediacapture-main’s.
Harald: I think there are situations where it would be natural to
mute that are not listed, for example if the network is
disconnected, or the state of the peer connection goes to
disconnected. It would seem reasonable to mute the tracks.
Youenn: We should get consensus on when mute should happen. I
would try to get consensus on why we mute and we should list
that.
Jan-Ivar: My proposal is that webrtc-pc list all the reasons, as
we did for constraints.
Henrik: I think we should separate the question of where we
define mute reasons, from the question of if all mute reasons
are listed. I agree with Harald that we should mute if the pc
disconnects, but I think webrtc-pc should say this, not
mediacapture-main.
Youenn: I agree and we can add reasons in a PR.
Jan-Ivar: We should focus on what has already been implemented.
Harald: I think we have consensus that any case where we agree
browsers should mute, like the BYE, should be in the spec. I
don’t think we have consensus on whether it is up to the user
agent to mute at other times.
Jan-Ivar: Mute and unmute happening in different specs could
overlap and cause races.
Youenn: Maybe WebRTC-PC can remain open-ended, but other specs
should not be; hopefully nobody is implementing mute for canvas
capture, and it would be good if the spec said so. Then we could
follow up on that discussion.
Jan-Ivar: I think the reason for mediacapture-main’s mute being
open-ended was for privacy reasons which may not apply to other
specs. Hopefully other specs don’t need open-endedness so that
we can get implementations to converge.
Youenn: It would be good if we could get a list of reasons why
Chromium might mute.
[40]General approach to capabilities negotiation
[40] https://github.com/w3c/media-capabilities/issues/176
[41][Slide 32]
[41]
https://lists.w3.org/Archives/Public/www-archive/2024Jan/att-0002/WEBRTCWG-2024-01-16.pdf#page=32
Bernard: MediaCapabilities indicates “supported”,
“powerEfficient” and “smooth”.
… PING did a review in March 2021. They liked the fingerprinting
analysis but questioned why we expose device capabilities for the
purpose of negotiation, as opposed to having the user agent
negotiate based on capabilities and pick the one it likes best.
The problem is that this does not work with the RTC media
negotiation model; it sounds more like a streaming use case
model. No progress for years, and PING wants progress.
[42][Slide 33]
[42]
https://lists.w3.org/Archives/Public/www-archive/2024Jan/att-0002/WEBRTCWG-2024-01-16.pdf#page=33
[Bernard is doing PR [43]#212 (see slide) and wants reviews.]
[43] https://github.com/w3c/media-capabilities/issues/212
[44]Align exposing scalabilityMode with WebRTC “hardware
capabilities” check
[44] https://github.com/w3c/webrtc-svc/issues/92
[45][Slide 34]
[45]
https://lists.w3.org/Archives/Public/www-archive/2024Jan/att-0002/WEBRTCWG-2024-01-16.pdf#page=34
[46][Slide 35]
[46]
https://lists.w3.org/Archives/Public/www-archive/2024Jan/att-0002/WEBRTCWG-2024-01-16.pdf#page=35
Bernard: PING also did a review of scalabilityMode and said it
would expose additional fingerprinting surface. But it’s not
enabled in MediaCapabilities.
… Through trial and error you could set scalabilityMode to check
which modes are or are not supported, but it does not tell you
whether it is hardware or software. Maybe you can figure it out
via performance in getStats.
… The bottom line is that in webrtc-svc you only get a subset of
what is exposed in MediaCapabilities. We also don’t want to add a
hardware check for MC since it can be used for streaming use
cases.
Henrik: scalabilityMode is a subset of MC, I don’t understand,
did PING say MC is OK or is it that they haven’t had time to
object to MC yet? These issues are entangled so I think we need
to be consistent.
Florent: We need them to understand the way RTC works on the
Internet. How about we invite them and explain the situation?
Jan-Ivar: What’s unique with SDP is that it is exposed to
JavaScript, so there is no way not to expose this. But if
MediaCapabilities were not exposed, you could still do a
suboptimal call, so we need to figure out if that is tenable. We
could determine the minimum set of codecs that need to be
exposed, and if those are the same across browsers then it
wouldn’t say much.
Harald: I don’t think we should waste time discussing such
redesign, at least not on this basis. Our current webrtc-pc is
what it is.
Bernard: Codecs tend to come in waves, so really the only thing
you’re learning is whether they have a new device or not; it’s
not a huge privacy risk.
Youenn: We don’t have the same analysis; we think it is a real
issue. As older devices diminish it will become a very important
fingerprinting surface.
Bernard: I will continue to work on the privacy analysis.
[47]How does generator.mute change track stats?
[47] https://github.com/w3c/mediacapture-transform/issues/81
[48][Slide 36]
[48]
https://lists.w3.org/Archives/Public/www-archive/2024Jan/att-0002/WEBRTCWG-2024-01-16.pdf#page=36
Bernard: What happens when you mute with the generator
attribute? One option is that you fire the event; another is to
queue a task to fire mute on all of the clones.
Proposal: let’s go with the second option.
RESOLUTION: Let’s go with second option
[49]Is RTCEncodedVideoFrameMetadata.frame_id actually an unsigned
long long or does it wrap at 16 bits?
[49] https://github.com/w3c/webrtc-encoded-transform/issues/220
[50][Slide 37]
[50]
https://lists.w3.org/Archives/Public/www-archive/2024Jan/att-0002/WEBRTCWG-2024-01-16.pdf#page=37
Tony: In Chromium this is implemented from the dependency
descriptor, which is 16-bit, but it is unwrapped into a 64-bit
unsigned on the receiver side.
… Proposing to keep unsigned long long. frameId is a
monotonically increasing frame counter and its lower 16 bits will
match the frame_number of the DD header extension.
Bernard: Do we care about dependency chains? There could be
circumstances where the dependency is fulfilled. [...] I
support what this slide is saying, we can talk about chains in
a separate issue.
RESOLUTION: Consensus to move forward with Tony’s proposal
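The unwrapping Tony describes, from the 16-bit frame_number in the dependency descriptor to a monotonically increasing 64-bit frameId, can be sketched as below. Names are illustrative and real implementations may handle reordering windows differently; bigint is used because unsigned long long exceeds Number’s safe integer range.

```typescript
// Sketch: unwrap a 16-bit wrapping counter into a 64-bit-style frameId.
// Picks the candidate closest to the last observed value, handling both
// forward wrap-around and slightly reordered (late) frames.
function makeUnwrapper(): (frameNumber: number) => bigint {
  const WRAP = 1n << 16n;
  let last: bigint | null = null;
  return (frameNumber: number): bigint => {
    const n = BigInt(frameNumber & 0xffff);
    if (last === null) {
      last = n;
      return last;
    }
    // Place n in the same 2^16 cycle as `last`, then shift by one cycle
    // if that makes it closer to `last`.
    let candidate = (last & ~(WRAP - 1n)) | n;
    if (candidate + WRAP / 2n < last) candidate += WRAP;      // wrapped forward
    else if (candidate > last + WRAP / 2n) candidate -= WRAP; // reordered back
    last = candidate > last ? candidate : last;
    return candidate;
  };
}
```

Note that the lower 16 bits of the unwrapped value still match the on-the-wire frame_number, as the slide states.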
[51]Mark resizeMode, sampleRate and latency as feature at risk
[51] https://github.com/w3c/mediacapture-main/issues/958
Jan-Ivar: Some constraints only have one implementation, so the
proposal is to mark them as feature at risk.
Guido: I object because resizeMode is widely used by people who
use Chromium; some users have even requested additional resize
modes. Chromium implements the other three to varying degrees;
latency is used particularly on Windows for users to select
capturing with the lowest possible capture sizes. So even if we
eventually remove them from the spec, we would not be able to
remove them from the web, because that would break the web.
sampleSize is implemented and exposed by Chromium but I’m not
aware of any use case.
Henrik: I think sampleRate relates to another issue where people
today use SDP munging to change the codec sample rate, and a
possible outcome of that was that it should use the track’s
sample rate, but I’m not sure about the status of that.
Youenn: Maybe we can move these to mediacapture-extensions
instead of marking them as features at risk? Or maybe both. But
eventually we may remove them from mediacapture-main.
Guido: I think it makes sense for sampleRate, sampleSize and
latency but for resizeMode I think it is important.
Jan-Ivar: Is Chromium’s default to automatically downscale?
Guido: Yes.
Jan-Ivar: We’re also planning to make this the default, so I’m
curious what the remaining use case is for developers to turn
this off.
Guido: Some people want to make sure they get a native
resolution.
RESOLUTION: Move sampleSize, sampleRate and latency to the
extension spec. And then work harder on resizeMode.
[52]Highly detailed text in video content
[52] https://github.com/w3c/mst-content-hint/issues/35
[53][Slide 40]
[53]
https://lists.w3.org/Archives/Public/www-archive/2024Jan/att-0002/WEBRTCWG-2024-01-16.pdf#page=40
Harald: We have some text saying that if you use contentHint
“text” you activate some flags in AV1. The original PR was
intended to acknowledge that in some scripts, details matter more
than in others (see examples). If you downscale those fonts they
would become unreadable earlier than ASCII would.
… Bernard also noted that red text on a yellow background will
work worse than black and white if 4:2:0 coding is used, and
recommends 4:4:4. We may not want to mandate that due to the
extra overhead, but we could…
… 1. Reword the addition to note that encoding of colored text
may cause readability issues.
… 2. Recommend 4:4:4 if colored text dominates when contentHint
“text” is used.
… 3. Mandate use of 4:4:4 for this case.
Youenn: I wouldn’t go with mandating; it is a contentHint, so I’m
all fine with saying “hey user agents, please advise…”, but in
terms of mandating I’m not sure we will have wording that is
always right, so I think that’s too far. So 3 is out.
Bernard: 3 is also out for me; there’s a lot of extra bandwidth,
and is this even supported? Anyway, it’s certainly not prevalent,
so mandating seems too much. Even recommending seems pretty
strong for something like this; it’s almost like saying that
someone who implements AV1 must implement it.
Jan-Ivar: I would also gravitate towards the lower-numbered
proposals. In addition there could be an API that is not
hint-based, for example constraints that specify this explicitly.
I’m reluctant to add new functionality that only acts on a hint
on the track; perhaps there should be a corresponding API on the
sink instead. That would rule out 2 and 3.
Fippo: We do have 4:4:4 support for H264, but I wouldn’t
recommend it too strongly; people can negotiate codecs all they
want. I’d go for 1.
RESOLUTION: Consensus on proposal 1 (note, not recommend).
[54]Comments and request from APA review
[54] https://github.com/w3c/mst-content-hint/issues/55
[55][Slide 41]
[55]
https://lists.w3.org/Archives/Public/www-archive/2024Jan/att-0002/WEBRTCWG-2024-01-16.pdf#page=41
Harald: The APA has reviewed contentHint. See the slides for
issues that we may or may not need to address.
… We don’t do links between things at a higher level. I suggest
saying that in the track model, a track is a track is a track.
Things that link tracks together need to be specified at a higher
level. We should not have regions on videos. I think in general
we should reject.
Bernard: I think in general this is not a problem for the MST
contentHint spec, but there are things in the media capture
working group worth discussing. There may be some regulation
that applies to some of this.
Harald: But these things can be addressed at a higher layer; for
example, I just turned on CC, so it is possible to have a
separate track with subtitles.
Bernard: I’m just worried that APA gets ignored like PING,
maybe we should have a joint meeting.
Harald: We could do this at TPAC if we get the right people
into the same room.
[Harald to draft a reply.]
[56]WebRTC-Extensions: API to control encode complexity
[56] https://github.com/w3c/webrtc-extensions/issues/191
[57][Slide 45]
[57]
https://lists.w3.org/Archives/Public/www-archive/2024Jan/att-0002/WEBRTCWG-2024-01-16.pdf#page=45
Florent: We want to be able to optimize the tradeoff between
device resource usage and compression efficiency for different
use cases, affecting CPU, video bitrate and quality.
… We looked at similar APIs, in Android Media it’s a 0-9
integer, on Azure Media Services it’s “speed, balanced,
quality”, in x264 (an H264 library) there is a wide range of
presets from ultrafast to veryslow.
… The actual results could vary depending on the codec or
specific encoder used and are not meant to be fixed by the
specification. But we expect that, on average, encode time and QP
are affected as per the slide, depending on a low, normal or high
complexity mode.
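A per-stream assignment of the proposed knob might look like the sketch below. The field name encodeComplexity and its value set are placeholders drawn from the discussion, not a shipped API; the object shape loosely mirrors the encodings array from RTCRtpSender parameters.

```typescript
// Hypothetical extension field on per-encoding parameters. Placeholder
// name and values: nothing here is standardized.
type EncodeComplexity = "low" | "normal" | "high";

interface EncodingParams {
  rid?: string;
  encodeComplexity?: EncodeComplexity; // hypothetical field
}

// Mirror Florent's example: give the presentation stream more encode
// effort than a thumbnail, per stream rather than per connection.
function assignComplexity(
  encodings: EncodingParams[],
  byRid: Record<string, EncodeComplexity>
): EncodingParams[] {
  return encodings.map((e) =>
    e.rid !== undefined && byRid[e.rid] !== undefined
      ? { ...e, encodeComplexity: byRid[e.rid] }
      : e
  );
}
```

In a browser this would feed into `sender.setParameters(params)` after `getParameters()`, assuming a field like this were standardized.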
[58][Slide 46]
[58]
https://lists.w3.org/Archives/Public/www-archive/2024Jan/att-0002/WEBRTCWG-2024-01-16.pdf#page=46
Youenn: Sometimes compressing more is better for battery; how
does an app decide what is a good decision, and do web
applications already have this information or is it not
available? Have you considered specifying whether you prefer
battery or quality as an alternative API shape?
Florent: There are different options that could be discussed. I
think it is important to have a per stream setting, for example
the presentation is more important than the face.
Youenn: I wonder how the page could know which setting to use.
Florent: It’s more about knowing what the stream is used for.
Driven by use case. For example thumbnail is less important.
But the app could also monitor encode time in getStats and use
that to decide.
Jan-Ivar: I was going to say that we’re going from the user
agent, which has a lot of information, to the web application,
which has less information, so this could make things worse; but
you make a good case that the app can know that one stream is
more important than another. So I just have a bikeshed question
on the naming. But I’m also concerned that a web app might just
ask for high quality across the board.
Florent: I’m not opposed to changing the names. This is more
about saying whether a stream is more important or less
important; it is mostly about CPU time allocated to encoding.
It’s very common in other APIs.
Jan-Ivar: Would medium or middle mean that the user agent
decides?
Florent: Yes, the user agent decides, but you can also have the
web application tell the user agent to…
Jan-Ivar: It might be better to have the default be unset.
Fippo: Should this also exist for audio? It sounds a lot like a
setting that exists in Opus where there is a value between 1
and 10.
Florent: I’m not opposed, but it would be per browser and codec
dependent, so it’s more like a hint to the browser. But there
is nothing preventing us from doing this for audio as well.
Youenn: The user agent is still doing degradation and
adaptation, so this sounds more like a priority between streams
rather than CPU or QP.
Florent: But if you use less time that would affect the QP.
Henrik: I don’t think this is just about priority between
streams - it’s that too, but I think even for a single stream
you could have one use case where you only care about bitrate
but another use case where it’s all about quality. Right?
Florent: Yes.
Bernard: What about upper or lower bounds? Is there a limit? Can
it affect the jitter buffer?
Florent: It’s still WebRTC deciding; it’s up to the user agent to
keep the impact minimal.
Harald: The control knob should be specified on encode, not
priority, we already have priority APIs. I don’t much care
about the name but it needs to be specific to encoding.
Bernard: Do we have consensus we want to go ahead with this?
Youenn: I think it’s worth exploring.
[Fippo gives thumbs up.]
[Florent will come up with a PR so we can iterate]
Summary of resolutions
1. [59]We should move mediacapture extension tests.
2. [60]Let’s go with second option
3. [61]Consensus to move forward with Tony’s proposal
4. [62]Move sampleSize, sampleRate and latency to the
extension spec. And then work harder on resizeMode.
5. [63]Consensus on proposal 1 (note, not recommend).
Received on Friday, 26 January 2024 14:15:26 UTC