- From: Dominique Hazael-Massieux <dom@w3.org>
- Date: Tue, 28 Jun 2022 15:30:45 +0200
- To: Harald Alvestrand <harald@alvestrand.no>, "public-webrtc@w3.org" <public-webrtc@w3.org>
On 28/06/2022 at 15:14, Harald Alvestrand wrote:
> This recording is now on YouTube:
>
> https://youtu.be/c3nah_oMZQs
And the associated minutes are available at:
https://www.w3.org/2022/06/23-webrtc-minutes.html
(copied as text below)
Dom
WebRTC June 2022 VI
23 June 2022
[2]Agenda. [3]IRC log.
[2] https://www.w3.org/2011/04/webrtc/wiki/June_23_2022
[3] https://www.w3.org/2022/06/23-webrtc-irc
Attendees
Present
Bernard, Dom, Eero_Hakkinen, Elad, Florent, Guido,
Harald, Jan-ivar, Patrick_Rockhill, Riju, TimPanton,
TuukkaT, Youenn
Regrets
Carine
Chair
Bernard, HTA, Jan-Ivar
Scribe
dom
Contents
1. [4]WebRTC WG Re-Charter
2. [5]Region Capture Issues
   1. [6]Issue [7]#17: the case for making CropTarget Sync
   2. [8]Issue [9]#17: the case for making CropTarget Async
   3. [10]#17 discussion
   4. [11]Issue [12]#18: Is CropTarget name too generic?
   5. [13]Issue [14]#63: Cropping non-self-capture tracks
   6. [15]Making CropTargets stringifiable
3. [16]Face Detection
4. [17]Summary of resolutions
[7] https://github.com/w3c/mediacapture-region/issues/17
[9] https://github.com/w3c/mediacapture-region/issues/17
[10] https://github.com/w3c/mediacapture-region/issues/17
[12] https://github.com/w3c/mediacapture-region/issues/18
[14] https://github.com/w3c/mediacapture-region/issues/63
Meeting minutes
Recording: [18]https://youtu.be/c3nah_oMZQs
[18] https://youtu.be/c3nah_oMZQs
Slideset: [20]https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf
[20]
https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf
[21]WebRTC WG Re-Charter [22]🎞︎
[21] https://github.com/w3c/webrtc-charter/
[22] https://www.youtube.com/watch?v=c3nah_oMZQs#t=162
[23]Draft updated charter for the WebRTC WG
[23] http://w3c.github.io/webrtc-charter/webrtc-charter.html
[24][Slide 9]
[24]
https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=9
dom: we need a new charter, and to discuss [25]issue #70
(migrating work to another group)
[25] https://github.com/w3c/webrtc-charter/issues/70
youenn: would this require rechartering Media WG?
dom: not sure, possibly
aboba: we have a joint meeting with Media WG at TPAC
dom: but we can't wait until then to come to a conclusion
youenn: migrating items between charters is always challenging
from our side, in administrative aspects
bernard: do we need meetings with the media wg to discuss this?
dom: if we want to migrate the work, yes - but we first need to
decide whether we want to do that
bernard: the fact that we have dependencies on videoframe makes
it an interesting question to consider
elad: how busy is the Media WG? would it increase or decrease
the pace of our work?
harald: that would have to be something to discuss with the
chairs
dom: bringing it to media wg would bring more of a media
perspective, whereas this group comes more with a transmission
perspective
harald: so the chairs will discuss this with the Media WG
chairs
dom: no objection from the group in exploring this?
[none]
[26]Region Capture Issues [27]🎞︎
[26] https://github.com/w3c/mediacapture-region
[27] https://www.youtube.com/watch?v=c3nah_oMZQs#t=853
Issue [28]#17: the case for making CropTarget Sync [29]🎞︎
[28] https://github.com/w3c/mediacapture-region/issues/17
[29] https://www.youtube.com/watch?v=c3nah_oMZQs#t=853
[30][Slide 13]
[30]
https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=13
Jan-Ivar: this is about the cropping API - [31]issue #17 is
about whether it should be sync vs async
… long discussion on the issue - I'll be presenting arguments
why it should be sync
… The TAG design principles include encouragement to use sync
APIs when appropriate, with some exceptions (incl cross-process
communications)
[31] https://github.com/w3c/mediacapture-region/issues/17
[32][Slide 14]
[32]
https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=14
Jan-Ivar: the API currently in the spec (that doesn't have
consensus) is async
… so you have to `await CropTarget.fromElement(element)`
… I'm proposing it doesn't need to be async,
… the purpose of the operation is associating an identifier
with an element
… as currently specified, it can't fail
… the goal of the crop target is to share it over postMessage
across documents
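
As an illustration of the two call shapes under discussion, a minimal sketch follows. `CropTargetStandIn` and `fromElementSync` are hypothetical names introduced here so the snippet runs outside a browser; only the awaited `fromElement` shape comes from the minutes, and the one-token-per-element bookkeeping is an assumption of the sketch, not something the spec is known to require.

```javascript
// Stand-in for the browser's CropTarget (hypothetical, for illustration only).
class CropTargetStandIn {
  // Assumption of this sketch: one token per element, remembered weakly.
  static #tokens = new WeakMap();

  // Hypothetical sync shape: associate an identifier with an element and return it.
  static fromElementSync(element) {
    let token = CropTargetStandIn.#tokens.get(element);
    if (token === undefined) {
      token = new CropTargetStandIn();
      CropTargetStandIn.#tokens.set(element, token);
    }
    return token;
  }

  // Async shape: the same association, but the caller has to wait for a promise.
  static fromElement(element) {
    return Promise.resolve(CropTargetStandIn.fromElementSync(element));
  }
}

const element = { id: "main-content" }; // plain object standing in for a DOM element

// Sync shape: the token is usable on the very next line.
const syncToken = CropTargetStandIn.fromElementSync(element);

// Async shape: the token is only available once the promise settles.
const pendingToken = CropTargetStandIn.fromElement(element);
pendingToken.then((asyncToken) => {
  console.log("same token:", asyncToken === syncToken);
});
```

In this model neither shape can fail; the only difference visible to the caller is the extra pre-emption point introduced by the promise.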
[33][Slide 15]
[33]
https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=15
Jan-Ivar: multiple actions need to happen before we're
cropping anything
… cropTarget can be done ahead of time or later
… if it gets postMessaged to the top-level document, that
document can offer to the user to crop to that target
… it's only at the end of this process that there is a clear
intent to crop
… UA can optimize this by running some of the underlying tasks
early, but that creates risks in case this doesn't go through
… the complexity of that situation shouldn't be exposed to
developers
[34][Slide 16]
[34]
https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=16
Jan-Ivar: [35]#48 is asking to allow failure from the minting
process due to resource exhaustion
… which seems linked to optimizations implemented in Chrome
… The issue is that it allows a random document to exhaust
cropping resources
… since the API is not gated by permissions - creating DOS
risks
… and may expose apps that don't deal well with that failure
… if resource allocation is moved to the cropTo step, this risk
disappears
… similar to mediaSource.getHandle
[35] https://github.com/w3c/mediacapture-region/issues/48
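
The resource-exhaustion argument above can be modelled with a toy sketch. Everything in it (`CropResourcePool`, `mintEager`, `mintDeferred`, `cropToDeferred`, the capacity of 2) is hypothetical and does not describe how Chrome or any other browser allocates cropping resources; it only contrasts allocating at minting time with allocating at the cropTo step.

```javascript
// Hypothetical fixed-size pool of cropping resources.
class CropResourcePool {
  constructor(capacity) {
    this.capacity = capacity;
    this.inUse = 0;
  }

  allocate() {
    if (this.inUse >= this.capacity) {
      throw new Error("cropping resources exhausted");
    }
    this.inUse += 1;
  }
}

// Eager strategy: every minted token takes a resource, so minting can fail.
function mintEager(pool) {
  pool.allocate();
  return { kind: "crop-token" };
}

// Deferred strategy: minting only creates a token and cannot fail.
function mintDeferred() {
  return { kind: "crop-token" };
}

// With the deferred strategy, allocation (and any failure) happens at cropTo time.
function cropToDeferred(pool, token) {
  pool.allocate();
  return token;
}

const eagerPool = new CropResourcePool(2);
let eagerFailures = 0;
for (let i = 0; i < 5; i += 1) {
  try {
    mintEager(eagerPool);
  } catch (error) {
    eagerFailures += 1; // minting itself fails once the pool is drained
  }
}

const deferredPool = new CropResourcePool(2);
const tokens = [];
for (let i = 0; i < 5; i += 1) {
  tokens.push(mintDeferred()); // never touches the pool
}
cropToDeferred(deferredPool, tokens[0]); // only an actual crop consumes a resource

console.log({ eagerFailures, deferredInUse: deferredPool.inUse });
```

In the eager model any document that mints tokens can drain the pool; in the deferred model only a capturer that actually calls the crop step consumes anything.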
[36][Slide 17]
[36]
https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=17
Jan-Ivar: I believe my proposed API is faster, simpler and
still optimizable
… I don't think we need to block on inter-process communication
[37][Slide 18]
[37]
https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=18
jan-ivar: failure of an optimization shouldn't imply a failure of
the operation
… optimizations that influence API design decisions tend to
generate further issues
… because optimizations create new side effects
… there are no developer benefits to this API being async
… and there are general developer costs to async APIs - they
create pre-emption points which risk data races
… multiplying failure points for rare error cases is a footgun
… and async is slower as I've shown
elad: the TAG offers design principles, but also meta
principles - not sure it's productive to discuss other browsers'
implementations
… the fingerprinting risk is reasonable, but the spec doesn't
force surfacing this
… a promise doesn't have to fail in your implementation either
jan-ivar: the API should be designed on principles
… Mozilla is here to push a better API for the Web
Issue [38]#17: the case for making CropTarget Async [39]🎞︎
[38] https://github.com/w3c/mediacapture-region/issues/17
[39] https://www.youtube.com/watch?v=c3nah_oMZQs#t=2085
[40][Slide 20]
[40]
https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=20
Elad: this API is used in production through origin trial
… we know the API works and makes developers and customers
happy
… we've learned a lot of lessons by implementing and shipping
this
[41][Slide 21]
[41]
https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=21
Elad: the question is whether minting a token needs to be sync
or async
… I'll explain the Chrome decision and that it doesn't
negatively impact anyone else
[42][Slide 22]
[42]
https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=22
Elad: our implementation keeps track of which apps have a crop
target
… once it's postMessage'd, this allows Chrome to optimize the
time-to-cropping when the cropTo method is actually invoked
… that makes it simpler and more performant, in particular in
case of CPU congestion in the capturing document
[43][Slide 23]
[43]
https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=23
[44][Slide 24]
[44]
https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=24
Elad: Chrome needs this to be async, whereas Mozilla prefers it
to be sync
… what's the harm of having an async API?
[45][Slide 25]
[45]
https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=25
Elad: Priority of constituencies goes through users,
developers, implementers, spec writers
[46][Slide 26]
[46]
https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=26
Elad: from a user perspective, seems mostly orthogonal
… from a developer perspective - what we've heard is that they
don't care as it doesn't change much in their huge existing
codebase
… implementors - as an implementor, we see this as an
imperative for us
[47][Slide 27]
[47]
https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=27
Elad: what is the negative impact then? Is this theoretical
purity?
… given that IPC is involved, async makes sense
[48][Slide 28]
[48]
https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=28
Elad: the TAG actually insists that theoretical purity doesn't
trump implementers' needs
[49][Slide 29]
[49]
https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=29
Elad: the TAG discussed this API
… they were satisfied with the API
… they haven't seen much ergonomic gain for sync
… they also highlighted that interop should drive the work of
the group
[50][Slide 30]
[50]
https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=30
Elad: I've shown that constituencies either don't care about
sync vs async or, for at least one implementor, need async
… also, it's easy to go from async to sync, while the other way
is difficult
[51]#17 discussion [52]🎞︎
[51] https://github.com/w3c/mediacapture-region/issues/17
[52] https://www.youtube.com/watch?v=c3nah_oMZQs#t=2669
Bernard: by making it async, you're saying that this implies
that when you have a cropTarget, you know it's ready to use
… we've had situations in the WebRTC WG where we've found that
sync APIs needed in fact to be async
Elad: cf slide 22
Youenn: I'm surprised you're saying this is a MUST - I thought
this was implementable as a sync API, but that you favored the
trade-off that async allows
… as I've pointed out, this trade-off creates a fingerprinting
and interop issue - so a footgun
… so I thought both were implementable but sync would be more
complex in Chrome
Elad: I don't agree with that characterization
Youenn: I'm surprised that both approaches are claiming to be
faster - both can't be true
Youenn: usually sync APIs are more efficient, except when they
create a blocking situation, which I'm not hearing is the
case here
… a sync API helps developers, at least a bit
Elad: resource allocation design decision is orthogonal to sync
vs async
… I don't think the resource mitigation limits fingerprinting
risks can be mitigated through a per-iframe limit
… in terms of performance, what needs to be fast is cropTo -
anything before that, the user doesn't notice
youenn: sync cropTarget minting is faster, but you're saying
this is not a relevant optimization compared to cropTo
elad: anything that comes before cropTo is irrelevant to user
perceived performance (and mostly negligible in any case)
TimP: as a developer, I have a mild preference to keep it sync
as it's easier to use
… I don't see a developer benefit to making it async
… there may be a user benefit in terms of the crop transition
UX
… that may convince me of the value if it can be shown
… from the developer perspective, managing interesting failures
on obtaining target would also be convincing, but I haven't
heard that
Elad: it's a particular choice of trade-offs that requires
async, and async isn't going to harm other constituencies
TimP: developers will suffer
Elad: I claim it's negligible
TimP: I don't agree - this can generate non-trivial changes,
although it's certainly doable
Jan-Ivar: re no downside - nobody is claiming that Chrome
optimizations aren't useful
… I don't understand why these optimizations can't be done with
a sync API
… why can't you implement a fallback?
… other downsides include fingerprinting, DOS, proliferation of
async, failure management for developers
… why does it need to be async?
Harald: code complexity itself is a risk; this particular
implementation has been used and tested
… anyone that depends on cropTo and doesn't notice it failing
has an issue
… it's time to stop this discussion - we have seen that Chrome
claims that a sync implementation would make it
significantly more complex
… the impact on developers is irritating but not fatal
… we have not seen compelling arguments that we need to change
what has been proposed
… I don't see consensus for change, there is an implementation
of the current spec - I suggest we declare the API to be async
and move on
Jan-Ivar: I hear ease of implementation - complexity in the API
trumps complexity of implementation
… because of the priority of constituencies
Harald: a more complex API that does the right thing is better
than a simple API that does the wrong thing
Elad: I'm not seeing consensus, but I don't think there are
remaining benefits to discuss this
dom: we could either run a vote, or wait for more
implementation experience
TimP: I would be inclined to say that getting other
implementations is most important
… sync would be more elegant, but it doesn't look like we're
going to get that
Issue [53]#18: Is CropTarget name too generic? [54]🎞︎
[53] https://github.com/w3c/mediacapture-region/issues/18
[54] https://www.youtube.com/watch?v=c3nah_oMZQs#t=4111
[55][Slide 33]
[55]
https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=33
Youenn: "crop" as a term isn't used too broadly so far, so
probably OK
… not sure that "target" helps
[56][Slide 34]
[56]
https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=34
[57][Slide 35]
[57]
https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=35
Youenn: "object whose sole purpose is to be given cropTo" - may
not be limited to elements
… the term CropRegion might work better to represent
… and would align with the spec name ("region capture")
… thoughts?
TimP: the name should reflect that it is a token, not a region
or a target itself
… if it's opaque, it should say so
Youenn: it may not remain opaque
TimP: a region sounds like something you could do math on, e.g.
calculate its surface area
… which you can't
elad: similar reservation - region feels like something with
coordinates; also, a cropTarget isn't static, it can move,
which makes CropRegion more misleading
harald: this is bikeshedding; I don't see benefit in changing
it
Bernard: +1 that CropRegion is confusing; I prefer the current
name
Youenn: any interest in clarifying the definition (i.e. whether
it's a reference to an element or something more generic)
… I guess that can be done later
RESOLUTION: close [58]#18 without changing the name of
CropTarget
[58] https://github.com/w3c/mediacapture-region/issues/18
Issue [59]#63: Cropping non-self-capture tracks [60]🎞︎
[59] https://github.com/w3c/mediacapture-region/issues/63
[60] https://www.youtube.com/watch?v=c3nah_oMZQs#t=4643
[61][Slide 38]
[61]
https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=38
Elad: currently you can crop only to current tab
… I suggest we allow cropping arbitrary tabs
[62][Slide 39]
[62]
https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=39
[63][Slide 40]
[63]
https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=40
Jan-Ivar: a concern is that it might allow sites to censor
themselves when captured
Elad: the capturing app can ignore crop targets
… in fact, cropping automatically would make no sense
Jan-Ivar: I want the group to be aware of the risk
Elad: but is it likely?
Jan-Ivar: I won't predict the future; I don't see other issues
with this
harald: please write this up in the github issue; I don't
understand the risk
TimP: I support allowing this beyond self-capture
Elad: can we agree that by next meeting we agree to expand this
unless a compelling case is made against it?
Jan-Ivar: imagine a bank wanting to redact what would get
shared over screen capture
… can write this up by next meeting
Harald: I support this too
dom: me too
Making CropTargets stringifiable [64]🎞︎
[64] https://www.youtube.com/watch?v=c3nah_oMZQs#t=5195
[65][Slide 41]
[65]
https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=41
Elad: making croptargets stringifiable would help e.g. for
communication over capture handle
… not sure I understand the risks
Youenn: a string makes it much more difficult to
garbage-collect a croptarget
Jan-Ivar: +1 to Youenn
… I haven't heard use cases that justify this
… having GCable croptarget is good to keep
youenn: if a croptarget comes with resource allocation, being
able to end these allocations is a good thing
elad: you can't just associate the string to the element, and
when gc'ing the croptarget, remove that association
TimP: this removes the opacity that I relied on in my previous
support
… stringifying makes it harder to reason about the safety of
this
elad: the only difference between the two is equality
jan-ivar: there are differences in garbage collection
TimP: from a developer perspective, there is a difference
… there is a very limited number of paths to get a cropTarget
… once it's a string, many more paths can be used
Youenn: can you bring that argument to the github?
Elad: would like to get resolution to this; we can skip the
predictable errors
harald: part of the issue seems to be about reconstructing a
croptarget from a string (not about stringifying per se)
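
The garbage-collection concern raised in this discussion can be sketched as follows. This is a toy model, not browser internals: `mintObjectToken`, `mintStringToken` and the `crop-N` string format are hypothetical. An opaque object token can key a `WeakMap`, so its entry can go away once the token is unreachable, whereas an equal string can be re-created at any time, so a string-keyed entry has to be kept until it is removed explicitly.

```javascript
const elementByObjectToken = new WeakMap(); // entry is collectable along with its token
const elementByStringToken = new Map();     // entry stays until removed explicitly

// Opaque token: only code holding this exact object can look the element up.
function mintObjectToken(element) {
  const token = Object.freeze({});
  elementByObjectToken.set(token, element);
  return token;
}

// String token (hypothetical "crop-N" format): any equal string resolves.
function mintStringToken(element) {
  const id = `crop-${elementByStringToken.size + 1}`;
  elementByStringToken.set(id, element);
  return id;
}

const sharedElement = { id: "slides" }; // plain object standing in for a DOM element
const objectToken = mintObjectToken(sharedElement);
const stringToken = mintStringToken(sharedElement);

// A look-alike object is not the token, so it resolves to nothing.
const lookAlike = Object.freeze({});
// A string rebuilt from its parts is indistinguishable from the original.
const rebuilt = ["crop", "1"].join("-");

console.log(elementByObjectToken.get(lookAlike) === undefined);
console.log(elementByStringToken.get(rebuilt) === sharedElement);
```

The same property that blocks collection also widens the set of paths by which a token can travel: anything that can carry a string can carry it.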
[66]Face Detection [67]🎞︎
[66] https://github.com/w3c/mediacapture-extensions/pull/48
[67] https://www.youtube.com/watch?v=c3nah_oMZQs#t=5981
[68]Face Detection explainer
[68] https://github.com/riju/faceDetection/blob/main/explainer.md
[69][Slide 45]
[69]
https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=45
[70][Slide 46]
[70]
https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=46
riju: this shows our proposal helps reduce power consumption -
almost 2x at 15fps compared to using TF.js
youenn: the 1st column has no face detection, and the 2nd is
doing face detection in the driver?
tuukka: right
youenn: in some OSes, the two might be equal if the camera is
doing it systematically
riju: indeed, in Android this might be the case
[71][Slide 47]
[71]
https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=47
Riju: having a persistent id is very important for face
tracking
… re keeping the probability optional - sure, but all platforms
provide this
… a developer may use this to decide to apply further
processing (e.g. funny hat)
… but open to making it optional
… Re VideoFrameMetadata, you suggest coordination on WebCodecs?
Youenn: yes, we need to engage with them to find the right
construct
Riju: re API surface, we started with a minimal set, increased
it based on feedback
… but we can re-reduce it for the MVP
… e.g. remove the mesh parts
… the contour was Harald's request
… face landmarks are usually important in post-processing,
think we should keep in MVP
youenn: my point was in terms of priorities & focus
… e.g. for the next 6 months
riju: removing mesh, but keep landmarks
harald: I still have a problem with the API
… the power consumption improvement is nice-to-have
… attachment to the videoframe is nice
… but still unclear what to use it for
… the explainer doesn't help much with it
… what can I do with the output of that API?
… what would the MVP be viable for?
riju: e.g. landmarks would be used for post-processing, e.g.
eye-gaze correction
… the platforms only give bounding boxes at this stage
Harald: I would like to see a more complete use case
Youenn: one use case is that some encoders optimize based on
specific bounding boxes
Bernard: +1 to youenn - segmentation helps with encoding
Jan-Ivar: the explainer talks about attaching to videoframe,
but the API is still anchored in mediastreamtrack (e.g. for
capabilities)
… how would this API be usable on non-camera sources?
… e.g. on recorded videos
riju: we couldn't use the same platform APIs to get the power
consumption benefits
youenn: I think cameras should be our primary target
… for recorded videos, you could add this through a media
capture transform
jan-ivar: adding these metadata through the transform?
youenn: yes
eero: our proposal has support for setting custom metadata in
videoframe
harald: the constraints are used to instruct the driver to
produce the info, which is then attached to the videoframe
… that makes sense to me
… but writing up the enhanced encoding use case would help
make it compelling
riju: any support for prototyping this?
harald: yes - we need to find compelling applications
riju: also heard support from youenn
jan-ivar: I still have some concerns about whether this would
reveal differences across platforms
… would suggest raising an issue on Mozilla's standards
positions
bernard: useful to prototype; the metadata discussion should be
brought to the WebCodecs folks
riju: will follow up accordingly
Received on Tuesday, 28 June 2022 13:30:50 UTC