- From: Dominique Hazael-Massieux <dom@w3.org>
- Date: Tue, 28 Jun 2022 15:30:45 +0200
- To: Harald Alvestrand <harald@alvestrand.no>, "public-webrtc@w3.org" <public-webrtc@w3.org>
On 28/06/2022 at 15:14, Harald Alvestrand wrote:
> This recording is now on YouTube:
>
> https://youtu.be/c3nah_oMZQs

And the associated minutes are available at:
https://www.w3.org/2022/06/23-webrtc-minutes.html
(copied as text below)

Dom

WebRTC June 2022 VI

23 June 2022

[2]Agenda. [3]IRC log.

[2] https://www.w3.org/2011/04/webrtc/wiki/June_23_2022
[3] https://www.w3.org/2022/06/23-webrtc-irc

Attendees

Present
    Bernard, Dom, Eero_Hakkinen, Elad, Florent, Guido, Harald, Jan-ivar, Patrick_Rockhill, Riju, TimPanton, TuukkaT, Youenn

Regrets
    Carine

Chair
    Bernard, HTA, Jan-Ivar

Scribe
    dom

Contents

 1. [4]WebRTC WG Re-Charter
 2. [5]Region Capture Issues
     1. [6]Issue [7]#17: the case for making CropTarget Sync
     2. [8]Issue [9]#17: the case for making CropTarget Async
     3. [10]#17 discussion
     4. [11]Issue [12]#18: Is CropTarget name too generic?
     5. [13]Issue [14]#63: Cropping non-self-capture tracks
     6. [15]Making CropTargets stringifiable
 3. [16]Face Detection
 4. [17]Summary of resolutions

[7] https://github.com/w3c/mediacapture-region/issues/17
[9] https://github.com/w3c/mediacapture-region/issues/17
[10] https://github.com/w3c/mediacapture-region/issues/17
[12] https://github.com/w3c/mediacapture-region/issues/18
[14] https://github.com/w3c/mediacapture-region/issues/63

Meeting minutes

Recording: [18]https://youtu.be/c3nah_oMZQs

[18] https://youtu.be/c3nah_oMZQs

Slideset: [20]https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf

[20] https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf

[21]WebRTC WG Re-Charter [22]🎞︎

[21] https://github.com/w3c/webrtc-charter/
[22] https://www.youtube.com/watch?v=c3nah_oMZQs#t=162

[23]Draft updated charter for the WebRTC WG

[23] http://w3c.github.io/webrtc-charter/webrtc-charter.html

[24][Slide 9]

[24] https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=9

dom: we need a new charter, and to discuss [25]issue #70 (migrating work to another group)

[25] https://github.com/w3c/webrtc-charter/issues/70

youenn: would this require rechartering Media WG?
dom: not sure, possibly
aboba: we have a joint meeting with Media WG at TPAC
dom: but we can't wait until then to come to a conclusion
youenn: migrating items between charters is always challenging from our side, in administrative aspects
bernard: do we need meetings with the media wg to discuss this?
dom: if we want to migrate the work, yes - but we first need to decide whether we want to do that
bernard: the fact that we have dependencies on videoframe makes it an interesting question to consider
elad: how busy is the Media WG? would it increase or decrease the pace of our work?
harald: that would have to be something to discuss with the chairs
dom: bringing it to media wg would bring more of a media perspective when this group comes more with a transmission perspective
harald: so the chairs will discuss this with the Media WG chairs
dom: no objection from the group in exploring this?
[none]

[26]Region Capture Issues [27]🎞︎

[26] https://github.com/w3c/mediacapture-region
[27] https://www.youtube.com/watch?v=c3nah_oMZQs#t=853

Issue [28]#17: the case for making CropTarget Sync [29]🎞︎

[28] https://github.com/w3c/mediacapture-region/issues/17
[29] https://www.youtube.com/watch?v=c3nah_oMZQs#t=853

[30][Slide 13]

[30] https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=13

Jan-Ivar: this is about the cropping API - [31]issue #17 is about whether it should be sync vs async
… long discussion on the issue - I'll be presenting arguments why it should be sync
… The TAG design principles include encouragement to use sync APIs when appropriate, with some exceptions (incl cross-process communications)

[31] https://github.com/w3c/mediacapture-region/issues/17

[32][Slide 14]

[32] https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=14

Jan-Ivar: the API currently in the spec (that doesn't have consensus) is async
… so you have to `await CropTarget.fromElement(element)`
… I'm proposing it doesn't need to be async,
… the purpose of the operation is associating an identifier with an element
… as currently specified, it can't fail
… the goal of the crop target is to share it over postMessage across documents

[33][Slide 15]

[33] https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=15

Jan-Ivar: multiple actions need to happen before we're cropping anything
… cropTarget can be done ahead of time or later
… if it gets postMessaged to the top-level document, the said document can offer to the user to crop to that target
… it's only at the end of this process that there is a clear intent to crop
… UA can optimize this by running some of the underlying tasks early, but that creates risks in case this doesn't go through
… the complexity of that situation shouldn't be exposed to developers

[34][Slide 16]

[34] https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=16

Jan-Ivar: [35]#48 is asking to allow failure from the minting process due to resource exhaustion
… which seems linked to optimizations implemented in Chrome
… The issue is that it allows a random document to exhaust cropping resources
… since the API is not gated by permissions - creating DOS risks
… and may expose apps that don't deal well with that failure
… if resource allocation is moved to the cropTo step, this risk disappears
… similar to mediaSource.getHandle

[35] https://github.com/w3c/mediacapture-region/issues/48

[36][Slide 17]

[36] https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=17

Jan-Ivar: I believe my proposed API is faster, simpler and still optimizable
… I don't think we need to block on inter-process communication

[37][Slide 18]

[37] https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=18

jan-ivar: failing of optimization shouldn't imply a failure of the operation
… optimizations that influence API design decision tend to generate further issues
… because optimizations create new side effects
… there are no developer benefits to this API being async
… and there are general developer costs to async APIs - they create pre-emption points which risk data races
… multiplying failure points for rare error cases is a footgun
… and async is slower as I've shown
elad: the TAG offers design principles, but also meta principles - not sure it's productive to discuss other browsers implementation
… the fingerprinting risk is reasonable, but the spec doesn't force to surface this
… a promise doesn't have to fail in your implementation either
jan-ivar: the API should be designed on principles
… Mozilla is here to push a better API for the Web

Issue [38]#17: the case for making CropTarget Async [39]🎞︎

[38] https://github.com/w3c/mediacapture-region/issues/17
[39] https://www.youtube.com/watch?v=c3nah_oMZQs#t=2085

[40][Slide 20]

[40] https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=20

Elad: this API is used in production through origin trial
… we know the API works and makes developers and customers happy
… we've learned a lot of lessons by implementing and shipping this

[41][Slide 21]

[41] https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=21

Elad: the question is whether minting a token needs to be sync or async
… I'll explain the Chrome decision and that it doesn't negatively impact anyone else

[42][Slide 22]

[42] https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=22

Elad: our implementation keeps track of which apps have a crop target
… once it's postMessage'd, this allows Chrome to optimize the time-to-cropping when the cropTo method is actually invoked
… that makes it simpler and more performant, in particular in case of CPU congestion in the capturing document

[43][Slide 23]

[43] https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=23

[44][Slide 24]

[44] https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=24

Elad: Chrome needs this to be async, whereas Mozilla prefers it to be sync
… what's the harm of having an async API?
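The resource-exhaustion concern from slide 16 can be pictured with a toy model. The sketch below is not the Region Capture API and does not describe any browser's implementation: `SlotPool`, `mintEager`, `mintLazy` and `cropToLazy` are hypothetical names, and the fixed-size pool is an assumed stand-in for whatever resources issue #48 refers to. It only contrasts reserving a resource when the target is minted with reserving it when cropping starts.

```javascript
// Hypothetical model: a fixed pool of "cropping slots" (an assumption made
// for illustration, not something taken from the spec or from Chrome).
class SlotPool {
  constructor(capacity) {
    this.capacity = capacity;
    this.used = 0;
  }
  acquire() {
    if (this.used >= this.capacity) {
      throw new Error("cropping resources exhausted");
    }
    this.used += 1;
  }
}

// Eager design: minting reserves a slot, so it is modelled as async
// (a stand-in for a cross-process round trip) and it can reject.
async function mintEager(pool, element) {
  await Promise.resolve(); // placeholder for IPC latency
  pool.acquire();
  return { element };
}

// Lazy design: minting only tags the element, so it is synchronous and
// cannot fail; the slot is reserved when cropping actually starts.
function mintLazy(element) {
  return { element };
}

async function cropToLazy(pool, target) {
  await Promise.resolve(); // cropping itself stays async in this model
  pool.acquire();
  return `cropping to ${target.element}`;
}

async function demo() {
  // Three documents mint targets against a pool of two slots.
  const eagerPool = new SlotPool(2);
  const settled = await Promise.allSettled(
    ["a", "b", "c"].map((el) => mintEager(eagerPool, el))
  );
  const eagerFailures = settled.filter((r) => r.status === "rejected").length;

  // Same three targets minted lazily: the pool is untouched until cropTo.
  const lazyPool = new SlotPool(2);
  const targets = ["a", "b", "c"].map(mintLazy);
  const message = await cropToLazy(lazyPool, targets[2]);

  return { eagerFailures, lazyMinted: targets.length, lazyUsed: lazyPool.used, message };
}

demo().then((summary) => console.log(summary));
```

In this model the eager design makes minting a possible failure point for every caller, while the lazy design concentrates failure at the crop step. Whether a real implementation can make the same trade without losing the time-to-crop optimization described on slide 22 is exactly what the group disagreed on.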
[45][Slide 25]

[45] https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=25

Elad: Priority of constituencies goes through users, developers, implementers, spec writers

[46][Slide 26]

[46] https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=26

Elad: from a user perspective, seems mostly orthogonal
… from a developer perspective - what we've heard is that they don't care as it doesn't change much in their huge existing codebase
… implementors - as an implementor, we see this as an imperative for us

[47][Slide 27]

[47] https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=27

Elad: what is the negative impact then? Is this theoretical purity?
… given that IPC is involved, async makes sense

[48][Slide 28]

[48] https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=28

Elad: the TAG actually insists that theoretical purity doesn't trump implementers' needs

[49][Slide 29]

[49] https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=29

Elad: the TAG discussed this API
… they were satisfied with the API
… they haven't seen much ergonomic gain for sync
… they also highlighted that interop should drive the work of the group

[50][Slide 30]

[50] https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=30

Elad: I've shown that constituencies either don't care about sync vs async, and that at least one implementor needs async
… also, it's easy to go from async to sync, while the other way is difficult

[51]#17 discussion [52]🎞︎

[51] https://github.com/w3c/mediacapture-region/issues/17
[52] https://www.youtube.com/watch?v=c3nah_oMZQs#t=2669

Bernard: by making it async, you're saying that this implies that when you have a cropTarget, you know it's ready to use
… we've had situations in the WebRTC WG where we've found that sync APIs needed in fact to be async
Elad: cf slide 22
Youenn: I'm surprised you're saying this is a MUST - I thought this was implementable as a sync API, but that you favored the trade-off that async allows
… as I've pointed out, this trade-off creates a fingerprinting and interop issue - so a footgun
… so I thought both were implementable but sync would be more complex in Chrome
Elad: I don't agree with that characterization
Youenn: I'm surprised that both approaches are claiming to be faster - both can't be true
Youenn: usually sync APIs are more efficient, except when they create a blocking situation, which I'm not hearing is the case here
… a sync API helps developers, at least a bit
Elad: resource allocation design decision is orthogonal to sync vs async
… I don't think the resource mitigation limits fingerprinting risks can be mitigated through a per-iframe limit
… in terms of performance, what needs to be fast is cropTo - anything before that, the user doesn't notice
youenn: sync cropTarget minting is faster, but you're saying this is not a relevant optimization compared to cropTo
elad: anything that comes before cropTo is irrelevant to user perceived performance (and mostly negligible in any case)
TimP: as a developer, I have a mild preference to keep it sync as it's easier to use
… I don't see a developer benefit to making it async
… there may be a user benefit in terms of the crop transition UX
… that may convince me of the value if it can be shown
… from the developer perspective, managing interesting failures on obtaining target would also be convincing, but I haven't heard that
Elad: it's a particular choice of trade-offs that require async, and async isn't going to harm other constituencies
TimP: developers will suffer
Elad: I claim it's negligible
TimP: I don't agree - this can generate non-trivial changes, although it's certainly doable
Jan-Ivar: re no downside - nobody is claiming that Chrome optimizations aren't useful
… I don't understand why these optimizations can't be done with a sync API
… why can't you implement a fallback?
… other downsides include fingerprinting, DOS, proliferation of async, failure management for developers
… why does it need to be async?
Harald: code complexity itself is a risk; this particular implementation has been used and tested
… anyone that depends on cropTo and doesn't notice it failing as an issue
… it's time to stop this discussion - we have seen that Chrome claims that a sync implementation would make it significantly more complex
… the impact on developers is irritating but not fatal
… we have not seen compelling arguments that we need to change what has been proposed
… I don't see consensus for change, there is an implementation of the current spec - I suggest we declare the API to be async and move on
Jan-Ivar: I hear ease of implementation - complexity in the API trumps complexity of implementation
… because of the priority of constituencies
Harald: a more complex API that does the right thing is better than a simple API that does the wrong thing
Elad: I'm not seeing consensus, but I don't think there are remaining benefits to discuss this
dom: we could either run a vote, or wait for more implementation experience
TimP: I would be inclined to say that getting other implementation is most important
… sync would be more elegant, but it doesn't look like we're going to get that

Issue [53]#18: Is CropTarget name too generic?
[54]🎞︎

[53] https://github.com/w3c/mediacapture-region/issues/18
[54] https://www.youtube.com/watch?v=c3nah_oMZQs#t=4111

[55][Slide 33]

[55] https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=33

Youenn: "crop" as a term isn't used too broadly so far, so probably OK
… not sure that "target" helps

[56][Slide 34]

[56] https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=34

[57][Slide 35]

[57] https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=35

Youenn: "object whose sole purpose is to be given cropTo" - may not be limited to elements
… the term CropRegion might work better to represent
… and would align with the spec name ("region capture")
… thoughts?
TimP: the name should reflect that it is a token, not a region or a target itself
… if it's opaque, it should say so
Youenn: it may not remain opaque
TimP: a region sounds like something you could do math on, e.g. calculate its surface area
… which you can't
elad: similar reservation - region feels like something with coordinates; also, a cropTarget isn't static, it can move, which cropregion makes more misleading
harald: this is bikeshedding; I don't see benefit in changing it
Bernard: +1 that CropRegion is confusing; I prefer the current name
Youenn: any interest in clarifying the definition (i.e. whether it's a reference to an element or something more generic)
… I guess that can be done later

RESOLUTION: close [58]#18 without changing the name of CropTarget

[58] https://github.com/w3c/mediacapture-region/issues/18

Issue [59]#63: Cropping non-self-capture tracks [60]🎞︎

[59] https://github.com/w3c/mediacapture-region/issues/63
[60] https://www.youtube.com/watch?v=c3nah_oMZQs#t=4643

[61][Slide 38]

[61] https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=38

Elad: currently you can crop only to current tab
… I suggest we allow cropping arbitrary tabs

[62][Slide 39]

[62] https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=39

[63][Slide 40]

[63] https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=40

Jan-Ivar: a concern is that it might allow sites to censor themselves when captured
Elad: the capturing app can ignore crop targets
… in fact, cropping automatically would make no sense
Jan-Ivar: I want the group to be aware of the risk
Elad: but is it likely?
Jan-Ivar: I won't predict the future; I don't see other issues with this
harald: please write this up in the github issue; I don't understand the risk
TimP: I support allowing this beyond self-capture
Elad: can we agree that by next meeting we agree to expand this unless a compelling case is made against it?
Jan-Ivar: imagine a bank wanting to redact what would get shared over screen capture
… can write this up by next meeting
Harald: I support this too
dom: me too

Making CropTargets stringifiable [64]🎞︎

[64] https://www.youtube.com/watch?v=c3nah_oMZQs#t=5195

[65][Slide 41]

[65] https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=41

Elad: making croptargets stringifiable would help e.g. for communication over capture handle
… not sure I understand the risks
Youenn: a string makes it much more difficult to garbage-collect a croptarget
Jan-Ivar: +1 to Youenn
… I haven't heard use cases that justify this
… having GCable croptarget is good to keep
youenn: if a croptarget comes with resource allocation, being able to end these allocations is a good thing
elad: you can't just associate the string to the element, and when gc'ing the croptarget, remove that association
TimP: this removes the opacity that I relied on in my previous support
… stringifying makes it harder to reason about the safety of this
elad: the only difference between the two is equality
jan-ivar: there are differences in garbage collection
TimP: from a developer perspective, there is a difference
… there is a very limited number of paths to get a cropTarget
… once it's a string, many more paths can be used
Youenn: can you bring that argument to the github?
Elad: would like to get resolution to this; we can skip the predictable errors
harald: part of the issue seems to be about reconstructing a croptarget from a string (not about stringifying per se)

[66]Face Detection [67]🎞︎

[66] https://github.com/w3c/mediacapture-extensions/pull/48
[67] https://www.youtube.com/watch?v=c3nah_oMZQs#t=5981

[68]Face Detection explainer

[68] https://github.com/riju/faceDetection/blob/main/explainer.md

[69][Slide 45]

[69] https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=45

[70][Slide 46]

[70] https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=46

riju: this shows our proposal helps reduce power consumption - almost 2x at 15fps compared to using TF.js
youenn: the 1st column has no face detection, and the 2nd is doing face detection in the driver?
tuukka: right
youenn: in some OSes, the two might be equal if the camera is doing it systematically
riju: indeed, in Android this might be the case

[71][Slide 47]

[71] https://lists.w3.org/Archives/Public/www-archive/2022Jun/att-0006/WEBRTCWG-2022-06-23.pdf#page=47

Riju: having a persistent id is very important for face tracking
… re keeping the probability optional - sure, but all platforms provide this
… a developer may use this to decide to apply further processing (e.g. funny hat)
… but open to making it optional
… Re VideoFrameMetadata, you suggest coordination on WebCodecs?
Youenn: yes, we need to engage with them to find the right construct
Riju: re API surface, we started with a minimal set, increased it based on feedback
… but we can re-reduce it for the MVP
… e.g. remove the mesh parts
… the contour was Harald's request
… face landmarks are usually important in post-processing, think we should keep in MVP
youenn: my point was in terms of priorities & focus
… e.g. for the next 6 months
riju: removing mesh, but keep landmarks
harald: I still have a problem with the API
… the power consumption improvement is nice-to-have
… attachment to the videoframe is nice
… but still unclear what to use it for
… the explainer doesn't help much with it
… what can I do with the output of that API?
… what the MVP would be viable for?
riju: e.g. landmarks would be used for post-processing, e.g. eye-gaze correction
… the platforms only give bounding boxes at this stage
Harald: I would like to see a more complete use case
Youenn: one use case is that some encoders optimize based on specific bounding boxes
Bernard: +1 to youenn - segmentation helps with encoding
Jan-Ivar: the explainer talks about attaching to videoframe, but the API is still anchored in mediastreamtrack (e.g. for capabilities)
… how would this API be usable on non-camera sources?
… e.g. on recorded videos
riju: we couldn't use the same platform APIs to get the power consumption benefits
youenn: I think cameras should be our primary target
… for recorded videos, you could add this through a media capture transform
jan-ivar: adding these metadata through the transform?
youenn: yes
eero: our proposal has support for setting custom metadata in videoframe
harald: the constraints are used to instruct the driver to produce the info, which is then attached to the videoframe
… that makes sense to me
… but writing up the enhanced encoding use case would help make it compelling
riju: any support for prototyping this?
harald: yes - we need to find compelling applications
riju: also heard support from youenn
jan-ivar: I still have some concerns whether this would reveal differences across platforms
… would suggest raising an issue on Mozilla's standards positions
bernard: useful to prototype; the metadata discussion should be brought to the WebCodecs folks
riju: will follow up accordingly
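As an illustration of the persistent face id mentioned under slide 47, here is a minimal sketch of how an application might follow faces across frames and place an overlay such as the "funny hat". The metadata shape (`id`, `boundingBox` with `x`, `y`, `width`, `height`) and the helper names `trackFaces` and `hatAnchor` are assumptions made for this example; the minutes do not define the actual structure, which was still under discussion.

```javascript
// Assumed (hypothetical) per-frame metadata: an array of detected faces,
// each shaped like { id, boundingBox: { x, y, width, height } }.
function trackFaces(frames) {
  const tracks = new Map(); // id -> { firstFrame, lastFrame, box }
  frames.forEach((faces, frameIndex) => {
    for (const face of faces) {
      const track = tracks.get(face.id);
      if (track) {
        // Same id as in an earlier frame: extend the existing track.
        track.lastFrame = frameIndex;
        track.box = face.boundingBox;
      } else {
        // Unseen id: start a new track at this frame.
        tracks.set(face.id, {
          firstFrame: frameIndex,
          lastFrame: frameIndex,
          box: face.boundingBox,
        });
      }
    }
  });
  return tracks;
}

// Anchor point for an overlay: horizontally centred on the top edge of the box.
function hatAnchor(box) {
  return { x: box.x + box.width / 2, y: box.y };
}

// Two made-up frames: face 1 moves slightly, face 2 appears in the second frame.
const frames = [
  [{ id: 1, boundingBox: { x: 10, y: 20, width: 40, height: 40 } }],
  [
    { id: 1, boundingBox: { x: 14, y: 20, width: 40, height: 40 } },
    { id: 2, boundingBox: { x: 100, y: 30, width: 20, height: 20 } },
  ],
];
const tracks = trackFaces(frames);
console.log(tracks.size, hatAnchor(tracks.get(1).box));
```

Without a stable id, the application would have to match boxes between frames itself, for example by overlap, which is the extra work the persistent id is meant to avoid.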
Received on Tuesday, 28 June 2022 13:30:50 UTC