- From: Dominique Hazael-Massieux <dom@w3.org>
- Date: Thu, 14 Oct 2021 19:08:02 +0200
- To: "public-webrtc@w3.org" <public-webrtc@w3.org>
Hi,

The minutes of our call today are available at:
https://www.w3.org/2021/10/14-webrtc-minutes.html
(can you spot the new experimental feature of the IRC-based minutes generator?)
and copied as text below.

Dom

WebRTC October 2021 Virtual Interim
14 October 2021

[2]IRC log.

[2] https://www.w3.org/2021/10/14-webrtc-irc

Attendees

Present
BernardA, BrianBaldino, Carine, CullenJennings, Dom, EladAlon, Guido, Harald, Jan-Ivar, PatrickRockhill, TimP, Youenn

Regrets
-

Chair
bernard, harald, jan-ivar

Scribe
dom

Contents

1. [3]https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf
2. [4]The Streams Pipeline Model (Youennf)
3. [5]Alternative Mediacapture-transform API
4. [6]Mediacapture Transform API
5. [7]Wrap up and next steps

[3] https://www.w3.org/2021/10/14-webrtc-minutes.html

Meeting minutes

Slideset: [8]https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf

[ [9]Slide 8 ]

Bernard: [reviewing agenda]

[8] https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf
[9] https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=8

The Streams Pipeline Model (Youennf)

[ [10]Slide 9 ]
[ [11]Slide 10 ]

Youenn: this presentation is about topics and issues we discussed with Jan-Ivar when we explored using Streams for media pipelines
… goal is to identify blocking issues when looking at adopting streams for media pipelines

[ [12]Slide 11 ]

Youenn: media pipelines connect sources with sinks
… sources are readablestreams and sinks writablestreams
… we would want to go from camera to network just using streams
… I'll be focusing only on video pipelines
… and we'll look at threads and intersection between frames and @@@

[ [13]Slide 12 ]

Youenn: dealing with realtime media is better done off the main thread
… in the Web Audio API, the graph is done in the main thread but the processing is done in a dedicated audio thread
… in our case, there is
no dedicated thread
… the safest assumption is to assume the video frames flow where they're set up

[ [14]Slide 13 ]

Youenn: example 1 is a funny hat example using pipeThrough and pipeTo
… it's not clear where the video frames would flow in terms of thread
… the assumption would be that it runs in the same thread where these operations are being called
… example 2 uses a JS transform
… example 3 uses a tee - it makes it very unclear where it would be run, whether the UA would optimize it or not
… so the safest assumption, with streams being a generic mechanism, is to assume same-thread

[ [15]Slide 14 ]

Youenn: one potential related idea is to transfer the stream to a worker
… this requires optimizations that are not standard and hard to expose to Web developers
… the current implementation in Chrome is also not compliant
… it's really hard to predict whether the optimization will kick in or not

[ [16]Slide 15 ]

Youenn: a few examples - example 1 is the typical example where chrome will optimize after a stream transfer
… in example 2 - not clear whether optimization will happen
… in example 3 - also unclear
… and again in example 4, when using non-camera streams
… let's say you transfer an MST to another frame, and then take a stream transferred to a worker - will it be optimized? as a developer, you can never know
… as opposed to Web Audio that gives very clear spec'd guarantees

[ [17]Slide 16 ]

Youenn: streams are a generic tool designed for flexibility - we can't guarantee performance
… we can give that guarantee with transferable MediaStreamTrack
… this allows us to avoid the issues associated with streams when dealing with realtime streams
… additional optimizations can still happen as a bonus, but they're no longer a pre-requisite

[ [18]Slide 17 ]

Youenn: buffering with streams happens at each transform step in the media pipeline
… a typical pipeline is like the one at the top, with greedy processing
… but in cases you don't want to process all frames, e.g.
a 1-second old frame might be better skipped
… as does mediastreamtrackgenerator
… the second pipeline illustrates sequential processing which can be beneficial
… I think that's a safer approach

[ [19]Slide 18 ]

Youenn: this is a real issue; videoframes are big and scarce resources
… it's also unclear for web developers what happens; buffering is hidden from them
… issue-1158 is where this is being described - there is probably a solution that will emerge
… but it's unlikely that the default behavior will be the safe behavior for streams of frames

[ [20]Slide 19 ]

Youenn: in general for streams, the idea is that backpressure will deal with buffering
… but for us, some limited buffering might be useful to allow
… but it's hard to deal with WHATWG streams
… the stream queue is opaque to the application by design
… and the queuing strategy is very static, based on the high-water mark
… updating the strategy requires resetting your pipeline
… WHATWG streams might be able to cover the use case, but with complexity

[ [21]Slide 20 ]

Youenn: Tee is the typical way to allow multiple consumers with streams
… tee is part of the design of the API so we should support it

[ [22]Slide 21 ]

Youenn: but we know tee is broken when used with our videoframes stream
… structured clone might solve this, as suggested in issue 1156
… but the default behavior again won't be the right one for us

[ [23]Slide 22 ]

Youenn: but even with structured clone, more changes are needed
… if you apply structuredClone, you add hidden buffering
… if the two branches don't consume data at the same pace
… issue 1157 discusses this - so far, no clear solution to this
… streams by design aren't made to drop items

[ [24]Slide 23 ]

Youenn: the last issue I want to discuss is lifetime management
… streams rely on garbage collection, whereas we don't want to rely on GC for videoframe
… there is no easy way to enforce who will close a VideoFrame, making it error prone for Web developers
… there is no API contract,
so unclear how to solve this

<hta> +1

Youenn: maybe a dedicated subclass with built-in memory management?
… but no work has started in that direction
… if you look at the pipeline - if you change the pipeline, you need to cancel streams
… these streams might have buffer, which raises the question of GC again

[ [25]Slide 25 ]

Youenn: we need to solve these issues, buffering, tee and lifetime management for VideoFrame
… there has been progress, but more is needed and it's unclear to me how far we can go

[ [26]Slide 27 ]

Youenn: having a high level of confidence that these issues can be solved before picking it as our model for designing our APIs
… if we select streams, we should extend support for them in existing and new API (e.g. videodecoder/encoder, barcodedetector)
… this doesn't seem to be part of the plans for e.g. WebCodecs

Jan-Ivar: a couple of comments
… on backpressure, I believe a transformstream with a highwatermark of 0 will automatically apply backpressure
… wrt dynamic buffering, highwatermark is indeed static, but dynamic buffering can be dealt with in a transformstream - but not with a high water mark of 0

Youenn: I'm not optimistic of seeing the problem solved at the source level
… my understanding with lifetime management is that there is no API contract
… you don't know if close will be called; I like consistency
… memory management would be something we would want to design carefully

<hta> I intended to write q+

Bernard: in the current model where we don't have highwatermark

Youenn: the camera pool might have 10 video frames; with a 5 steps pipeline, 5 frames will be automatically allocated - this leaves only 5 remaining slots which might not be enough
… and some devices might have a smaller buffer of frames
… which will create variable framerates

bernard: the lack of streams integration in webcodecs creates two queues that need to be managed
… and that's not particularly transparent, something you have to keep track of
… this can create significant
memory management issues
… wrapping streams is not particularly satisfactory in our case

Harald: a couple of observations
… webcodecs did have a stream-based API for a while; MSTP and MSTG were the reason they got dropped
… we've had very few people reporting problems with these issues
… my impression is that the Stream model has been somewhat confused with the stream shim implementation
… we should have a clean model where issues are moved to implementations, not the model
… wrt tees, I have some experience with reading the CL that added tee to the spec
… worries were expressed that are very similar to ours
… tee is a bad design
… it's fairly easy to write your own JS to get the tee you want, which is quite dependent on your app
… tee doesn't respect the high water mark on down stream - tee is bad
… on the contract point, I think it's natural to say that downstream either has to call close, or pass it to something that will call close on VideoFrame
… we shouldn't depend on upstream to do anything
… we do have an issue with disrupted pipeline - that needs to be solved
… my conclusion is that some of these issues are with the description more than implementations, and some are issues we need to solve but aren't fatal
… like tee - it's not because it's possible to use it badly that we shouldn't use streams
… the streams API is superior to callbacks because it avoids re-doing it all

Youenn: I agree with you that tee is bad - salvaging it will be difficult
… doing one's tee in JS is indeed better - but you'll end up using promise-based callbacks
… but if so, why use streams?
… re other issues not being fatal, I would welcome proposals that address these concerns
… at the moment, I'm not confident we can proceed with confidence that streams is a good enough match
… if they can be solved, I agree that streams are appealing

Jan-Ivar: all these issues filed on github are with the model
… they're not necessarily huge though, and I'm not sure we should block on them
… given that one API is already shipping, I think we need to converge on a standard sooner rather than later

Youenn: I'd be interested in getting a pro/cons comparison of promise callbacks vs streams

[10] https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=9
[11] https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=10
[12] https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=11
[13] https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=12
[14] https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=13
[15] https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=14
[16] https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=15
[17] https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=16
[18] https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=17
[19] https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=18
[20] https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=19
[21] https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=20
[22] https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=21
[23]
https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=22
[24] https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=23
[25] https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=25
[26] https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=27

Alternative Mediacapture-transform API

[ [27]Slide 30 ]

jib: today, the realtime media pipeline is off main thread

[ [28]Slide 31 ]

jib: that remains true in webrtc-encoded-transform
… the original chrome API was on main thread, but we then converged on a standardized API off the main thread
… this was important for encoded media, all the more so for raw media

[ [29]Slide 32 ]

jib: the premise here is that the main thread is bad - "overworked & underpaid" as surma qualified during a chrome dev summit in 2019
… surma highlighted webworkers as the solution to that problem
… contention on the main thread is common and unpredictable
… and hard to detect outside of a controlled environment - as opposed to web workers

[ [30]Slide 33 ]

jib: when webcodecs made the decision to expose the API on the main thread, they based this on non-realtime media use cases
… and they strongly encourage to do realtime processing off the main thread

[ [31]Slide 34 ]

jib: we have a non-adopted document "mediacapture-transform" (which has shipped in Chrome 94 despite not being standardized)
… my position is that this proposal is not satisfactory because it exposes realtime pipeline on main thread by default, it doesn't encourage use in workers, relies on non-standardized optimizations
… also, now mediastreamtrack is transferable so this creates new opportunities

[ [32]Slide 35 ]
[ [33]Slide 36 ]

jib: having to ask the main thread all the time to interact with the API makes sense
… it's baked in the assumption of main thread

hta: that's untrue

[ [34]Slide 37 ]

jib: for a processed (e.g.
background replacement) self-view use case combined with webtransport
… tee, clone, postMessage(constraints) aren't good approaches
… whereas with track available in a worker, we have a natural API

[ [35]Slide 38 ]

jib: the tunnel semantics of WHATWG streams are not meant to solve creating streams on the wrong realm
… MSTP is built on broken assumptions

[ [36]Slide 39 ]

jib: I have an alternative proposal based on transferable mediastreamtrack
… the proposal focuses on video at the moment
… it encourages use on workers
… it still uses streams, despite youenn's identified issues - which I think we can find solutions for

[ [37]Slide 40 ]

jib: we expose a readable attribute in a worker version of the MediaStreamTrack
… this keeps data off the main thread

[ [38]Slide 41 ]

jib: a more complicated example, read & write
… this is the equivalent of mediastreamtrackgenerator
… we expose only on workers a new VideoTrackSource interface
… the example is a crop example inspired from WebCodecs
… it aligns better with the separation of source and track of the mediacapture-streams spec
… it interacts well with clone and structured cloning

[ [39]Slide 42 ]

jib: for any video processing, you have a self-view (with high framerate) and a low-fps to send on the network
… applyConstraints works well with a peerconnection

[ [40]Slide 43 ]

jib: now with WebTransport, using track cloning
… this shows native downscaling with applyConstraints as a workaround to using tee
… not clear how MSTG would let you do this via a worker

[ [41]Slide 44 ]

jib: benefits: simpler API taking advantage of transferable tracks, with fewer APIs to learn
… doesn't block real-time media pipeline by default
… it has parity with MSTP & MSTG features
… similar in terms of brevity
… doesn't rely on UA optimizations
… and deals with muted sources

[ [42]Slide 45 ]

jib: Bonus: if we want promise callbacks for stream-based, you can use "for await" on the stream

[ [43]Slide 46 ]

jib: if you want more than a readable - this
can be done with cloning, but we could also provide dedicated surface

Harald: I kind of like the proposal - it's almost totally equivalent to MSTG and MSTP
… the examples where you have posting messages to the main thread - MSTG and MSTP are designed to be available to the same contexts where tracks are
… MSTG and MSTP will need to be available on workers when MST are
… in terms of quoting Chris Needham on the Web Codecs decision - one of the motivations for main thread is the availability of other APIs on the main thread
… transferring streams as a pipeline between origin and destination context - it assumes the source is main thread, but that's not true
… with a camera, the source of the stream is the camera, not the main thread
… otherwise, I like the shape of the API; it's very similar to what I proposed

jib: I didn't mean to misrepresent these aspects; I see now that MSTG and MSTP are available in workers
… but they're not transferable
… so they would have to be created in the worker?

harald: yes

youenn: re slide 37
… re not using tee because it's bad - I agree, but I hope we should be able to use it
… with the example in slide 37, we lose back pressure
… we might be able to add it back
… in general, in terms of API shape, if we assume that we use streams, this is a good shape that solves some of the issues that I had with the prior proposal
… in general, mediacapture-main has concepts of source and track
… having a JS object that represents the source is a good thing
… similar to a readablestream that can be native or a JS object
… I think we should go there, will make it easier to extend the API and remove edge cases
… I would prefer not to rely on transferable streams, but instead rely on transferable MST
… which creates a typed way of transferring that can help fulfill the requirements we need

jib: my example may have a mistake on which track to clone - would flipping it around fix backpressure?
youenn: I don't think so
… introducing backpressure on the writablestream might do the trick

harald: backpressure cannot deal with framerate

[ [44]Slide 47 ]

jib: tee can help with backpressure, at the cost of tee problems
… the only thing odd is the "createFrameDropper", a transform stream to drop frames
… clone/applyConstraint is a work around if we can't solve the tee problem

bernard: slide-36 and -37 don't make sense to me

jib: right, I wasn't aware that MSTP and MSTG would be available in workers
… but you could still do this, and the situation would need to be handled
… but Harald is right there are a lot of similarities between the two proposals
… the advantage is that we don't need to add a new object

bernard: re slide 33
… datachannels for instance is only available on the main thread
… the lack of consistent API support in workers was part of the challenge

jib: MSTG is a bit of an odd duck - it's also track
… re lack of APIs, you can always transfer tracks back to the main thread when needed
… this doesn't require breaking transferable streams semantics

Harald: if you have to tell some place upstream that your frame rate is 30, then backpressure can't carry that information
… backpressure can't tell the difference between "I'm slightly late" and "I want only every other frame"
… we need to be able to carry these signals
… we haven't gotten to it yet

bernard: there may be several stages of reporting that's needed

youenn: this depends on whether sources are push or pull
… consumers need to propagate things up to the source
… backpressure may not always be the right mechanism, but we need to support it
… I also agree we need to fix carry backmessages
… the fact that some of the APIs need to be done in the main thread is sad, but it still moves a lot of the heavy processing to workers, leaving only some of the plumbing on the main thread
… there may be gaps to do good media processing - if so, we should make them available in workers, and this API would help
accelerate that transition

Guido: in addition to APIs availability on the main thread, we have first-hand feedback from app developers who WANT to do on main thread for their use cases
… otherwise, the two APIs are equivalent beyond their shape

dom: re use cases on main thread, is it a matter of developer experience?

guido: for certain apps, adding workers in the mix is adding a cost, not a value
… it only adds complexity and extra resource consumption

jib: even if there are such use cases, we're trying to protect a realtime media pipeline

[27] https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=30
[28] https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=31
[29] https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=32
[30] https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=33
[31] https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=34
[32] https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=35
[33] https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=36
[34] https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=37
[35] https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=38
[36] https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=39
[37] https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=40
[38] https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=41
[39] https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=42
[40] https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=43
[41]
https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=44
[42] https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=45
[43] https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=46
[44] https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=47

Mediacapture Transform API

[ [45]Slide 50 ]

Harald: [summarizes the API of MSTP and MSTG]

[ [46]Slide 51 ]

Harald: it shipped in Chrome 94, it's actually used in products with new features based on it
… very few problems reported on it

[ [47]Slide 52 ]

Harald: we believe the threading model is something that app developers need to pick, with encouragement from platform developers
… but dictating it is not the right approach
… Streams are transferable objects
… adding worker availability to MSTG and MSTP is a reasonable addition following the transferability of MST

[ [48]Slide 53 ]

Harald: we need to make sure we have samples that show realistic working real-time operations
… including offthread processing

[ [49]Slide 54 ]

Harald: in terms of improvements, we need better control of adaptation source (backpressure, synchronizing streams, framerate)
… we need to improve experience with streams that don't come from camera - not trivial to synchronize them
… we can work on these aspects once we have agreed on a common base

[ [50]Slide 55 ]

Harald: the two proposals agree on Streams for frame delivery
… difference of opinion for availability on main thread
… the proposals differ on whether the generator or the consumer expands MST or use a separate class
… this can be discussed
… another difference is that MSTG/MSTP is dealing with both audio and video
… where jib is focused on video only
… clear similarities on model, and distinctions that can be derived in specific issues

jib: streams are transferable, but implicit transfer of the source isn't web compatible and we
should go away from it

harald: my interpretation is that the stream source is NOT on the main thread - e.g. it's attached to the camera

jib: the optimizations that chrome has been doing are not compliant with the spec AFAICT

harald: I haven't been convinced the issue is not with the spec

jib: the fact that this can't be optimized all the time would make this head scratching

harald: I find the stream spec impossible to navigate - happy to get pointers

jib: one of my slides covered the intent of the spec

harald: but it relies on the interpretation that the source is in the main thread

youenn: the algorithms described in the stream spec will need to be run in the context of the stream (not the source)
… there is some leeway in the stream spec to optimize pipethrough et al
… but not for the rest afaict
… Adam Rice (stream editor) suggested a specific optimizable stream might be needed

[ [51]Slide 38 ]

jib: [quoting from the spec]
… it's explicitly about transfer between realms

[52]Transferable streams: the double transfer problem #1063

jib: re exposure to main thread - for webrtc-encoded-transform, we agreed to focus off-thread only

harald: I have a bug open to allow to reenable it on main thread
… I think this was a bad decision

[45] https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=50
[46] https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=51
[47] https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=52
[48] https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=53
[49] https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=54
[50] https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=55
[51] https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=38
[52]
https://github.com/whatwg/streams/issues/1063

Wrap up and next steps

[ [53]Slide 55 ]

bernard: I would like to get a sense of the room on the major distinctions between the 2 proposals

jib: would also like to get a sense on whether my proposal is acceptable, and under what changes

harald: we have 2 potential starting points, I don't see any reason to pick one over the other

youenn: I want to reiterate my concerns about the difficult stream issues that I raised and for which I'm not seeing progress

dom: I think the question is about API shape (readable/VideoSource vs MSTP/MSTG)

Cullen: I don't feel strongly about any of these questions, not knowing enough about the impact on implementations
… I would need more background to give an informed opinion

Bernard: So, we will bring these questions to the mailing lists

Dom: ... after discussions with the chairs

Minutes manually created (not a transcript), formatted by [54]scribe.perl version 147 (Thu Jun 24 22:21:39 2021 UTC).

[53] https://lists.w3.org/Archives/Public/www-archive/2021Oct/att-0000/WEBRTCWG-2021-10-14.pdf#page=55
[54] https://w3c.github.io/scribe2/scribedoc.html
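
Illustrative sketch (not presented at the meeting): two points from the discussion are easier to picture with a little code, namely Harald's ownership contract (downstream either calls close on a frame or passes it to something that will) and Youenn's remark that a stale frame is often better skipped than queued. The TypeScript below is only a rough model under stated assumptions. `Closable`, `LatestFrameSlot` and `FakeFrame` are hypothetical names invented here, `Closable` merely stands in for a VideoFrame, and the sketch deliberately does not use WHATWG streams or any API from either proposal.

```typescript
// Hypothetical stand-in for a VideoFrame: only the close() contract matters here.
interface Closable {
  close(): void;
}

// One-slot holder (hypothetical): keeps only the newest frame and closes
// any frame it drops, so no frame is left for garbage collection to reclaim.
class LatestFrameSlot<T extends Closable> {
  private pending: T | null = null;
  private droppedCount = 0;

  // Producer side: the slot takes ownership of `frame`.
  put(frame: T): void {
    if (this.pending !== null) {
      this.pending.close(); // stale frame: release it instead of queueing it
      this.droppedCount++;
    }
    this.pending = frame;
  }

  // Consumer side: ownership moves to the caller, who must either
  // call close() or hand the frame to something that will.
  take(): T | null {
    const frame = this.pending;
    this.pending = null;
    return frame;
  }

  // Number of frames that were replaced before anyone consumed them.
  get dropped(): number {
    return this.droppedCount;
  }
}

// Tiny fake frame used only to exercise the slot.
class FakeFrame implements Closable {
  closed = false;
  timestamp: number;

  constructor(timestamp: number) {
    this.timestamp = timestamp;
  }

  close(): void {
    this.closed = true;
  }
}
```

In this model a slow consumer never causes frames to pile up: each `put` releases the frame it replaces, and whoever calls `take` becomes responsible for closing what it gets. It says nothing about how a consumer would signal a desired frame rate upstream, which the minutes note is a separate open problem.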
Received on Thursday, 14 October 2021 17:08:07 UTC