Minutes of the July 22, 2021 WEBRTC WG Virtual Interim from Bernard Aboba on 2021-08-10 (public-webrtc@w3.org from August 2021)

From: Bernard Aboba <Bernard.Aboba@microsoft.com>
Date: Tue, 10 Aug 2021 14:54:36 +0000
To: "public-webrtc@w3.org" <public-webrtc@w3.org>
Message-ID: <PH0PR00MB099940DB5250B9BA195B4EB4ECF79@PH0PR00MB0999.namprd00.prod.outlook.com>

W3C WebRTC WG Meeting - July 22, 2021

Notes - Tony Herre

Slides<https://docs.google.com/presentation/d/1g_M80kQAnx0IPGX80ramgNJY28Xd91RE8mXsRThXvLM/edit#slide=id.ge4a901ebf4_0_0>

Recording - [to come]

* Introduction

* MediaStreamTrack Transfer

* Youenn Fablet: The May meeting had alignment on moving forward with the Transferable option

* Bernard Aboba: clarification: Editor Drafts do not have consensus. Merging a PR or a presentation at a meeting does not change that. Consensus is confirmed by a Call for Consensus (CfC) on the mailing list to promote an Editor’s Draft to a WG Document. Currently neither mediacapture-extensions nor WebRTC-Extensions have had a CfC to promote them to WG Drafts. So neither document has WG consensus.

* Youenn Fablet: suggestions of ways to move forward: step 1: tie sources to their creation context; step 2: clarify Transfer behaviour

* Discussion:

* Harald Alvestrand: re PR 805, uncertain over making a change to a document (media-capture main) we hoped was finalized, but aligning this ownership to context makes sense, glad to make it explicit. Definition of context may need input from others with more HTML-expertise. In favour of adopting WG draft, but clarify each proposal within it needs independent consensus - using notation within the document.

* Jan-Ivar: Shared concerns re clarity of consensus, in favour of consensus calls on parts. Happy to move forward with merging both PR 805 and PR 30 on extensions.

* Carine: we typically use notes inside drafts for things that need particular feedback

* Jan-Ivar Bruaroey: Suggestion that we use the term "no objections" for meetings rather than consensus for clarity.

* Bernard: Conclusion: At least 2 in favour of merging PRs 805 and extensions 30.

* Conclusion: Will run a one-week CfC on both PR 805 and PR 30 separately.

* Mediacapture-transform Callback Proposal

* Youenn: [slides]

* Bernard: Re promises providing backpressure. What happens if a transform takes too long? Is it possible to ask for a queue of more than 1 to avoid losing a frame if processing takes slightly longer?

* Youenn: Cloning is a ref counted object, can clone on every call. The application then needs to manage this queue, decrease the framerate etc according to what is best for themselves

* Harald: Disagree with slide #23: MediaStreamTrack is a control interface, not a stream. Original proposal for Breakout Box had 3 levels of taking ownership to build up to control, but we didn't have time to discuss and build out a spec based on this. Creating a Track using this proposal implements most of Streams, with Control on top. What usecases are there for taking this level of control?

* Youenn: Goal 2 to transform a stream and send it on, adding funny hats is the case. MediaStreamTrackGenerator blocks the muted state, which apps are using.

* Harald: Are you saying that having both frame handling and control signals in an API makes Callbacks better than Streams?

* Youenn: This is not the claim.

* Thomas Guilbert: Works on the WebCodecs team, wanted to clarify that WebCodecs initially moved away from streams but then moved back to streams and Breakout Box. requestVideoFrameCallback will make sure it is run and has the subtlety that it will always have the freshest frame.

* Youenn: WebAudio and AudioWorklet don't guarantee every chunk is processed, but in practice it is true. The same is true here, and in the case where processing is backed up, this seems to be best. Flexibility over buffer size pool

* Jan-Ivar: Like Transferable MST and exposing only on Worker - important to Mozilla. Like API shapes but these are an aside. Feels like a Goal 3: using external JS sinks. Agree MediaStreamTrack and Streams aren't the same thing, and that Streams don't aim to solve control flows. Dislike the callback parts over Streams: Streamlining of callbacks, using a promise which is equivalent to a push source ReadableStream, and for backpressure this leaves JS responsible for propagating received backpressure backwards. Streams is the standard approach across. Have synchronous callbacks elsewhere, asynchronous here and streams - don't need 3 mechanisms.

* Youenn: Read and PipeTo lose the clear handling of VideoFrame lifecycles. Piping to WebTransport doesn't require streams, as WebCodecs isn't using it in the middle. Handing a VideoFrame to WebCodecs hands ownership over.

* Continue this discussion on github
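
To illustrate the queueing point raised above (Bernard's question about a queue deeper than 1, and Youenn's reply that the application can clone frames and manage its own queue), here is a minimal TypeScript sketch. `SketchFrame` and `BoundedFrameQueue` are hypothetical names, not part of any proposal; `SketchFrame` only imitates clone/close ref counting, and the drop-oldest policy is an assumption, since the discussion leaves that choice to the application.

```typescript
// Hypothetical stand-in for a ref-counted video frame (not the WebCodecs API).
class SketchFrame {
  readonly timestamp: number;
  private readonly shared: { refs: number };
  private closed = false;

  constructor(timestamp: number, shared: { refs: number } = { refs: 0 }) {
    this.timestamp = timestamp;
    this.shared = shared;
    this.shared.refs += 1; // every handle holds one reference
  }

  // A clone is a new handle onto the same underlying buffer.
  clone(): SketchFrame {
    if (this.closed) throw new Error("cannot clone a closed frame");
    return new SketchFrame(this.timestamp, this.shared);
  }

  // Closing releases this handle; the buffer is free once refs reaches 0.
  close(): void {
    if (this.closed) return;
    this.closed = true;
    this.shared.refs -= 1;
  }

  get liveRefs(): number {
    return this.shared.refs;
  }
}

// Application-managed queue: assumed policy is "drop the oldest when full".
class BoundedFrameQueue {
  private readonly frames: SketchFrame[] = [];
  private readonly capacity: number;
  dropped = 0;

  constructor(capacity: number) {
    if (capacity < 1) throw new Error("capacity must be at least 1");
    this.capacity = capacity;
  }

  // Called from the per-frame callback: keep our own clone of the frame.
  push(frame: SketchFrame): void {
    if (this.frames.length === this.capacity) {
      const oldest = this.frames.shift();
      if (oldest !== undefined) {
        oldest.close(); // release the stale clone so its buffer can be reused
        this.dropped += 1;
      }
    }
    this.frames.push(frame.clone());
  }

  // Called by the (possibly slow) transform; the caller must close() the result.
  take(): SketchFrame | undefined {
    return this.frames.shift();
  }

  get length(): number {
    return this.frames.length;
  }
}

// Usage: three frames arrive while the transform is busy, queue depth is 2.
const demoQueue = new BoundedFrameQueue(2);
for (const ts of [0, 33, 66]) {
  const delivered = new SketchFrame(ts);
  demoQueue.push(delivered); // keep a clone for later processing
  delivered.close(); // the callback's own handle is released as usual
}
const demoNext = demoQueue.take();
if (demoNext !== undefined) demoNext.close();
```

A deeper queue trades latency for fewer dropped frames; lowering the frame rate instead, as also mentioned in the discussion, would be a different policy from the one sketched here.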

* Guido Urdaneta: Comparison between callback and streams on slide 18 misses some details (and the streams example could be cleaner using piping): strongly coupling the track to the flow of data when transferring has the problems with losing access to control surfaces on the window to eg stop or show a self view.

* Youenn: This is a question of MediaStreamTrack transferability.

* Guido: Needs to be considered as a proposal as a whole. One could transfer callbacks back, which is indeed what Streams do. MediaStreamTrack model as a control surface handles this separation well.

* Youenn: Like to continue discussion on an issue

* Bernard: How would you implement the default behaviour to send black on mute? Does the application need to set a timer to send a black frame on no callbacks?

* Youenn: Will answer offline. Bernard will file an issue
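
The minutes record no answer to Bernard's question. Purely as an illustration of the timer-based approach he describes, the following TypeScript sketch shows one way an application could decide when a black frame is due. `MuteWatchdog` and its methods are hypothetical names, and the time values are passed in by the caller (an assumption made so the logic does not depend on any particular timer API).

```typescript
// Hypothetical helper: reports when a black frame is due because no
// frame callback has arrived for at least `timeoutMs`.
class MuteWatchdog {
  private readonly timeoutMs: number;
  private lastFrameAtMs: number;
  private lastBlackAtMs: number | null = null;

  constructor(timeoutMs: number, startMs: number) {
    this.timeoutMs = timeoutMs;
    this.lastFrameAtMs = startMs;
  }

  // Call from the per-frame callback whenever a real frame arrives.
  onFrame(nowMs: number): void {
    this.lastFrameAtMs = nowMs;
    this.lastBlackAtMs = null; // real frames are flowing again
  }

  // Call from a periodic timer; returns true when a black frame should be sent.
  shouldSendBlack(nowMs: number): boolean {
    // Measure from the last thing that was sent, real frame or black frame.
    const reference =
      this.lastBlackAtMs !== null ? this.lastBlackAtMs : this.lastFrameAtMs;
    if (nowMs - reference < this.timeoutMs) return false;
    this.lastBlackAtMs = nowMs;
    return true;
  }
}

// Usage: with a 100 ms timeout and no frames since t=0, a black frame
// becomes due at t=100 and again at t=200.
const demoWatchdog = new MuteWatchdog(100, 0);
const demoDecisions = [50, 100, 150, 200].map((t) => demoWatchdog.shouldSendBlack(t));
```

Whether the application or the user agent should own this behaviour is exactly the open point in the discussion; the sketch only shows that the application-side version needs a timer plus a record of the last frame time.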

* Screen Capture

* Issues 182 and 158

* Youenn: Agree with analysis on tab capture not great. User confusion between tab and web page, or UA support? Prefer making Tab Capture safer by muting on navigation etc.

* Jan-Ivar: once we have the isolation value, extrapolate into getViewportMedia

* Elad Alon: We agree on where we want to get, not on how to get there. Dangers in not pushing forward with the current solution, given users have non-web alternatives. Unsure why we're tying communication between capturer and capturee with this. Suspect we'll need to handle non-site isolated presentations for a long time due to sites taking time and we need to allow users to present these.

* Jan-Ivar: iterative proposals exist, but I'm not presenting them today. Would like to get agreement on this future work. Didn't intend to imply that other APIs are blocked on this. How do we iterate is a valid question.

* Elad Alon: There is a risk that agreeing to this would mean there's less interest in the iterative process, given previous conversations.

* Jan-Ivar: previous responses are based on principles rather than this long-term view.

* Can we agree on moving forward with defining these site isolation concepts, assuming we can agree on cropping [later]?

* Elad: in favour of this direction

* Harald: This is likely a wider issue than the WebRTC WG - what sites we can give elevated permissions to. Concerned that the perfect may be the enemy of the good. Willing to get these specified but not on what should be blocked on this.

* Carine: Agree on the scope, site isolation needs input from security and TAG also

* Jan-Ivar: Yes, there are others to involve. Cross origin isolation already specified.
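
For readers unfamiliar with the cross-origin isolation that Jan-Ivar refers to as already specified: a document becomes cross-origin isolated when it is served with `Cross-Origin-Opener-Policy: same-origin` and `Cross-Origin-Embedder-Policy: require-corp` (or `credentialless`). The TypeScript sketch below approximates that check on a set of response headers. The function names are hypothetical, the check ignores other conditions a browser applies (such as requiring a secure context), and any additional opt-in the capture proposals might require is not modelled. In a page, the authoritative value is `self.crossOriginIsolated`.

```typescript
type HeaderMap = Record<string, string>;

// Hypothetical helper: case-insensitive header lookup that returns only the
// leading token, ignoring parameters such as `; report-to="..."`.
function headerToken(headers: HeaderMap, name: string): string | undefined {
  const wanted = name.toLowerCase();
  for (const key of Object.keys(headers)) {
    if (key.toLowerCase() !== wanted) continue;
    const value = headers[key];
    if (value === undefined) continue;
    const semi = value.indexOf(";");
    const token = semi === -1 ? value : value.slice(0, semi);
    return token.trim().toLowerCase();
  }
  return undefined;
}

// Approximate check: do these headers request cross-origin isolation?
function looksCrossOriginIsolated(headers: HeaderMap): boolean {
  const coop = headerToken(headers, "Cross-Origin-Opener-Policy");
  const coep = headerToken(headers, "Cross-Origin-Embedder-Policy");
  return (
    coop === "same-origin" &&
    (coep === "require-corp" || coep === "credentialless")
  );
}

// Usage with two example header sets.
const isolatedHeaders: HeaderMap = {
  "Cross-Origin-Opener-Policy": "same-origin",
  "Cross-Origin-Embedder-Policy": "require-corp",
};
const plainHeaders: HeaderMap = { "Content-Type": "text/html" };
const demoIsolated = looksCrossOriginIsolated(isolatedHeaders);
const demoPlain = looksCrossOriginIsolated(plainHeaders);
```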

* Cropping, issue 158:

* Elad: Difference between document and iframe?

* Jan-Ivar: iframe allows cross-origin capture of an embedded iframe.

* Elad: document seems like a subset of iframe

* Jan-Ivar: could be the case. They work the same with different cropping.

* Elad: would we use the same permissions for everything?

* Jan-Ivar: Same for everything, would require a content permissions

* Elad: Would we specify all at once?

* Jan-Ivar: Yes.

* Youenn: Point is that the underlying algorithm would be the same. In favour of removing document.getViewportMedia in favour of iframe

* Jan-Ivar: This is a cleaner alternative for specifying cropping than passing around handles. (no objections)

* Elad: In favour in general. Would want to crop to an arbitrary frame. This allows it assuming transferable Tracks, which will take time for standardisation and widespread implementation. Would a stop gap before that be possible?

* Jan-Ivar: iframe.getViewportMedia() would handle this for the embedder, getTabViewportMedia for the entire viewport

* Elad: Would like to be able to capture parent even when that's not the entire frame.

* Jan-Ivar: Need to understand more about the usecase.

* Elad: Consider a frame visually split 50/50 between capturer and capturee. Would like to support capturing one from the other no matter which is the parent document

* Jan-Ivar: If you have buy-in from the parent, which will be required for permissions to be set anyway, then it can use the iframe interface.

* Elad: This requires the track to be transferable to transfer the capture

* Jan-Ivar: Can't speak to a hypothetical proposal. If Tracks don't deliver transferability, happy to reopen to add new interfaces.

* Harald: Need to move this discussion onto the issue #158.

* Wrap-up and Next Steps

* Harald: Still in favour of adopting other draft using Streams based transforms, and pursue control mechanisms independently. Failing that, would like to see a full document from Youenn with details.

* Bernard: Some elements of Youenn's proposal which weren't provided before, muting etc, could be applied to the streams proposal also. At a stage where each proposal could evolve, but would be good to get to a decision. Have 2 months to allow things to evolve before next meeting.

* Bernard: What does the WG feel is needed to make a decision?

* Guido: For muting, a PR for adding this to the streams proposal is being worked on. Not directly controlling the muted field, but to simulate hardware muting.

* Jan-Ivar: Would like to see Harald and Youenn work together on a proposal. Good ideas in both, would like to see them combined. Can also volunteer to help.

* Harald: Philosophically opposed to not exposing on Window, so as not to have differences between Window and Worker

* Jan-Ivar: We can iterate on API shape independently.

* Guido: Aim to reach consensus as much as possible and use mechanisms to resolve disagreements

* Jan-Ivar: Dependent on WebCodecs and their Window exposure. Chrome has chosen on alignment.

* Tim Panton: Half agree on exposure in workers. Only constraining if workers are a different "flavour" such as AudioWorklets. In Youenn's proposal, promises are doing a lot of heavy lifting, and code will be complicated as a result. Feel we're iterating very slowly towards something that's essentially a joint proposal - having that mindset might help. Talking about how to polyfill streams on top of callbacks and handle muting in streams feels like we're coming together on a shared set of capabilities.

* Harald: Requesting a concrete proposal document from Youenn would help.

* Tim: Can we get similar movement from Harald's contributions?

* Harald: Had a proposal being worked on for control signals, removed due to being underused and underspecified. Adding a mechanism which handles these would be a win, particularly if it could have consensus.

* Bernard: Also questions about constraints, would they need to be in there?

* Harald: Wasn't covered in the slides above. A proposal which includes constraints handling would be easier to reason about.

* Guido: Proposal documents how constraints should be applied. Propagation is more difficult due to constraints being per-track. Makes sense to have notification to allow eg rejection on overconstrained.

* Jan-Ivar: As we're in agreement on needing to work well on Workers, how would we use a generated track on a page - transfer it back to window?

* Guido: Tracks remain on the window, the streams are transferred to the worker.

* Jan-Ivar: On screen capture, Elad's last comment was that he was happy, would we be ok with moving forward with a PR on this?

* Elad: Unsure in such small time.

Received on Tuesday, 10 August 2021 14:54:54 UTC