Re: Screensharing: Bootstrapping Collaboration between Capturer and Capturee from Youenn Fablet on 2021-06-15 (public-webrtc@w3.org from June 2021)

From: Youenn Fablet <youenn@apple.com>
Date: Tue, 15 Jun 2021 08:41:50 +0200
To: Harald Alvestrand <harald@alvestrand.no>
Cc: Jan-Ivar Bruaroey <jib@mozilla.com>, Sergio Garcia Murillo <sergio.garcia.murillo@gmail.com>, Elad Alon <eladalon@google.com>, WebRTC WG <public-webrtc@w3.org>
Message-id: <541C232D-0A8D-46B9-908F-0CC6974DD56B@apple.com>

> On 15 Jun 2021, at 08:07, Harald Alvestrand <harald@alvestrand.no> wrote:
> 
> Embedding the specific UA controls in the browser (proposal A and B) is a layering violation.
> 
I am not sure I understand what you mean by layering violation; can you clarify?
The Media Session spec already exposes togglecamera/togglemicrophone/hangup as a way to implement video-conferencing-specific UA controls in the browser: https://github.com/w3c/mediasession/issues/264
> The WebRTC API and the platform do not offer a presentation feature; they offer a platform on which applications can be built, some of which embed the concept of "forward slide" and "back slide".
> 
The MediaSession spec defines previoustrack/nexttrack, which do not have a default handler; it is up to the web application to implement them.
The web platform does not implement these natively but applications do and the web platform acknowledges that.
This seems similar to “forward slide” and “back slide”.
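
To make the comparison concrete, here is a minimal sketch of how a slide-deck page might map previoustrack/nexttrack to "back slide"/"forward slide". It is only an illustration under stated assumptions: FakeMediaSession, SlideDeck and registerSlideHandlers are hypothetical names, and FakeMediaSession is a stand-in that only mirrors the shape of setActionHandler so the sketch can run outside a browser. In a real page the handlers would presumably be registered on navigator.mediaSession instead.

```typescript
// Hypothetical stand-in for navigator.mediaSession (assumption: not a real UA object).
type SlideAction = "previoustrack" | "nexttrack";
type ActionHandler = () => void;

class FakeMediaSession {
  private handlers = new Map<SlideAction, ActionHandler>();

  // Mirrors the shape of MediaSession.setActionHandler(action, handler).
  setActionHandler(action: SlideAction, handler: ActionHandler | null): void {
    if (handler === null) {
      this.handlers.delete(action);
    } else {
      this.handlers.set(action, handler);
    }
  }

  // Simulates the UA firing an action on behalf of the user.
  // Returns false when the page registered no handler: there is no default behavior.
  trigger(action: SlideAction): boolean {
    const handler = this.handlers.get(action);
    if (handler === undefined) {
      return false;
    }
    handler();
    return true;
  }
}

// Hypothetical slide state owned entirely by the web application.
class SlideDeck {
  current = 0;

  constructor(readonly slideCount: number) {}

  // Move forward, clamped to the last slide.
  next(): void {
    this.current = Math.min(this.current + 1, this.slideCount - 1);
  }

  // Move backward, clamped to the first slide.
  previous(): void {
    this.current = Math.max(this.current - 1, 0);
  }
}

// The application decides what "next track" and "previous track" mean for slides.
function registerSlideHandlers(session: FakeMediaSession, deck: SlideDeck): void {
  session.setActionHandler("nexttrack", () => deck.next());
  session.setActionHandler("previoustrack", () => deck.previous());
}
```

In this sketch the UA side only decides when an action fires; what the action does to the slides stays in the application.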
> If you want to propose creating a complete framework for presentations and presentation management, and propose embedding the whole thing in the browser, you're free to do so, but that's not a WEBRTC matter (it might be a webapps matter), and I do not think it is at all appropriate to pursue such a course.
> 
The MediaSession spec is currently owned by the Media WG; I agree that filing an issue there seems the appropriate way to pursue that specific proposal.
I also think it is good to discuss and compare approaches that may partially overlap, wherever each proposal is made.
> Letting a controller component and a controlled component for an application find each other and decide to communicate seems like an appropriate thing for the platform to offer.
> 
> Trying to dictate what they talk about is not appropriate for WEBRTC.
> 
> On 6/14/21 5:16 PM, Youenn Fablet wrote:
>>> 
>>>> I really like APIs that allow untrusted parties to collaborate securely, giving more freedom to the end user, and I think it is a goal worth pursuing.
>>> 
>>> A communication channel between two untrusted parties is always something to look at closely, from a security and privacy standpoint.
>>> A safe model is to have the UA in the middle: the UA triggers these actions on behalf of the user, not on behalf of the capturer.
>>> For instance, by presenting UA UI in the capturer page to control capturee through actions, similar to picture-in-picture for VC.
>>> For instance, if the capturee page is playing a video, it might be convenient to pause the capturee video from the capturer page using the play/pause actions.
>>> A further step would be to allow capturer page to blend well with this UA UI.
>>> 
>>> So I hear 3 directions for presentation control actions:
>>> A: UA triggers these standard actions on behalf of the capturer.
>>> B: UA triggers these standard actions on behalf of the user, not on behalf of the capturer.
>>> C: Out of band between mutually participating properties only, based on id.
>>> I think both A and B sound promising. Youenn, it seems to me A might have slightly better ergonomics, so I'm curious about what risks B might mitigate.
>> 
>> A and B are not exclusive, B could be implemented and deployed faster than A.
>> I believe B is the current MediaSession approach and has the same security context.
>> 
>> A is an extension to this approach and needs additional scrutiny, since the capturer, which is not as trusted as the UA, is now the one deciding to trigger these actions.
>> For instance, capturer might iterate through all slides while not displaying the captured live stream to the user.
>> Or multiple capturers can interact with the same capturee.
>> This is not to say it is impossible but it probably requires more work.

Received on Tuesday, 15 June 2021 06:42:26 UTC